Sufficiency (statistics)
In statistics, one often considers a family of probability distributions for a random variable X, parameterized by a scalar- or vector-valued parameter, which we shall call θ. Often X is itself a vector whose components are scalar-valued random variables, frequently independent. A quantity T(X) that depends on the (observable) random variable X but not on the (unobservable) parameter θ is called a statistic. Sir Ronald Fisher tried to make precise the intuitive idea that a statistic may capture all of the information in X that is relevant to the estimation of θ. A statistic that does so is called a sufficient statistic.

The precise definition is this: a statistic T(X) is sufficient for θ precisely if the conditional probability distribution of the data X given the statistic T(X) does not depend on θ.

For example:
- If X1, ..., Xn are independent Bernoulli-distributed random variables with expected value p, then the sum X1 + ... + Xn is a sufficient statistic for p.
- If X1, ..., Xn are independent and uniformly distributed on the interval [0, θ], then max(X1, ..., Xn) is sufficient for θ.
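
The Bernoulli example can be checked numerically against the definition. The following is a minimal sketch, not a general-purpose routine: the function name conditional_prob, the sample outcome x, and the chosen values of p are illustrative assumptions. It computes the conditional probability of one particular outcome given its sum, by brute-force enumeration, for several values of p.

```python
from itertools import product
from math import comb, isclose


def conditional_prob(x, p):
    """P(X = x | T(X) = sum(x)) for independent Bernoulli(p) components.

    conditional_prob is a hypothetical helper written for this illustration.
    """
    n, t = len(x), sum(x)

    def joint(y):
        # Joint probability of the 0/1 sequence y under parameter p.
        s = sum(y)
        return p ** s * (1 - p) ** (n - s)

    # P(T = t): add up the joint probability of every sequence with total t.
    p_t = sum(joint(y) for y in product((0, 1), repeat=n) if sum(y) == t)
    return joint(x) / p_t


# Illustrative outcome with n = 4 and T(x) = 2 (an assumption for the example).
x = (1, 0, 1, 0)
probs = [conditional_prob(x, p) for p in (0.2, 0.5, 0.9)]

# The expected common value: one over the number of sequences with this total.
expected = 1 / comb(len(x), sum(x))
print(probs, expected)
```

In this sketch every entry of probs agrees with 1 / comb(n, t) up to floating-point rounding, whatever the value of p. That is what the definition requires: once the sum is known, the distribution of the individual outcomes carries no further dependence on p.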






