Introduction to the Binomial Distribution

The Binomial Distribution models the number of successes in a fixed number of independent Bernoulli trials. If you flip a coin $n$ times, and each flip has a probability $p$ of coming up heads, the total number of heads follows a Binomial distribution.

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
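
As a minimal sketch, the formula above can be evaluated directly with Python's standard library. The function name `binomial_pmf` and the values n = 20, p = 0.3 are illustrative choices, not something defined in this article.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        return 0.0  # k lies outside the support {0, ..., n}
    # Number of orderings times the probability of one ordering.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The probabilities over the whole support should sum to 1 (up to rounding).
total = sum(binomial_pmf(k, 20, 0.3) for k in range(21))
print(f"P(X = 6) = {binomial_pmf(6, 20, 0.3):.4f}, total = {total:.6f}")
```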

Parameters

  • $n$: The fixed number of independent trials.
  • $p$: The probability of success on any given single trial.
  • $X$: The random variable representing the total number of successes across all $n$ trials.
  • $k$: The specific number of successes we want to find the probability for (where $0 \le k \le n$).

Core Properties

  • Mean ($E[X]$): $np$
  • Variance ($\mathrm{Var}(X)$): $np(1-p)$
  • Support: $k \in \{0, 1, 2, \dots, n\}$
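
One way to check the mean and variance is to compute them numerically from the probability mass function and compare them with $np$ and $np(1-p)$. This is a rough sketch; the values n = 20 and p = 0.3 are assumed purely for illustration.

```python
from math import comb

n, p = 20, 0.3  # illustrative values, not prescribed by the text

# Probability of each value in the support {0, ..., n}.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance computed from their definitions.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)                # the two values agree up to rounding
print(variance, n * p * (1 - p))  # the two values agree up to rounding
```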
Example
Real-World Examples
  • Quality Control: Counting the number of defective widgets in a batch of 100 randomly sampled widgets (assuming independence).
  • A/B Testing: Counting the number of users who click a button out of 1000 visitors.
  • Medicine: Number of patients who recover from a disease out of $n$ treated, given a recovery probability $p$.

Binomial Distribution (n=20, p=0.3)

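[Figure: bar chart of the probability mass function. Labeled values: P(X=0) ≈ 0.0008, P(X=2) ≈ 0.0278, P(X=4) ≈ 0.1304, P(X=6) ≈ 0.1916, P(X=8) ≈ 0.1144, P(X=10) ≈ 0.0308, P(X=12) ≈ 0.0039]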
Intuition
Why the Combination Term?

The formula for the Binomial distribution contains the binomial coefficient $\binom{n}{k}$. Why? Because if we want $k$ successes in $n$ trials, those successes could happen in many different orders. The probability of any specific sequence of $k$ successes and $n-k$ failures is $p^k(1-p)^{n-k}$. Since there are $\binom{n}{k}$ such distinct sequences, and they are mutually exclusive, we add their probabilities to get $\binom{n}{k} p^k (1-p)^{n-k}$.
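
The counting argument can be checked by brute force for small cases. The sketch below lists every success/failure sequence with exactly k successes, adds up their individual probabilities, and compares the result with the formula. The values n = 5, k = 2, p = 0.3 and the helper `sequence_probability` are assumptions made for illustration.

```python
from itertools import product
from math import comb, isclose

n, k, p = 5, 2, 0.3  # small illustrative values

# Every length-n sequence of outcomes (1 = success, 0 = failure)
# that contains exactly k successes.
sequences = [s for s in product((0, 1), repeat=n) if sum(s) == k]


def sequence_probability(seq):
    """Probability of one specific ordered sequence of independent trials."""
    prob = 1.0
    for outcome in seq:
        prob *= p if outcome == 1 else 1 - p
    return prob


# The sequences are mutually exclusive, so their probabilities add.
total = sum(sequence_probability(s) for s in sequences)
formula = comb(n, k) * p**k * (1 - p) ** (n - k)

print(len(sequences), comb(n, k))  # both counts are 10
print(isclose(total, formula))
```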

Advanced Practice

Example 1: The Biased Coin

medium

Suppose you have a biased coin that lands on heads 70% of the time. If you flip this coin 8 times, what is the probability that you get exactly 6 heads?
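
One possible way to work this out is to plug n = 8, k = 6, p = 0.7 into the Binomial formula. The short sketch below does the arithmetic; the variable names are illustrative.

```python
from math import comb

n, k, p = 8, 6, 0.7  # 8 flips, exactly 6 heads, P(heads) = 0.7

# C(8, 6) = 28 orderings, each with probability 0.7**6 * 0.3**2.
probability = comb(n, k) * p**k * (1 - p) ** (n - k)
print(f"P(X = 6) = {probability:.4f}")  # about 0.2965
```

So under these assumptions, exactly 6 heads occurs a little under 30% of the time.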

Example 2: Quality Assurance Limits

hard

A factory produces lightbulbs with a 5% defect rate. In a random sample of 10 bulbs, what is the probability that at most 1 bulb is defective?
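
"At most 1" means 0 or 1 defective bulbs, so one way to approach this is to add P(X = 0) and P(X = 1) with n = 10 and p = 0.05. The sketch below assumes the bulbs are independent; the helper name `pmf` is illustrative.

```python
from math import comb

n, p = 10, 0.05  # sample of 10 bulbs, 5% defect rate


def pmf(k: int) -> float:
    """P(exactly k defective bulbs in the sample)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The events "0 defective" and "1 defective" are mutually exclusive.
p_at_most_one = pmf(0) + pmf(1)
print(f"P(X <= 1) = {p_at_most_one:.4f}")  # about 0.9139
```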

The Range Rule of Thumb

In many distributions, including the Binomial distribution (especially when $np \ge 5$ and $n(1-p) \ge 5$, so that its shape is approximately normal), the vast majority of outcomes (roughly 95%) fall within 2 standard deviations of the mean. We use the Range Rule of Thumb to identify "unusual" values. For a Binomial distribution, $\mu = np$ and $\sigma = \sqrt{np(1-p)}$.

  • Maximum usual value: $\mu + 2\sigma$
  • Minimum usual value: $\mu - 2\sigma$

If a value falls outside this range, it is considered statistically unusual.

Example 3: Spotting the Unusual

medium

A standard fair coin is flipped 100 times, and it lands on heads 65 times. Is getting 65 heads considered an unusual result for a fair coin?
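
A possible solution applies the Range Rule of Thumb with n = 100 and p = 0.5, using the Binomial mean $np$ and standard deviation $\sqrt{np(1-p)}$. The sketch below is one way to carry out the check; the variable names are illustrative.

```python
from math import sqrt

n, p, observed = 100, 0.5, 65  # fair coin, 100 flips, 65 heads observed

mu = n * p                     # mean: 50.0
sigma = sqrt(n * p * (1 - p))  # standard deviation: 5.0

# Usual range according to the Range Rule of Thumb.
low, high = mu - 2 * sigma, mu + 2 * sigma  # 40.0 to 60.0

is_unusual = observed < low or observed > high
print(f"usual range: [{low}, {high}], unusual: {is_unusual}")
```

Since 65 lies above the maximum usual value of 60 (it is 3 standard deviations above the mean), the rule classifies it as unusual for a fair coin.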

Relationships to Other Distributions

The Binomial distribution sits at the center of a family of discrete distributions. Changing its core assumptions (independence, two outcomes, fixed $n$) leads to other well-known distributions.

Discrete Distribution Relationships

How the Binomial distribution relates to other key models.

[Diagram]
  • Bernoulli (n = 1 trial) → Binomial (n trials, 2 outcomes): Sum n trials
  • Binomial → Multinomial (>2 outcomes): Expand to k categories
  • Binomial → Hypergeometric (Without replacement): Dependent trials
  • Binomial → Geometric (Wait for 1st success): Randomize n
  • Binomial → Poisson (n → ∞, p → 0): Limit as n → ∞

1. From Bernoulli to Binomial

A Binomial random variable is simply the sum of $n$ independent and identically distributed (i.i.d.) Bernoulli random variables. If $n=1$, the Binomial distribution reduces exactly to the Bernoulli distribution.
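
This relationship can be illustrated without simulation by building the distribution of the sum one Bernoulli trial at a time and comparing it with the Binomial formula. The helper `add_bernoulli` and the values n = 6, p = 0.3 are assumptions made for this sketch.

```python
from math import comb, isclose


def add_bernoulli(pmf, p):
    """PMF of S + B, where S has the given PMF and B ~ Bernoulli(p) is independent of S."""
    out = [0.0] * (len(pmf) + 1)
    for s, prob in enumerate(pmf):
        out[s] += prob * (1 - p)  # the new trial fails, the sum stays at s
        out[s + 1] += prob * p    # the new trial succeeds, the sum moves to s + 1
    return out


n, p = 6, 0.3  # illustrative values

pmf = [1.0]  # the sum of zero trials is 0 with certainty
for _ in range(n):
    pmf = add_bernoulli(pmf, p)

formula = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
print(all(isclose(a, b) for a, b in zip(pmf, formula)))
```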

2. Binomial vs. Hypergeometric

The Binomial distribution assumes trials are independent (sampling with replacement). If you sample without replacement from a finite population, the probability of success changes with each draw. This scenario is modeled by the Hypergeometric Distribution.
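
To see how much the replacement assumption matters, the sketch below compares the two models for a sample of 10 items from populations of different sizes, each with 20% successes. The function names and all numeric values are illustrative assumptions. The Hypergeometric formula used is the standard one: choose the successes, choose the failures, and divide by the number of possible samples.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(k successes) when drawing n items without replacement
    from a population of N items that contains K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binomial_pmf(k, n, p):
    """P(k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, k = 10, 3
gaps = []
for N in (50, 500, 50_000):
    K = N // 5  # 20% of the population are successes
    gap = abs(hypergeom_pmf(k, N, K, n) - binomial_pmf(k, n, 0.2))
    gaps.append(gap)
    print(f"N = {N}: |hypergeometric - binomial| = {gap:.6f}")
```

The gap shrinks as the population grows, because removing a few items from a very large population barely changes the success probability.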

3. Binomial vs. Multinomial

The Binomial distribution deals with exactly two outcomes (Success or Failure). If each trial can result in one of $k$ distinct categories (like rolling a 6-sided die), the distribution of the counts of each category follows a Multinomial Distribution. Note that here $k$ denotes the number of categories, not a number of successes.
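
As a rough illustration, the sketch below evaluates the standard Multinomial probability (a multinomial coefficient times the product of category probabilities) and checks that with only two categories it matches the Binomial formula. The function name `multinomial_pmf` and the example values are assumptions.

```python
from math import comb, factorial, isclose, prod


def multinomial_pmf(counts, probs):
    """P(observing the given count in each category) over sum(counts) trials."""
    n = sum(counts)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)  # n! / (c1! * c2! * ...)
    return coefficient * prod(q**c for q, c in zip(probs, counts))


# Fair 6-sided die rolled 6 times: probability that every face appears once.
each_face_once = multinomial_pmf([1] * 6, [1 / 6] * 6)

# With two categories, the Multinomial reduces to the Binomial.
two_category = multinomial_pmf([3, 7], [0.2, 0.8])
binomial = comb(10, 3) * 0.2**3 * 0.8**7

print(each_face_once, isclose(two_category, binomial))
```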

4. Binomial vs. Geometric

While the Binomial distribution counts the number of successes in a fixed number of trials, the Geometric Distribution counts the number of trials needed to get the first success. Both rely on independent Bernoulli trials, but they flip what is fixed and what is random.
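
The two distributions can be linked concretely: the first success takes more than t trials exactly when the first t trials contain zero successes. The sketch below checks this, assuming the convention that the Geometric variable counts trials up to and including the first success. The function names and the values p = 0.3, t = 4 are illustrative.

```python
from math import comb, isclose


def geometric_pmf(t, p):
    """P(the first success occurs on trial t), for t = 1, 2, ..."""
    return (1 - p) ** (t - 1) * p  # t - 1 failures, then one success


def binomial_pmf(k, n, p):
    """P(k successes) in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p, t = 0.3, 4  # illustrative values

# P(first success comes after trial t), from the Geometric side.
wait_longer_than_t = 1 - sum(geometric_pmf(i, p) for i in range(1, t + 1))

# P(zero successes in the first t trials), from the Binomial side.
no_success_in_t = binomial_pmf(0, t, p)

print(isclose(wait_longer_than_t, no_success_in_t))
```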

5. Binomial vs. Poisson

If $n$ gets extremely large and $p$ gets extremely small, while the mean $np = \lambda$ remains constant, the Binomial distribution converges to the Poisson Distribution. This is useful for modeling rare events.
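
The convergence can be observed numerically by holding $\lambda = np$ fixed and increasing $n$. The sketch below uses the standard Poisson probability $e^{-\lambda}\lambda^k / k!$; the values $\lambda = 3$ and $k = 2$ are assumed for illustration.

```python
from math import comb, exp, factorial

lam, k = 3.0, 2  # illustrative mean and target count

poisson = exp(-lam) * lam**k / factorial(k)

gaps = []
for n in (10, 100, 10_000):
    p = lam / n  # shrink p so that the mean n * p stays equal to lam
    binomial = comb(n, k) * p**k * (1 - p) ** (n - k)
    gaps.append(abs(binomial - poisson))
    print(f"n = {n}: binomial = {binomial:.6f}, poisson = {poisson:.6f}")
```

The difference between the two probabilities shrinks as $n$ grows, which is why the Poisson distribution is a convenient stand-in when $n$ is large and $p$ is small.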