The Binomial distribution assumes exactly two possible outcomes: Success or Failure. But what if there are $k$ distinct, mutually exclusive outcomes? For example, rolling a 6-sided die, or categorizing voters into multiple political parties.
This is where the Multinomial Distribution comes in. It is a multivariate generalization of the Binomial distribution.
Parameters
- $n$: Total number of independent trials.
- $k$: Number of mutually exclusive categories.
- $p_i$: Probability of outcome $i$ occurring in a single trial, where $\sum_{i=1}^{k} p_i = 1$.
- $X_i$: Random variable representing the number of times outcome $i$ occurs.
- $x_i$: Observed count for category $i$, where $\sum_{i=1}^{k} x_i = n$.
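These parameters combine into the standard multinomial probability mass function. As a reference sketch, writing $n$ for the number of trials, $k$ for the number of categories, $p_i$ for the per-trial probability of category $i$, and $x_i$ for its observed count (with the counts summing to $n$):

$$
P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \; p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}
$$

The fraction counts the distinct orderings of the $n$ trials that produce the same counts, and the product of powers is the probability of any one such ordering. With $k = 2$ this reduces to the Binomial formula.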
Typical Applications
- Genetics: Counting the number of offspring with different blood types (A, B, AB, O) given parental genetics.
- Text Analysis: Modeling the frequency of words in a document (the "Bag of Words" model in NLP often assumes a multinomial distribution over the vocabulary).
- Surveying: Asking 100 people their favorite season (Spring, Summer, Fall, Winter).
Core Properties
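Unlike previous distributions, the Multinomial is a joint distribution of multiple random variables $X_1, X_2, \dots, X_k$.
- Marginal Mean ($E[X_i]$): $n p_i$
- Marginal Variance ($\mathrm{Var}(X_i)$): $n p_i (1 - p_i)$
- Covariance ($\mathrm{Cov}(X_i, X_j)$): $-n p_i p_j$ (for $i \neq j$)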
Why is the covariance negative? Because the total number of trials is fixed. If category A happens more often, it must come at the expense of category B happening less often. They "compete" for the fixed spots.
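One way to check these moments is to enumerate every possible count vector for a small case and compute the expectations directly from the joint probabilities. The sketch below assumes illustrative values ($n = 4$ trials and probabilities 0.5, 0.3, 0.2, chosen only for this example) and uses a hypothetical helper `multinomial_pmf` built on the standard library.

```python
from itertools import product
from math import factorial, isclose, prod


def multinomial_pmf(counts, probs):
    """Joint probability of the given counts over sum(counts) trials."""
    coeff = factorial(sum(counts))
    for x in counts:
        coeff //= factorial(x)  # stays an integer at every step
    return coeff * prod(p ** x for p, x in zip(probs, counts))


n = 4                     # illustrative number of trials (assumption)
probs = (0.5, 0.3, 0.2)   # illustrative category probabilities (assumption)

# Every count vector (x_1, x_2, x_3) whose entries sum to n.
outcomes = [c for c in product(range(n + 1), repeat=len(probs)) if sum(c) == n]
weights = [multinomial_pmf(c, probs) for c in outcomes]


def expect(f):
    """Expectation of f(counts) under the joint distribution."""
    return sum(w * f(c) for c, w in zip(outcomes, weights))


mean_1 = expect(lambda c: c[0])
mean_2 = expect(lambda c: c[1])
var_1 = expect(lambda c: c[0] ** 2) - mean_1 ** 2
cov_12 = expect(lambda c: c[0] * c[1]) - mean_1 * mean_2

# Compare each enumerated moment with its closed-form counterpart.
print("mean:", mean_1, "vs", n * probs[0])
print("var: ", var_1, "vs", n * probs[0] * (1 - probs[0]))
print("cov: ", cov_12, "vs", -n * probs[0] * probs[1])
```

Up to floating-point rounding, the enumerated values should match the formulas, and the covariance comes out negative as the competition argument predicts.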
Marginal Distributions
If you only care about one specific category $i$ and lump all the others into "everything else," the marginal distribution of $X_i$ is simply a Binomial Distribution: $X_i \sim \text{Binomial}(n, p_i)$.
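The lumping argument can be checked numerically: summing the joint probabilities over every outcome where one category has a given count should reproduce the Binomial probability for that count. This is a minimal sketch with assumed illustrative values ($n = 5$, probabilities 0.5, 0.3, 0.2) and a hypothetical helper `multinomial_pmf`.

```python
from itertools import product
from math import comb, factorial, isclose, prod


def multinomial_pmf(counts, probs):
    """Joint probability of the given counts over sum(counts) trials."""
    coeff = factorial(sum(counts))
    for x in counts:
        coeff //= factorial(x)
    return coeff * prod(p ** x for p, x in zip(probs, counts))


n = 5                     # illustrative number of trials (assumption)
probs = (0.5, 0.3, 0.2)   # illustrative category probabilities (assumption)
i = 0                     # index of the category we keep; the rest are lumped

rows = []
for x in range(n + 1):
    # Marginal P(X_i = x): sum the joint pmf over all other categories' counts.
    marginal = sum(
        multinomial_pmf(c, probs)
        for c in product(range(n + 1), repeat=len(probs))
        if sum(c) == n and c[i] == x
    )
    # Binomial(n, p_i) probability of exactly x successes.
    binomial = comb(n, x) * probs[i] ** x * (1 - probs[i]) ** (n - x)
    rows.append((x, marginal, binomial))
    print(f"x={x}  marginal={marginal:.6f}  binomial={binomial:.6f}")
```

The two columns should agree for every count, which is the marginal Binomial property in action.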
It's crucial to remember that the Multinomial distribution doesn't output a single number, but a vector of counts: $(x_1, x_2, \dots, x_k)$. The counts in any valid outcome must sum to the total number of trials $n$. This strict constraint is what creates the negative covariance between different outcomes: on any given trial, an outcome in category A rules out every other category.
Advanced Practice
Example 1: Loaded Die
You roll a loaded 4-sided die 10 times. The probabilities of rolling 1, 2, 3, and 4 are $p_1$, $p_2$, $p_3$, and $p_4$ respectively. What is the probability of rolling exactly two 1s, three 2s, one 3, and four 4s?
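The numeric face probabilities are not stated here, so the setup can only be sketched symbolically. Writing $p_1, p_2, p_3, p_4$ for the probabilities of faces 1 through 4 (labels assumed for illustration) and applying the standard multinomial formula with $n = 10$ and counts $(2, 3, 1, 4)$:

$$
P(X_1 = 2, X_2 = 3, X_3 = 1, X_4 = 4) = \frac{10!}{2! \, 3! \, 1! \, 4!} \; p_1^{2} \, p_2^{3} \, p_3^{1} \, p_4^{4} = 12600 \; p_1^{2} \, p_2^{3} \, p_3 \, p_4^{4}
$$

The coefficient follows from $10! = 3628800$ and $2! \cdot 3! \cdot 1! \cdot 4! = 288$. Substituting the die's actual probabilities gives the final number.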
Example 2: Election Polling
In a local election, 50% of voters favor Candidate A, 30% favor Candidate B, and 20% are undecided. If you survey 5 random voters, what is the probability that Candidate A and Candidate B tie in your sample, with exactly 2 votes each, and 1 voter being undecided?
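As a worked sketch of this example, the three groups can be treated as categories with probabilities 0.5, 0.3, and 0.2, and the tie as the count vector (2, 2, 1) over $n = 5$ trials. The helper `multinomial_pmf` below is a hypothetical name for a direct implementation of the standard multinomial formula.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Joint probability of the given counts over sum(counts) trials."""
    coeff = factorial(sum(counts))
    for x in counts:
        coeff //= factorial(x)
    return coeff * prod(p ** x for p, x in zip(probs, counts))


probs = (0.5, 0.3, 0.2)   # Candidate A, Candidate B, undecided
counts = (2, 2, 1)        # 2 for A, 2 for B, 1 undecided

p_tie = multinomial_pmf(counts, probs)
print(f"P(2 A, 2 B, 1 undecided) = {p_tie:.4f}")  # prints 0.1350
```

By hand, the coefficient is $5! / (2! \, 2! \, 1!) = 30$ and the probability term is $0.5^2 \cdot 0.3^2 \cdot 0.2 = 0.0045$, so the answer is $30 \times 0.0045 = 0.135$, or 13.5%.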