Sometimes we know very little about our data: we might know only its mean ($\mu$) or its variance ($\sigma^2$). In these "uncertain" cases, we use inequalities to set a guaranteed "worst-case" bound on a probability, one that holds whatever the underlying distribution looks like.
Markov's inequality: for any non-negative random variable $X$: $P(X \ge a) \le \dfrac{E[X]}{a}$ for any $a > 0$.
Intuition: If the average salary in a company is $50,000, no more than 1/4 of the employees can make $200,000 or more (50,000 / 200,000 = 1/4). If more than a quarter did, their salaries alone would push the average above $50k, since no salary is negative!
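The salary example can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper names `markov_bound` and `empirical_tail` are hypothetical, and the salaries are simulated from an exponential distribution with mean 50,000, which is purely an illustrative assumption rather than real data.

```python
import random

def markov_bound(mean, a):
    """Markov upper bound on P(X >= a) for a non-negative X with the given mean."""
    if a <= 0:
        raise ValueError("a must be positive")
    return min(1.0, mean / a)

def empirical_tail(samples, a):
    """Fraction of samples that are at least a."""
    return sum(1 for x in samples if x >= a) / len(samples)

rng = random.Random(0)  # fixed seed so the run is reproducible

# Assumption for illustration only: salaries are exponential with mean 50,000.
salaries = [rng.expovariate(1 / 50_000) for _ in range(100_000)]

bound = markov_bound(50_000, 200_000)
tail = empirical_tail(salaries, 200_000)

print(f"Markov bound on P(salary >= 200k): {bound:.3f}")
print(f"Simulated fraction at or above 200k: {tail:.4f}")
```

The simulated fraction should come out well below the bound. That is expected: Markov's inequality uses only the mean, so it is a worst-case guarantee and is often loose for any particular distribution.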
Chebyshev's Inequality: The Power of Variance
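Chebyshev's inequality often gives a tighter bound than Markov's because it uses both the mean ($\mu$) and the standard deviation ($\sigma$). For any random variable $X$ with finite variance and any $k > 0$:
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$$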
Intuition: Chebyshev's inequality tells us that for any distribution with finite variance, at least 75% of the data MUST fall within 2 standard deviations ($k = 2$) of the mean, and at least 89% must fall within 3 standard deviations ($k = 3$). Unlike the Normal distribution's 95% and 99.7% rules, this works even for weird, non-Normal data.
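To see the 75% and 89% guarantees hold on decidedly non-Normal data, here is a small sketch. The function names `chebyshev_min_within` and `fraction_within` are hypothetical, and the lognormal sample is just one assumed example of a heavily skewed distribution.

```python
import random
import statistics

def chebyshev_min_within(k):
    """Chebyshev lower bound on P(|X - mu| < k * sigma)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1 - 1 / k**2)

def fraction_within(samples, k):
    """Fraction of samples strictly within k standard deviations of their mean."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # population standard deviation of the sample
    return sum(1 for x in samples if abs(x - mu) < k * sigma) / len(samples)

rng = random.Random(0)  # fixed seed so the run is reproducible

# Assumption for illustration: a skewed, heavy-tailed lognormal sample.
data = [rng.lognormvariate(0, 1) for _ in range(20_000)]

for k in (2, 3):
    print(f"k={k}: guaranteed at least {chebyshev_min_within(k):.3f}, "
          f"observed {fraction_within(data, k):.3f}")
```

The observed fractions should sit above the guaranteed minimums. The bound is a floor that every finite-variance distribution respects, so most real datasets clear it with room to spare.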
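The Law of Large Numbers (LLN) is the "mathematical engine" of the world. It states that as the sample size $n$ goes to infinity, the Sample Mean ($\bar{X}_n$) converges to the True Expected Value ($\mu$). In practice, for large $n$ the sample mean is very likely to be very close to $\mu$, though it need not equal it exactly for any finite $n$.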
If you play one hand of blackjack, anything could happen: you could win or lose. But the house plays millions of hands. Because of the LLN, the house's average profit per hand across those millions of hands is almost certain to land very close to its mathematical expectation (usually a 1-2% edge). This is why a casino doesn't need "luck": it just needs the Law of Large Numbers and a lot of players.
Convergence of Sample Mean
Watch how the running average of rolling a fair 6-sided die settles toward the theoretical mean of 3.5. Early randomness is "washed out" by the sheer volume of trials.
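The same experiment can be reproduced in a few lines. This is a minimal sketch assuming a fair die simulated with Python's `random` module; `running_means` is a hypothetical helper name, and the checkpoints printed are arbitrary.

```python
import random

def running_means(rolls):
    """Return the running average after each roll."""
    total = 0
    means = []
    for n, roll in enumerate(rolls, start=1):
        total += roll
        means.append(total / n)
    return means

rng = random.Random(42)  # fixed seed so the run is reproducible
rolls = [rng.randint(1, 6) for _ in range(100_000)]  # fair 6-sided die
means = running_means(rolls)

# Print the running average at a few checkpoints to watch it settle.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7} rolls: running mean = {means[n - 1]:.4f}")
```

The early checkpoints can wander noticeably, while the later ones should hug 3.5 more and more closely.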
The Central Limit Theorem (CLT) is perhaps one of the most profound theorems in probability. It states that if you take the sum (or average) of $n$ independent and identically distributed (i.i.d.) random variables with finite variance, the distribution of the result, once standardized, gets closer and closer to a Normal Distribution as $n$ grows, regardless of the original distribution's shape!
It doesn't matter what distribution you start with (dice rolls, coin flips, or even weird, skewed distributions), as long as its variance is finite. When you add many of them together, the randomness "evens out" into the familiar Bell Curve. This is why the Normal distribution so often looks like the "default" shape of nature: many measured quantities are the sum of many small, independent effects.
The Mathematical Statement
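If $X_1, X_2, \dots, X_n$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$, and $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is their sample mean, then:
$$\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1) \quad \text{as } n \to \infty$$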
CLT Simulation Pipeline
How any 'raw' distribution eventually becomes Gaussian through the process of averaging many samples.
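The averaging pipeline can be sketched in code. This is one possible illustration under stated assumptions: the raw distribution is taken to be exponential with rate 1 (so its mean and standard deviation are both 1), `standardized_means` is a hypothetical helper, and the sample size and trial count are arbitrary choices.

```python
import math
import random
import statistics

def standardized_means(draw, mu, sigma, n, trials, rng):
    """Return `trials` values of (sample mean - mu) / (sigma / sqrt(n))."""
    scale = sigma / math.sqrt(n)
    results = []
    for _ in range(trials):
        xbar = sum(draw(rng) for _ in range(n)) / n
        results.append((xbar - mu) / scale)
    return results

rng = random.Random(1)  # fixed seed so the run is reproducible

# Assumed raw distribution: exponential with rate 1, so mu = 1 and sigma = 1.
z = standardized_means(lambda r: r.expovariate(1.0),
                       mu=1.0, sigma=1.0, n=200, trials=5_000, rng=rng)

# For a standard Normal, about 95% of values fall within +/- 1.96.
within = sum(1 for v in z if abs(v) <= 1.96) / len(z)

print(f"mean of standardized averages: {statistics.fmean(z):.3f}")
print(f"std of standardized averages:  {statistics.pstdev(z):.3f}")
print(f"fraction within +/- 1.96:      {within:.3f}")
```

If the approximation is working, the mean should be near 0, the standard deviation near 1, and the fraction within 1.96 near 0.95, even though the raw exponential draws are strongly skewed.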
Have you ever seen a board where balls drop through a grid of pins and collect in bins at the bottom (often called a Galton board)? Each ball makes a "random" choice (left or right) at each pin. The final bin it lands in is determined by the sum of all those random choices. This is why the balls form an approximate Bell Curve at the bottom, and the more balls and rows of pins, the closer the fit: it's a physical demonstration of the CLT!
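The board is easy to imitate in code. This is a rough sketch, assuming each pin sends the ball left or right with equal probability and independently of every other pin; `galton_board` is a hypothetical function name, and the 12 rows and 20,000 balls are arbitrary choices.

```python
import random
from collections import Counter

def galton_board(rows, balls, rng):
    """Drop `balls` balls through `rows` rows of pins and return bin counts."""
    counts = Counter()
    for _ in range(balls):
        # At each pin the ball goes right (1) or left (0) with equal probability.
        # The bin index is the number of rightward bounces.
        bin_index = sum(rng.randint(0, 1) for _ in range(rows))
        counts[bin_index] += 1
    return counts

rng = random.Random(7)  # fixed seed so the run is reproducible
rows, balls = 12, 20_000
counts = galton_board(rows, balls, rng)

# Crude text histogram: one '#' per 0.5% of the balls.
for b in range(rows + 1):
    bar = "#" * (counts[b] * 200 // balls)
    print(f"bin {b:2d} {counts[b]:6d} {bar}")
```

The histogram should peak around the middle bins and thin out toward the edges. Strictly speaking the bin counts follow a binomial distribution, which the Normal curve approximates increasingly well as the number of rows grows.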