Probability: The Language of Uncertainty

Probability is the mathematical framework for quantifying uncertainty. Whether you're a data scientist predicting user churn, a gambler calculating pot odds, or a physicist studying quantum states, you're using the same fundamental rules established centuries ago.

At its heart, probability is about counting: out of all the ways an experiment could turn out, in how many does the event we care about occur?

Set Theory & Sample Spaces

Before we can calculate the odds of anything, we need to rigorously define the "universe" of all possible outcomes. This universe is called the Sample Space.

Definition 1.1
Sample Space (Ω)

The Sample Space, denoted by $\Omega$ (the Greek letter Omega), is the set of all possible outcomes of an experiment. An Event $E$ is any subset of the sample space ($E \subseteq \Omega$).

Understanding $\Omega$ is critical. If you define your sample space incorrectly, every calculation that follows will be wrong.

Example
Real-World Sample Spaces
  • Tossing a Coin: $\Omega = \{H, T\}$ (where $H$ = Heads, $T$ = Tails).
  • Rolling a Six-Sided Die: $\Omega = \{1, 2, 3, 4, 5, 6\}$.
  • Weather Tomorrow: $\Omega = \{\text{Sunny}, \text{Rainy}, \text{Cloudy}, \text{Snowy}\}$.
  • Stock Price: $\Omega = [0, \infty)$. Because a stock price can be any decimal value (like \$150.23), we use an interval rather than a list.

Set Operations: The Building Blocks

Since events are just sets of outcomes, we use set theory to combine them:

  • Union ($A \cup B$): Either $A$ happens, $B$ happens, or both happen. Think of it as the logical "OR".
  • Intersection ($A \cap B$): Both $A$ and $B$ must happen at the same time. Think of it as the logical "AND".
  • Complement ($A^c$ or $A'$): The event that $A$ does not happen.
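
[Figure: Venn diagram of the sample space $\Omega$ with two overlapping events, "Has Netflix" ($A$) and "Has Disney+" ($B$). Regions: Only Netflix, Both, Only Disney+, and Neither.]
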
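Because events are sets, the three operations map directly onto Python's built-in set type. The sketch below uses a single die roll as the sample space; the two events `even` and `at_least_four` are illustrative choices, not taken from the text above.

```python
# Sample space for one roll of a six-sided die.
omega = frozenset(range(1, 7))

# Two illustrative events (hypothetical choices for this sketch).
even = frozenset({2, 4, 6})           # A: the roll is even
at_least_four = frozenset({4, 5, 6})  # B: the roll is 4 or more

union = even | at_least_four          # A or B (or both)
intersection = even & at_least_four   # A and B together
complement = omega - even             # A does not happen

print(sorted(union))         # [2, 4, 5, 6]
print(sorted(intersection))  # [4, 6]
print(sorted(complement))    # [1, 3, 5]
```

Note that the complement is always taken relative to the sample space, which is why `omega` has to be defined first.
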
The Three Axioms of Kolmogorov

In 1933, Andrey Kolmogorov established three rules that all probability measures must follow. If these aren't met, the system isn't "probability."

  1. Non-negativity: $P(E) \ge 0$ for any event $E$. You can't have a -10% chance of rain.
  2. Normalization: $P(\Omega) = 1$. The probability that something in the universe happens is 100%.
  3. Countable Additivity: If events $A_1, A_2, \ldots$ are mutually exclusive (meaning no two of them can happen at the same time, so $A_i \cap A_j = \emptyset$ for $i \neq j$), then the probability of any of them happening is the sum of their individual probabilities: $P\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty P(A_i)$
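
As a sanity check, the axioms can be tested by brute force on a small example. This is a minimal sketch, assuming a fair six-sided die where every outcome is equally likely; the helper names `prob` and `all_events` are made up for illustration. On a finite sample space, countable additivity reduces to additivity over pairs of disjoint events, so that is what the last check covers.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset(range(1, 7))  # fair six-sided die (assumed model)

def prob(event):
    """Probability of an event when all outcomes in omega are equally likely."""
    return Fraction(len(event & omega), len(omega))

def all_events(space):
    """Every subset of a finite sample space, i.e. every possible event."""
    items = sorted(space)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

events = all_events(omega)  # 2**6 = 64 events

# Axiom 1: no event has negative probability.
nonnegative = all(prob(e) >= 0 for e in events)

# Axiom 2: the whole sample space has probability 1.
normalized = prob(omega) == 1

# Axiom 3 (finite form): probabilities add for every pair of disjoint events.
additive = all(prob(a | b) == prob(a) + prob(b)
               for a in events for b in events if not (a & b))

print(nonnegative, normalized, additive)  # True True True
```
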

A first consequence of the axioms is the Complement Rule, $P(E) = 1 - P(E^c)$. The derivation:

  1. By definition, an event $E$ and its complement $E^c$ are mutually exclusive ($E \cap E^c = \emptyset$).
  2. Together, they cover the entire sample space: $E \cup E^c = \Omega$.
  3. Using Axiom 3 (Additivity): $P(E \cup E^c) = P(E) + P(E^c)$.
  4. Using Axiom 2 (Normalization): $P(E \cup E^c) = P(\Omega) = 1$.
  5. Therefore, $P(E) + P(E^c) = 1$, which gives us the shortcut: $P(E) = 1 - P(E^c)$. Computing $P(E^c)$ and subtracting is often much easier than calculating $P(E)$ directly.

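
To see why the complement shortcut is convenient, consider the chance of rolling at least one six in four rolls of a fair die (an illustrative problem, not from the text above). The sketch computes it two ways: by enumerating every outcome, and by counting the complement "no six at all". Both function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def at_least_one_six_direct(rolls):
    """Enumerate all 6**rolls outcomes and count those containing a six."""
    outcomes = list(product(range(1, 7), repeat=rolls))
    hits = sum(1 for outcome in outcomes if 6 in outcome)
    return Fraction(hits, len(outcomes))

def at_least_one_six_complement(rolls):
    """Use P(E) = 1 - P(E^c): 5**rolls of the 6**rolls outcomes have no six."""
    return 1 - Fraction(5 ** rolls, 6 ** rolls)

print(at_least_one_six_direct(4))      # 671/1296
print(at_least_one_six_complement(4))  # 671/1296
```

The direct version has to walk through all 1296 outcomes, while the complement version is a single subtraction.
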
Combinatorics: The Art of Counting

How many ways can you arrange a 10-song Spotify playlist? What are the odds of cracking a 4-digit PIN? When the sample space is too large to list manually, we use Combinatorics.

The Fundamental Counting Principle

If there are $n$ ways to do one thing and $m$ ways to do another, there are $n \times m$ ways to do both. This "multiplication of choices" is the engine behind all counting formulas.
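
The principle can be checked by listing the combined choices explicitly. In this sketch, `itertools.product` builds every pairing; the shirts and pants are made-up example data, and the PIN count reuses the 4-digit PIN from the introduction to this section.

```python
from itertools import product

# Hypothetical wardrobe: n = 3 shirts, m = 2 pairs of pants.
shirts = ["red", "blue", "green"]
pants = ["jeans", "chinos"]

outfits = list(product(shirts, pants))
print(len(outfits))  # 6, which is 3 * 2

# Applying the principle four times: 10 choices for each digit of a 4-digit PIN.
digits = "0123456789"
pins = list(product(digits, repeat=4))
print(len(pins))     # 10000, which is 10 * 10 * 10 * 10
```
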

$$\binom{n}{k} = \frac{n!}{k!(n-k)!}$$

| Method | Formula | Order Matters? | Replacement? | Intuition |
| --- | --- | --- | --- | --- |
| Permutations | $n^k$ | Yes | Yes | Digital PIN Codes or Passwords |
| Permutations (Strict) | $\frac{n!}{(n-k)!}$ | Yes | No | Race Finishers (1st, 2nd, 3rd) |
| Combinations | $\binom{n}{k}$ | No | No | Lottery Numbers / Poker Hands |
| Stars & Bars | $\binom{n+k-1}{k}$ | No | Yes | Distributing identical items (e.g. cookies) to distinct buckets (e.g. kids) |

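
The four counting formulas can be cross-checked against explicit enumeration. This is a sketch under the assumption that $n$ is the number of distinct items to choose from and $k$ is the number of picks; the function names `by_formula` and `by_enumeration` are illustrative. `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)

def by_formula(n, k):
    """Closed-form counts for choosing k items out of n."""
    return {
        "ordered, with replacement": n ** k,
        "ordered, no replacement": math.perm(n, k),        # n! / (n-k)!
        "unordered, no replacement": math.comb(n, k),      # n choose k
        "unordered, with replacement": math.comb(n + k - 1, k),
    }

def by_enumeration(n, k):
    """The same counts, obtained by generating every selection."""
    items = range(n)
    return {
        "ordered, with replacement": sum(1 for _ in product(items, repeat=k)),
        "ordered, no replacement": sum(1 for _ in permutations(items, k)),
        "unordered, no replacement": sum(1 for _ in combinations(items, k)),
        "unordered, with replacement": sum(
            1 for _ in combinations_with_replacement(items, k)),
    }

print(by_formula(5, 3))
# {'ordered, with replacement': 125, 'ordered, no replacement': 60,
#  'unordered, no replacement': 10, 'unordered, with replacement': 35}
print(by_formula(5, 3) == by_enumeration(5, 3))  # True
```
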
Example
The Birthday Paradox

In a room of just 23 people, there is a 50.7% chance that at least two people share a birthday. This is counter-intuitive because our brains think about the chance of someone sharing our birthday (which is low), rather than any two people sharing any birthday. There are $\binom{23}{2} = 253$ possible pairs of people in that room, giving many opportunities for a match!
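
The 50.7% figure can be reproduced with the complement rule: compute the chance that all birthdays are distinct, then subtract from 1. This is a minimal sketch, assuming 365 equally likely birthdays (no leap years, no seasonal effects) and people whose birthdays do not influence one another; the function name `p_shared_birthday` is hypothetical.

```python
import math

def p_shared_birthday(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    if people > days:
        return 1.0  # more people than days guarantees a match
    p_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

print(round(p_shared_birthday(23), 4))  # 0.5073
print(math.comb(23, 2))                 # 253 possible pairs
```

Under these assumptions, 23 is the smallest group size for which the probability exceeds one half; with 22 people it is still below 50%.
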

The Password Problem

Difficulty: medium

A hacker knows your 4-digit PIN uses only distinct digits (no repeats) and definitely starts with a '7'. How many possible PINs must they test?
