How does knowing it's cloudy change the chance of rain? If you know a person likes sci-fi movies, does that change the chance they like video games? Probability is not just about fixed numbers; it's about information.
Definition
Conditional Probability
The probability of A given that B has already occurred is:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
We are essentially shrinking the "universe" from the full Ω to just the subset where B happened. P(B) must be greater than zero for this to be defined.
Intuition
The Restricted Universe
Think of the sample space as a classroom. If I ask "What is the probability of picking someone taller than 6ft?", my universe is the whole class. If I ask "What is the probability they are taller than 6ft given they play basketball?", I've thrown out everyone who doesn't play basketball. My "universe" is now much smaller (the basketball players), and the probability of being tall is likely much higher within that group.
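The same "restricted universe" idea can be checked by counting. The sketch below is illustrative only: it swaps the classroom for two fair dice (an assumption, not part of the lesson's example), and the names `omega`, `prob`, `A`, and `B` are made up for the demonstration. It computes P(A∣B) twice, once by discarding every outcome where B did not happen, and once by applying the definition on the full sample space.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered outcomes of two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event, universe):
    """Probability of `event` (a predicate) within an equally likely `universe`."""
    favorable = [w for w in universe if event(w)]
    return Fraction(len(favorable), len(universe))

def A(w):
    return w[0] + w[1] == 8   # the sum is 8

def B(w):
    return w[0] % 2 == 0      # the first die is even

# Route 1: shrink the universe to the outcomes where B happened, then count A.
restricted = [w for w in omega if B(w)]
p_restricted = prob(A, restricted)

# Route 2: apply the definition P(A|B) = P(A and B) / P(B) on the full space.
p_formula = prob(lambda w: A(w) and B(w), omega) / prob(B, omega)

print(prob(A, omega), p_restricted, p_formula)  # 5/36 1/6 1/6
```

Both routes agree, and knowing B moves the probability of A from 5/36 to 1/6.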
Multiplication Rule
By rearranging the definition, we get the rule for finding the probability that both A and B happen:
P(A∩B)=P(B)P(A∣B)=P(A)P(B∣A)
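As a small worked case of the multiplication rule, consider drawing two cards without replacement and asking for the probability that both are aces. The card scenario and the variable names are assumptions chosen for illustration. The sketch multiplies P(A) by P(B∣A) and then confirms the result by brute-force enumeration.

```python
from fractions import Fraction
from itertools import permutations

# Multiplication rule: P(A and B) = P(A) * P(B | A)
p_first_ace = Fraction(4, 52)               # P(A): 4 aces among 52 cards
p_second_ace_given_first = Fraction(3, 51)  # P(B | A): 3 aces left among 51 cards
p_both_aces = p_first_ace * p_second_ace_given_first

# Sanity check: enumerate every ordered pair of distinct cards.
deck = ["ace"] * 4 + ["other"] * 48
pairs = list(permutations(range(52), 2))
both = sum(1 for i, j in pairs if deck[i] == "ace" and deck[j] == "ace")
p_enumerated = Fraction(both, len(pairs))

print(p_both_aces, p_enumerated)  # 1/221 1/221
```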
Independence: The Absence of Information
Events A and B are Independent if knowing B happened tells you absolutely nothing about A.
Common pitfall
Common Confusion: Independent vs. Mutually Exclusive
This is the most common trap for students!
Independent: P(A∣B)=P(A). One event has no influence on the other. (e.g., Tossing a coin twice).
Mutually Exclusive (Disjoint): P(A∩B)=0. If A happens, B cannot happen. As long as both events have nonzero probability, they are highly dependent! (e.g., Being in London and New York at the same time). If you know you are in London, the probability you are in New York is 0.
Defining Independence Mathematically
Two events A and B are independent if and only if:
P(A∩B)=P(A)P(B)
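The product test makes the difference between "independent" and "mutually exclusive" easy to verify numerically. The following is a minimal sketch on two fair dice. The dice setup and the helper names `prob` and `is_independent` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered outcomes of two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of `event` (a predicate on outcomes)."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def is_independent(a, b):
    """Product test: P(A and B) == P(A) * P(B)."""
    return prob(lambda w: a(w) and b(w)) == prob(a) * prob(b)

def first_even(w):
    return w[0] % 2 == 0

def second_six(w):
    return w[1] == 6

def first_one(w):
    return w[0] == 1

def first_two(w):
    return w[0] == 2

# Different dice: one outcome carries no information about the other.
print(is_independent(first_even, second_six))  # True

# Mutually exclusive events on the same die: P(A and B) = 0, but P(A)P(B) = 1/36.
print(is_independent(first_one, first_two))    # False
```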
Breaking Down Complexity: Law of Total Probability
Calculating a probability directly is often hard. The Law of Total Probability (LOTP) lets us break the world into distinct "scenarios" (a partition) and solve each one individually.
Intuition
The Partition Strategy
If you have a set of mutually exclusive events $B_1, B_2, \dots, B_n$ that cover the whole sample space (a partition), then for any event A:
$$P(A) = \sum_{i=1}^{n} P(A \cap B_i) = \sum_{i=1}^{n} P(A \mid B_i)\, P(B_i)$$
Email Filter Logic
Using LOTP to find the total probability an email is 'Spam' based on keywords.
Contains keyword "Free" → Is actually Spam: path probability 0.18
Contains keyword "Free" → Is actually Not Spam: path probability 0.02
Does Not Contain keyword "Free" → Is actually Spam: path probability 0.04
Does Not Contain keyword "Free" → Is actually Not Spam: path probability 0.76
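The four path probabilities are joint probabilities of the form P(Spam ∩ keyword scenario), so the total probability of Spam is the sum over the two Spam paths. The sketch below does that sum and then recomputes it in the conditional form P(A∣Bᵢ)P(Bᵢ). The dictionary layout and key names are assumptions made for illustration; only the four numbers come from the example.

```python
from fractions import Fraction

# Joint path probabilities from the tree: (keyword scenario, label) -> probability.
joint = {
    ("free", "spam"): Fraction("0.18"),
    ("free", "not_spam"): Fraction("0.02"),
    ("no_free", "spam"): Fraction("0.04"),
    ("no_free", "not_spam"): Fraction("0.76"),
}
assert sum(joint.values()) == 1  # the four paths cover every case

# Form 1: sum the joint probabilities P(Spam and B_i) over the partition.
p_spam = sum(p for (_, label), p in joint.items() if label == "spam")

# Form 2: sum P(Spam | B_i) * P(B_i) over the partition.
p_spam_lotp = 0
for keyword in ("free", "no_free"):
    p_keyword = sum(p for (k, _), p in joint.items() if k == keyword)
    p_spam_given_keyword = joint[(keyword, "spam")] / p_keyword
    p_spam_lotp += p_spam_given_keyword * p_keyword

print(float(p_spam), float(p_spam_lotp))  # 0.22 0.22
```

With these numbers, P(Spam∣"Free") = 0.18/0.20 = 0.9 and P(Spam∣no "Free") = 0.04/0.80 = 0.05, so both forms give P(Spam) = 0.22.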
Bayes' Theorem: The Logic of Science
Bayes' Theorem is the mathematical engine behind medical diagnosis, spam filters, and self-driving cars. It allows us to "reverse" conditional probabilities: going from P(Effect∣Cause) to P(Cause∣Effect).
Prior (P(C)): What did we believe before the evidence?
Likelihood (P(E∣C)): How well does the evidence match our theory?
Evidence (P(E)): How likely is the evidence in general?
Posterior (P(C∣E)): What should we believe now?
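For reference, these four quantities combine in the standard statement of Bayes' Theorem, written here with C for the cause and E for the evidence. The second form expands the evidence term with the Law of Total Probability over C and its complement $C^c$:

```latex
P(C \mid E) = \frac{P(E \mid C)\, P(C)}{P(E)}
            = \frac{P(E \mid C)\, P(C)}{P(E \mid C)\, P(C) + P(E \mid C^c)\, P(C^c)}
```

In words: posterior = likelihood × prior ÷ evidence.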
The Rare Disease Paradox
A disease affects 1% of the population. A test is "95% accurate": it correctly flags 95% of people who have the disease and wrongly flags 5% of healthy people. You test positive. Should you panic?
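Let D = Disease, T = Test Positive.
1. Prior: $P(D) = 0.01$ (so $P(D^c) = 0.99$).
2. Likelihoods: $P(T \mid D) = 0.95$ and $P(T \mid D^c) = 0.05$.
3. Total probability of testing positive:
$$P(T) = P(T \mid D)\,P(D) + P(T \mid D^c)\,P(D^c) = (0.95 \cdot 0.01) + (0.05 \cdot 0.99) = 0.0095 + 0.0495 = 0.059$$
4. Posterior, by Bayes' Theorem:
$$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T)} = \frac{0.0095}{0.059} \approx 0.161$$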
Interpretation: Even though the test is '95% accurate', there is only about a 16.1% chance you have the disease. Why? Because the disease is so rare that most positive results (0.0495 out of 0.059, roughly 84%) are 'False Positives' from the 99% of the population that is healthy.
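The calculation is short enough to wrap in a function, which makes it easy to see how strongly the answer depends on the prior. This is a minimal sketch. The function name `posterior` and its parameter names are hypothetical, and the 10% prior in the second call is an assumed value added only for comparison.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(Disease | Positive) via Bayes' Theorem.

    prior               -- P(D), the base rate of the disease
    sensitivity         -- P(T | D), chance a sick person tests positive
    false_positive_rate -- P(T | not D), chance a healthy person tests positive
    """
    # Evidence P(T) from the Law of Total Probability.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# The numbers from the example: 1% prevalence, 95% sensitivity, 5% false positives.
p_rare = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p_rare, 3))    # 0.161

# Same test, but an assumed 10% prevalence: the posterior rises sharply.
p_common = posterior(prior=0.10, sensitivity=0.95, false_positive_rate=0.05)
print(round(p_common, 3))  # 0.679
```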