Updating Your Beliefs: Conditional Probability

How does knowing it's cloudy change the chance of rain? If you know a person likes sci-fi movies, does that change the chance they like video games? Probability is not just about fixed numbers; it's about information.

Definition
Conditional Probability

The probability of $A$ given that $B$ has already occurred is: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

We are essentially shrinking the "universe" from the full $\Omega$ to just the subset where $B$ happened. $P(B)$ must be greater than zero for this to be defined.
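
As a minimal sketch of the definition, the snippet below computes a conditional probability by counting outcomes. The fair six-sided die, the events `A` and `B`, and the helper names `prob` and `cond_prob` are illustrative assumptions, not part of the lesson.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

def prob(event, universe=omega):
    """Probability of `event` when every outcome in `universe` is equally likely."""
    return Fraction(len(event & universe), len(universe))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) = 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return prob(a & b) / p_b

A = {2}        # the roll is a 2
B = {2, 4, 6}  # the roll is even

print(prob(A))              # 1/6
print(cond_prob(A, B))      # 1/3
print(prob(A, universe=B))  # 1/3, the same number: B has become the universe
```

The last line shows the "shrinking universe" reading directly: counting within `B` alone gives the same answer as the formula.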

Intuition
The Restricted Universe

Think of the sample space as a classroom. If I ask "What is the probability of picking someone taller than 6ft?", my universe is the whole class. If I ask "What is the probability they are taller than 6ft given they play basketball?", I've thrown out everyone who doesn't play basketball. My "universe" is now much smaller (the basketball players), and the probability of being tall is likely much higher within that group.

Multiplication Rule

By rearranging the definition, we get the rule for finding the probability that both $A$ and $B$ happen:

$P(A \cap B) = P(B)P(A \mid B) = P(A)P(B \mid A)$
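
To make the rule concrete, here is a small sketch using an assumed example that is not from the lesson: drawing two aces in a row from a standard 52-card deck without replacement. It applies the multiplication rule, then checks the result by counting every ordered pair of draws.

```python
from fractions import Fraction
from itertools import permutations

# Multiplication rule: P(first ace and second ace)
#   = P(first ace) * P(second ace | first ace)
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)  # one ace and one card are gone
p_both = p_first_ace * p_second_ace_given_first

# Brute-force check: enumerate all ordered two-card draws.
deck = ["A"] * 4 + ["x"] * 48  # only "ace" vs "not ace" matters here
draws = list(permutations(range(52), 2))
both_aces = sum(1 for i, j in draws if deck[i] == "A" and deck[j] == "A")
p_both_counted = Fraction(both_aces, len(draws))

print(p_both)          # 1/221
print(p_both_counted)  # 1/221
```
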
Independence: The Absence of Information

Events $A$ and $B$ are independent if knowing $B$ happened tells you absolutely nothing about $A$.

Common pitfall
Common Confusion: Independent vs. Mutually Exclusive

This is the most common trap for students!

  • Independent: $P(A \mid B) = P(A)$. One event has no influence on the other (e.g., tossing a coin twice).
  • Mutually Exclusive (Disjoint): $P(A \cap B) = 0$. If $A$ happens, $B$ cannot happen. As long as both events have nonzero probability, they are highly dependent! (e.g., being in London and New York at the same time). If you know you are in London, the probability you are in New York is 0.

Defining Independence Mathematically

Two events $A$ and $B$ are independent if and only if:

$P(A \cap B) = P(A)P(B)$
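
The product test can be checked mechanically. The sketch below uses two tosses of a fair coin as the sample space; the event names and the `independent` helper are assumptions chosen for illustration. It also shows that a pair of mutually exclusive events fails the test.

```python
from fractions import Fraction
from itertools import product

# Sample space: two tosses of a fair coin, all four outcomes equally likely.
omega = set(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a subset of omega."""
    return Fraction(len(event), len(omega))

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

first_heads = {o for o in omega if o[0] == "H"}
second_heads = {o for o in omega if o[1] == "H"}
first_tails = {o for o in omega if o[0] == "T"}

print(independent(first_heads, second_heads))  # True: separate tosses
print(independent(first_heads, first_tails))   # False: disjoint, so dependent
print(prob(first_heads & first_tails))         # 0
```
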
Breaking Down Complexity: Law of Total Probability

Calculating a probability directly is often hard. The Law of Total Probability (LOTP) lets us break the world into distinct "scenarios" (a partition) and solve each one individually.

Intuition
The Partition Strategy

If you have a set of mutually exclusive events $B_1, B_2, \ldots, B_n$ that cover the whole sample space (a partition), then for any event $A$: $P(A) = \sum_{i=1}^n P(A \cap B_i) = \sum_{i=1}^n P(A \mid B_i)P(B_i)$

Email Filter Logic

Using LOTP to find the total probability an email is 'Spam' based on keywords.

Tree diagram: Incoming Email
  • Contains keyword "Free" (p = 0.2) → Is actually Spam (p = 0.9). Path probability: 0.18
  • Contains keyword "Free" (p = 0.2) → Is actually Not Spam (p = 0.1). Path probability: 0.02
  • Does Not Contain keyword "Free" (p = 0.8) → Is actually Spam (p = 0.05). Path probability: 0.04
  • Does Not Contain keyword "Free" (p = 0.8) → Is actually Not Spam (p = 0.95). Path probability: 0.76
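
The total probability of spam is the sum of the two paths that end in "Spam". A minimal sketch of that calculation, using the numbers from the tree above (the dictionary layout and variable names are illustrative assumptions):

```python
from fractions import Fraction

# Each scenario maps to (P(scenario), P(spam | scenario)), taken from the tree.
scenarios = {
    'contains "Free"': (Fraction("0.2"), Fraction("0.9")),
    'does not contain "Free"': (Fraction("0.8"), Fraction("0.05")),
}

# A partition must cover the whole sample space, so the weights sum to 1.
assert sum(p_b for p_b, _ in scenarios.values()) == 1

# LOTP: P(spam) = sum over scenarios of P(spam | scenario) * P(scenario)
p_spam = sum(p_b * p_spam_given_b for p_b, p_spam_given_b in scenarios.values())

print(p_spam, float(p_spam))  # 11/50 0.22
```

Exact fractions are used here so the path probabilities 0.18 and 0.04 add to 0.22 without floating-point rounding.
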
Bayes' Theorem: The Logic of Science

Bayes' Theorem is the mathematical engine behind medical diagnosis, spam filters, and self-driving cars. It allows us to "reverse" conditional probabilities: going from $P(\text{Effect} \mid \text{Cause})$ to $P(\text{Cause} \mid \text{Effect})$.

$P(\text{Cause} \mid \text{Effect}) = \frac{P(\text{Effect} \mid \text{Cause}) \cdot P(\text{Cause})}{P(\text{Effect})}$

Intuition
Bayesian Thinking
  • Prior ($P(C)$): What did we believe before the evidence?
  • Likelihood ($P(E \mid C)$): How well does the evidence match our theory?
  • Evidence ($P(E)$): How likely is the evidence in general?
  • Posterior ($P(C \mid E)$): What should we believe now?
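
The four quantities can be wired together in a few lines. The sketch below reuses the email-filter numbers from the Law of Total Probability section and reverses the tree: it starts from P(Spam | contains "Free") and recovers P(contains "Free" | Spam). Treating the keyword as the "cause" and spam as the "effect" is only a labeling assumption for this illustration, and the `bayes` helper is hypothetical.

```python
from fractions import Fraction

def bayes(prior, likelihood, evidence):
    """Posterior P(C | E) = P(E | C) * P(C) / P(E)."""
    if evidence == 0:
        raise ValueError("P(E) must be greater than zero")
    return likelihood * prior / evidence

prior = Fraction("0.2")                    # P(contains "Free")
likelihood = Fraction("0.9")               # P(Spam | contains "Free")
p_spam_given_no_free = Fraction("0.05")    # P(Spam | does not contain "Free")

# Evidence via the Law of Total Probability: P(Spam) = 0.18 + 0.04
evidence = likelihood * prior + p_spam_given_no_free * (1 - prior)

posterior = bayes(prior, likelihood, evidence)  # P(contains "Free" | Spam)
print(evidence)   # 11/50
print(posterior)  # 9/11
```

Under these numbers, seeing that an email is spam raises the probability that it contains "Free" from the prior of 0.2 to 9/11, roughly 0.82.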

The Rare Disease Paradox

Difficulty: medium

A disease affects 1% of the population. A test is 95% accurate: it catches 95% of people who have the disease and wrongly flags 5% of healthy people. You test positive. Should you panic?
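
One way to test your intuition is to count people instead of manipulating percentages. The sketch below assumes a hypothetical population of 10,000 and reads "95% accurate" as a 95% detection rate and a 5% false-positive rate, as stated in the problem. All variable names are illustrative.

```python
from fractions import Fraction

population = 10_000                   # hypothetical population size

sick = population * 1 // 100          # 1% have the disease   -> 100
healthy = population - sick           #                       -> 9,900

true_positives = sick * 95 // 100     # 95% of sick are caught -> 95
false_positives = healthy * 5 // 100  # 5% of healthy flagged  -> 495

all_positives = true_positives + false_positives  # -> 590

# P(disease | positive) = true positives / all positives
p_sick_given_positive = Fraction(true_positives, all_positives)

print(p_sick_given_positive)                   # 19/118
print(round(float(p_sick_given_positive), 3))  # 0.161
```

Under these assumptions, most positive results come from the much larger healthy group, so a positive test implies roughly a 16% chance of having the disease rather than 95%.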