The real world rarely involves just one variable in isolation. To understand a patient's health, a doctor looks at both Blood Pressure ($X$) and Cholesterol ($Y$). To predict a stock's behavior, an analyst looks at the Asset Price ($S$) and its Volatility ($\sigma$).
When variables interact, we use a Joint Distribution to describe their combined behavior.
For two continuous variables, the Joint PDF ($f_{X,Y}(x, y)$) is like a mountain range on a map. The height of the mountain at any point shows how likely that specific combination is relative to others. The volume under the mountain over a region $A$ is the probability of falling in that area:

$$P\big((X, Y) \in A\big) = \iint_A f_{X,Y}(x, y)\, dx\, dy$$
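The "volume under the mountain" can be approximated numerically. The sketch below is only an illustration under assumptions not found in the text: it takes the joint PDF to be that of two independent standard normal variables, and the helper names `joint_pdf` and `region_probability` are hypothetical. It sums the density over a fine grid of small rectangles (a midpoint Riemann sum).

```python
import numpy as np

def joint_pdf(x, y):
    """Joint PDF of two independent standard normals (an illustrative assumption)."""
    return np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)

def region_probability(pdf, x_lo, x_hi, y_lo, y_hi, n=400):
    """Approximate the volume under `pdf` over a rectangle with a midpoint Riemann sum."""
    dx = (x_hi - x_lo) / n
    dy = (y_hi - y_lo) / n
    xs = x_lo + (np.arange(n) + 0.5) * dx  # midpoints along x
    ys = y_lo + (np.arange(n) + 0.5) * dy  # midpoints along y
    grid_x, grid_y = np.meshgrid(xs, ys)
    return float(np.sum(pdf(grid_x, grid_y)) * dx * dy)

# Probability that both variables land between 0 and 1.
p_square = region_probability(joint_pdf, 0.0, 1.0, 0.0, 1.0)
# Volume over a region wide enough to hold almost all of the mass.
p_total = region_probability(joint_pdf, -8.0, 8.0, -8.0, 8.0)

print(f"P(0 <= X <= 1, 0 <= Y <= 1) is about {p_square:.4f}")
print(f"Total volume is about {p_total:.4f}")
```

For this particular density, the first value should come out near 0.1165 and the second near 1, which matches the requirement that the whole mountain range has volume 1.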
Independence in Higher Dimensions
Two random variables $X$ and $Y$ are independent if and only if their joint distribution is simply the product of their individual (marginal) distributions:

$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y) \quad \text{for all } x, y$$
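For discrete variables, the analogous condition is that every cell of the joint probability table equals the product of its row and column totals. A minimal sketch of that check follows; the function name `is_independent` and both example tables are hypothetical and chosen only for illustration.

```python
import numpy as np

def is_independent(joint, tol=1e-9):
    """Return True if every cell equals (row marginal) * (column marginal)."""
    joint = np.asarray(joint, dtype=float)
    row_marginal = joint.sum(axis=1)   # sum across columns
    col_marginal = joint.sum(axis=0)   # sum across rows
    product = np.outer(row_marginal, col_marginal)
    return bool(np.allclose(joint, product, atol=tol))

# Built as a product of marginals, so it is independent by construction.
independent_table = np.outer([0.3, 0.7], [0.4, 0.6])

# Mass concentrated on the diagonal: knowing one variable tells you about the other.
dependent_table = np.array([[0.4, 0.1],
                            [0.1, 0.4]])

print(is_independent(independent_table))
print(is_independent(dependent_table))
```

The tolerance `tol` is there because floating-point products rarely match exactly; with real data you would use a statistical test rather than an exact comparison.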
If you have a joint distribution of Height and Weight, but you only care about Height, you compute the Marginal Distribution of Height. You "integrate out" Weight (or, for discrete classes, sum over all its values) so that only the Height distribution remains.
| Weight Class (Y) \ Height Class (X) | Short | Avg | Tall | P(Weight Class (Y)) |
|---|---|---|---|---|
| Under | 0.15 | 0.05 | 0.01 | 0.21 |
| Avg | 0.05 | 0.4 | 0.05 | 0.5 |
| Over | 0.01 | 0.05 | 0.23 | 0.29 |
| P(Height Class (X)) | 0.21 | 0.5 | 0.29 | 1 |
Total probability = 1
In the table above, the probability of being Tall and Overweight is $P(X = \text{Tall}, Y = \text{Over}) = 0.23$. To find the marginal probability of being Tall (regardless of weight), we sum the 'Tall' column: $P(X = \text{Tall}) = 0.01 + 0.05 + 0.23 = 0.29$.
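The same sums can be reproduced in a few lines. This sketch copies the joint probabilities from the table (rows are weight classes, columns are height classes); the variable names are illustrative.

```python
import numpy as np

height_classes = ["Short", "Avg", "Tall"]   # columns (X)
weight_classes = ["Under", "Avg", "Over"]   # rows (Y)

# Joint probabilities copied from the table above.
joint = np.array([
    [0.15, 0.05, 0.01],   # Under
    [0.05, 0.40, 0.05],   # Avg
    [0.01, 0.05, 0.23],   # Over
])

# Marginal of Height: sum each column over all weight classes.
p_height = joint.sum(axis=0)
# Marginal of Weight: sum each row over all height classes.
p_weight = joint.sum(axis=1)

# A single joint cell: Tall and Over.
p_tall_and_over = joint[weight_classes.index("Over"), height_classes.index("Tall")]

for name, p in zip(height_classes, p_height):
    print(f"P(Height = {name}) = {p:.2f}")
print(f"P(Tall, Over) = {p_tall_and_over:.2f}")
```

Note that $P(\text{Tall}) \cdot P(\text{Over}) = 0.29 \times 0.29 = 0.0841$, which is far from the joint value of 0.23, so Height and Weight are not independent in this table.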
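How do we measure if two variables move together? If $X$ goes up, does $Y$ usually go up too? Covariance answers this by averaging the product of each variable's deviation from its own mean:

$$\operatorname{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big]$$

- Positive Covariance: When $X$ is above its mean, $Y$ tends to be above its mean too (e.g., Height and Weight).
- Negative Covariance: When $X$ is above its mean, $Y$ tends to be below its mean (e.g., Time spent gaming and Exam scores).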
- Zero Covariance: No linear relationship exists between the two.
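The three cases can be seen on small samples. In the sketch below, all data values are made up for illustration, and `covariance` is a hypothetical helper that uses the population form (dividing by $n$); note that `np.cov` divides by $n - 1$ unless `bias=True` is passed.

```python
import numpy as np

def covariance(x, y):
    """Population covariance: the mean product of deviations from each mean."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean((x - x.mean()) * (y - y.mean())))

# Hypothetical samples (not taken from the text).
heights = [150, 160, 170, 180, 190]      # cm
weights = [50, 58, 66, 79, 87]           # kg
gaming_hours = [0, 1, 2, 3, 4]
exam_scores = [90, 85, 80, 70, 65]
coin_a = [1, 2, 1, 2]                    # every combination appears once,
coin_b = [1, 1, 2, 2]                    # so neither value predicts the other

cov_positive = covariance(heights, weights)
cov_negative = covariance(gaming_hours, exam_scores)
cov_zero = covariance(coin_a, coin_b)

print(cov_positive, cov_negative, cov_zero)
```

Only the sign is easy to read here; the magnitudes depend on the units of each variable.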
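Because covariance is hard to interpret (its units are 'Height × Weight'), we use the Pearson Correlation ($\rho$), which divides the covariance by both standard deviations and so rescales it to a fixed, unit-free range between $-1$ and $1$:

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X\, \sigma_Y}$$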
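A quick way to see why the rescaling matters is to change the units of one variable. In this sketch the five data points are hypothetical and `cov_and_corr` is an illustrative helper: converting heights from centimetres to metres shrinks the covariance by a factor of 100 but leaves the correlation unchanged.

```python
import numpy as np

# Hypothetical sample of five people (illustrative values only).
height_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
weight_kg = np.array([50.0, 58.0, 66.0, 79.0, 87.0])

def cov_and_corr(x, y):
    """Return (population covariance, Pearson correlation) of two samples."""
    dx, dy = x - x.mean(), y - y.mean()
    cov = np.mean(dx * dy)
    corr = cov / (x.std() * y.std())  # np.std defaults to the population form
    return float(cov), float(corr)

cov_cm, corr_cm = cov_and_corr(height_cm, weight_kg)
cov_m, corr_m = cov_and_corr(height_cm / 100.0, weight_kg)  # same heights in metres

print(f"covariance:  {cov_cm:.3f} (cm) vs {cov_m:.3f} (m)")
print(f"correlation: {corr_cm:.4f} (cm) vs {corr_m:.4f} (m)")
```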
Correlation Strength
- $\rho = 1$: Perfect positive linear relationship.
- $\rho = -1$: Perfect negative linear relationship.
- $\rho = 0$: No linear relationship (but watch out: the variables can still be related in a nonlinear way!).
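The three landmark values can be reproduced with a few hand-picked points. The sketch below uses a hypothetical helper `pearson` and made-up data; the last case shows why zero correlation deserves caution, since $y = x^2$ is fully determined by $x$ yet has no linear trend over a symmetric range.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

rho_up = pearson(x, 2 * x + 3)     # exact increasing line
rho_down = pearson(x, -x)          # exact decreasing line
rho_parabola = pearson(x, x**2)    # perfectly dependent, but not linearly

print(rho_up, rho_down, rho_parabola)
```

NumPy's built-in `np.corrcoef(x, y)[0, 1]` computes the same quantity and is the more usual choice in practice.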