Multivariate Random Variables

The Interactive World: Joint Distributions

The real world rarely involves just one variable in isolation. To understand a patient's health, a doctor looks at both Blood Pressure ( $X$ ) and Cholesterol ( $Y$ ). To predict a stock's behavior, an analyst looks at the Asset Price ( $X$ ) and its Volatility ( $Y$ ).

When variables interact, we use a Joint Distribution to describe their combined behavior.

✦Intuition

The 3D Probability Surface

For two continuous variables, the Joint PDF ( $f(x,y)$ ) is like a mountain range on a map. The height of the mountain at any point $(x,y)$ shows how densely probability is concentrated around that combination (the height is a density, not a probability). The volume under the mountain over a region $A$ is the probability of landing in that region: $P((X, Y) \in A) = \iint_A f(x,y) \, dx \, dy$
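To make the "volume under the mountain" idea concrete, here is a minimal sketch that approximates the double integral with a midpoint Riemann sum. The density used (two independent standard normal variables) and the helper names `joint_pdf`, `prob_in_rectangle`, and `std_normal_cdf` are assumptions chosen for illustration, not part of the lesson. For this particular density a closed-form answer is available, so the estimate can be checked against it.

```python
import math


def joint_pdf(x, y):
    """Joint PDF of two independent standard normals (illustrative choice)."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)


def prob_in_rectangle(pdf, x_lo, x_hi, y_lo, y_hi, n=400):
    """Approximate the volume under pdf over a rectangle (midpoint Riemann sum)."""
    dx = (x_hi - x_lo) / n
    dy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx  # midpoint of the i-th strip in x
        for j in range(n):
            y = y_lo + (j + 0.5) * dy  # midpoint of the j-th strip in y
            total += pdf(x, y)
    return total * dx * dy


def std_normal_cdf(z):
    """CDF of a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# P(0 <= X <= 1, 0 <= Y <= 1): numeric volume vs. closed form for this density
approx = prob_in_rectangle(joint_pdf, 0.0, 1.0, 0.0, 1.0)
exact = (std_normal_cdf(1.0) - std_normal_cdf(0.0)) ** 2

print(f"Riemann-sum estimate: {approx:.6f}")
print(f"Closed-form value:    {exact:.6f}")
```

The two printed numbers should agree to several decimal places. Widening the rectangle until it covers essentially the whole plane pushes the estimate toward 1, the total volume under any joint PDF.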

Independence in Higher Dimensions

Two random variables $X$ and $Y$ are independent if and only if their joint distribution is simply the product of their individual (marginal) distributions:

$$f(x,y) = f_X(x) \cdot f_Y(y) \quad \text{for all } x, y$$

Marginal Distributions: Zooming In

If you have a joint distribution of Height and Weight, but you only care about Height, you compute the Marginal Distribution. You "integrate out" (or sum up) all the information about Weight to see only the Height distribution.

$$f_X(x) = \int_{-\infty}^{\infty} f(x,y) \, dy \quad \text{and} \quad f_Y(y) = \int_{-\infty}^{\infty} f(x,y) \, dx$$
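A short worked example may help here. The density below is an assumption chosen because the integrals are easy; it does not come from the lesson. Take $f(x,y) = x + y$ on the unit square $0 \le x \le 1$, $0 \le y \le 1$ (and 0 elsewhere). It is a valid joint PDF because it is non-negative and integrates to 1. Integrating out $y$ gives the marginal of $X$:

$$f_X(x) = \int_0^1 (x + y) \, dy = x + \tfrac{1}{2}, \qquad 0 \le x \le 1$$

By symmetry, $f_Y(y) = y + \tfrac{1}{2}$. The product of the marginals is $\left(x + \tfrac{1}{2}\right)\left(y + \tfrac{1}{2}\right) = xy + \tfrac{x}{2} + \tfrac{y}{2} + \tfrac{1}{4}$, which is not equal to $x + y$ (at $(0,0)$ the product is $\tfrac{1}{4}$ while the joint density is 0), so these two variables are not independent.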

Weight Class (Y) \ Height Class (X)	Short	Avg	Tall	P(Weight Class (Y))
Under	0.15	0.05	0.01	0.21
Avg	0.05	0.4	0.05	0.5
Over	0.01	0.05	0.23	0.29
P(Height Class (X))	0.21	0.5	0.29	1

Total probability = 1

Example

Reading the Table

In the table above, the probability of being Tall and Overweight is $P(X=\text{Tall}, Y=\text{Over}) = 0.23$ . To find the marginal probability of being Tall (regardless of weight), we sum the 'Tall' column: $0.01 + 0.05 + 0.23 = 0.29$ .
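The same table lookups can be sketched in code. The dictionary layout and the names `joint`, `p_height`, `p_weight`, and `max_gap` are illustrative assumptions; the probabilities are copied from the table. The sketch sums rows and columns to get the marginals, then checks whether every cell equals the product of its marginals, which is what independence would require.

```python
# Joint probabilities from the table, indexed as joint[weight_class][height_class]
heights = ["Short", "Avg", "Tall"]
weights = ["Under", "Avg", "Over"]
joint = {
    "Under": {"Short": 0.15, "Avg": 0.05, "Tall": 0.01},
    "Avg":   {"Short": 0.05, "Avg": 0.40, "Tall": 0.05},
    "Over":  {"Short": 0.01, "Avg": 0.05, "Tall": 0.23},
}

# Marginals: sum out the variable we do not care about
p_height = {h: sum(joint[w][h] for w in weights) for h in heights}
p_weight = {w: sum(joint[w][h] for h in heights) for w in weights}

print("P(Height):", {h: round(p, 2) for h, p in p_height.items()})
print("P(Weight):", {w: round(p, 2) for w, p in p_weight.items()})

# Independence would need joint[w][h] == p_weight[w] * p_height[h] in every cell
max_gap = max(
    abs(joint[w][h] - p_weight[w] * p_height[h])
    for w in weights
    for h in heights
)
independent = max_gap < 1e-9
print("Largest gap between joint and product of marginals:", round(max_gap, 4))
print("Independent?", independent)
```

For this table the check fails: for example, $P(X=\text{Tall}, Y=\text{Over}) = 0.23$, while $P(X=\text{Tall}) \cdot P(Y=\text{Over}) = 0.29 \times 0.29 \approx 0.084$, so Height Class and Weight Class are dependent.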

Covariance & Correlation: Moving Together

How do we measure if two variables move together? If $X$ goes up, does $Y$ usually go up too?

✦Intuition

The Sign of Covariance

Positive Covariance: When $X$ is above its mean, $Y$ tends to be above its mean too (e.g., Height and Weight).
Negative Covariance: When $X$ is above its mean, $Y$ tends to be below its mean (e.g., Time spent gaming and Exam scores).
Zero Covariance: No linear relationship exists between the two.

$$Cov(X,Y) = E[(X-\mu_X)(Y-\mu_Y)] = E[XY] - E[X]E[Y]$$

Because Covariance is hard to interpret (its units are the product of the two variables' units, e.g. 'Height $\times$ Weight'), we use Pearson Correlation ( $\rho$ ), which divides by both standard deviations to give a unitless value that always lies between $-1$ and $1$ .

$$\rho_{X,Y} = \frac{Cov(X,Y)}{\sigma_X \sigma_Y}$$
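As a rough illustration of both formulas, the sketch below computes covariance and correlation from the Height/Weight table given earlier. Covariance needs numbers, so the classes are coded as 0, 1, 2; that coding, along with the names `height_score`, `weight_score`, and `expectation`, is a hypothetical choice made only for this example.

```python
import math

# Hypothetical numeric coding of the classes (an assumption, not from the lesson)
height_score = {"Short": 0, "Avg": 1, "Tall": 2}
weight_score = {"Under": 0, "Avg": 1, "Over": 2}

# Joint probabilities from the table, indexed as joint[weight_class][height_class]
joint = {
    "Under": {"Short": 0.15, "Avg": 0.05, "Tall": 0.01},
    "Avg":   {"Short": 0.05, "Avg": 0.40, "Tall": 0.05},
    "Over":  {"Short": 0.01, "Avg": 0.05, "Tall": 0.23},
}


def expectation(g):
    """E[g(X, Y)] under the joint table, X = height score, Y = weight score."""
    return sum(
        p * g(height_score[h], weight_score[w])
        for w, row in joint.items()
        for h, p in row.items()
    )


mean_x = expectation(lambda x, y: x)
mean_y = expectation(lambda x, y: y)

# Cov(X, Y) = E[XY] - E[X]E[Y]
cov_xy = expectation(lambda x, y: x * y) - mean_x * mean_y

# Standard deviations from Var(X) = E[X^2] - E[X]^2
sigma_x = math.sqrt(expectation(lambda x, y: x * x) - mean_x ** 2)
sigma_y = math.sqrt(expectation(lambda x, y: y * y) - mean_y ** 2)

rho = cov_xy / (sigma_x * sigma_y)

print(f"Cov(X, Y) = {cov_xy:.4f}")
print(f"rho       = {rho:.4f}")
```

Under this coding the covariance is positive and the correlation comes out at roughly 0.72, matching the intuition that taller classes go with heavier classes. Rescaling the scores (say, 0, 10, 20) would change the covariance but leave $\rho$ unchanged, which is the point of dividing by $\sigma_X \sigma_Y$.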

Correlation Strength

$\rho = 1$ : Perfect positive linear relationship.
$\rho = -1$ : Perfect negative linear relationship.
$\rho = 0$ : No linear relationship (but watch out!).

1. If $X$ and $Y$ are independent (with finite, nonzero variances), their correlation is guaranteed to be 0. This is easy to prove: since $E[XY] = E[X]E[Y]$ for independent variables, $Cov(X,Y) = E[X]E[Y] - E[X]E[Y] = 0$.

2. The Trap: If correlation is 0, it does NOT necessarily mean the variables are independent! Correlation only measures linear relationships.

3. Counter-Example: Let $X$ be any non-constant distribution symmetric around 0 with enough finite moments for the correlation to exist (a finite fourth moment suffices), and let $Y = X^2$. $Y$ is perfectly dependent on $X$: if you know $X$, you know $Y$ exactly. Yet $Cov(X,Y) = E[X^3] - E[X]E[X^2] = 0$, because $E[X] = E[X^3] = 0$ by symmetry. The correlation is 0 because the relationship is a parabola, not a line.

∎
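A small discrete instance of this counter-example can be checked exactly. The specific choice of $X$ uniform on $\{-1, 0, 1\}$ is an assumption made for illustration; any symmetric choice with finite moments would do. Exact fractions are used so that the covariance is exactly 0 rather than a floating-point near-zero.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1}, which is symmetric around 0; Y = X^2
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

e_x = sum(p * x for x, p in pmf_x.items())            # E[X]
e_y = sum(p * x ** 2 for x, p in pmf_x.items())       # E[Y] = E[X^2]
e_xy = sum(p * x * x ** 2 for x, p in pmf_x.items())  # E[XY] = E[X^3]

cov = e_xy - e_x * e_y

# Dependence check on one cell: X = 1 forces Y = 1
p_joint = pmf_x[1]          # P(X = 1, Y = 1)
p_x1 = pmf_x[1]             # P(X = 1)
p_y1 = pmf_x[-1] + pmf_x[1]  # P(Y = 1) = P(X = -1) + P(X = 1)

print("Cov(X, Y) =", cov)
print("P(X=1, Y=1) =", p_joint)
print("P(X=1) * P(Y=1) =", p_x1 * p_y1)
```

The covariance is exactly 0, yet $P(X=1, Y=1) = 1/3$ differs from $P(X=1) \cdot P(Y=1) = 2/9$, so the product rule for independence fails.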
