• Conditional independence is symmetric; that is,
Y ⊥ X | Z ⇐⇒ X ⊥ Y | Z
• Random variables Y and X are conditionally independent given Z if and only if
E(g(X)h(Y) | Z) = E(g(X) | Z) E(h(Y) | Z) (3.23)
for all choices of bounded (Borel measurable) functions g, h : R → R. Thus, if Y ⊥ X | Z, we
have
E(YX | Z) = E(Y | Z) E(X | Z)
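Note that (3.23) only quantifies over bounded g and h, so recovering the product rule for E(YX | Z) itself takes one extra limiting step. A minimal sketch of that step, under the added assumption (not stated above) that X, Y and XY are integrable:

```latex
% From (3.23) to E(YX|Z) = E(Y|Z) E(X|Z); assumes E|X|, E|Y|, E|XY| < \infty.
\begin{align*}
g_n(x) &= x\,\mathbf{1}\{|x| \le n\}, \qquad h_n(y) = y\,\mathbf{1}\{|y| \le n\}
  && \text{bounded truncations of the identity} \\
E\bigl(g_n(X)\,h_n(Y) \mid Z\bigr) &= E\bigl(g_n(X) \mid Z\bigr)\,E\bigl(h_n(Y) \mid Z\bigr)
  && \text{apply (3.23) to } g_n, h_n \\
E(YX \mid Z) &= E(Y \mid Z)\,E(X \mid Z)
  && \text{let } n \to \infty \text{ (dominated convergence)}
\end{align*}
```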
B: Conditional Probability Density Function
If X, Y and Z are jointly continuous with density f(x, y, z), then the random variables Y and X are
said to be conditionally independent given Z, written X ⊥ Y | Z, if and only if, for all x, y and z with
f_Z(z) > 0,
f(x, y | z) = f(x | z) f(y | z)
where the functions f(· | z) are conditional probability density functions. If the random variables are jointly
discrete with probability mass function f(x, y, z) = P(X = x, Y = y, Z = z), the formulas are the same
as in the continuous case, with f(· | ·) now denoting a conditional probability mass function. For example,
f(y | x, z) = P(Y = y | X = x, Z = z)
Other useful equivalent ways to define conditional independence are
X ⊥ Y | Z ⇐⇒ f(x | y, z) = f(x | z) ⇐⇒ f(y | x, z) = f(y | z)
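These equivalences are one-line consequences of the definition of a conditional density; a short derivation for the continuous case, at points where f(x | z) > 0 so the conditional density is defined:

```latex
% Factorization f(x,y|z) = f(x|z) f(y|z) implies f(y|x,z) = f(y|z).
\begin{align*}
f(y \mid x, z)
  &= \frac{f(x, y \mid z)}{f(x \mid z)}
    && \text{definition of conditional density} \\
  &= \frac{f(x \mid z)\, f(y \mid z)}{f(x \mid z)}
    && \text{by } f(x, y \mid z) = f(x \mid z)\, f(y \mid z) \\
  &= f(y \mid z)
\end{align*}
```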
These forms are directly related to the widely used notion of the Markov property: if we interpret Y as the
“future,” X as the “past,” and Z as the “present,” then Y ⊥ X | Z says that, given the present, the future is
independent of the past; this is known as the Markov property.
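To make the Markov reading concrete, here is a small numerical check in Python; the three-state chain, its transition matrix, and the initial distribution are invented purely for this sketch. The conditional PMF of the “future” X2 given the “present” X1 = z and the “past” X0 = x is computed from the joint PMF and compared with f(y | z).

```python
import numpy as np

# Hypothetical three-state chain X0 -> X1 -> X2: the law of the future X2
# given the present X1 = z and the past X0 = x does not depend on x.
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.4, 0.5]])   # row-stochastic transition matrix
pi0 = np.array([0.5, 0.3, 0.2])   # initial distribution of X0

# Joint PMF f(x, z, y) = P(X0 = x) P(X1 = z | X0 = x) P(X2 = y | X1 = z)
joint = pi0[:, None, None] * P[:, :, None] * P[None, :, :]

for z in range(3):
    for x in range(3):
        # f(y | x, z) = f(x, z, y) / f(x, z)
        cond = joint[x, z, :] / joint[x, z, :].sum()
        # equals f(y | z) = P[z, :] regardless of the past state x
        assert np.allclose(cond, P[z, :])
print("f(y | x, z) = f(y | z) for every past state x")
```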
Remark: X and Y are conditionally independent given Z if and only if, given any value of Z, the
probability distribution of X is the same for all values of Y and the probability distribution of Y is the
same for all values of X.
• For conditional PDFs and PMFs, we have
f(x, y | z) = f(x | z) f(y | z) ⇐⇒ f(y | x, z) = f(y | z)
Moreover, if Y ⊥ X | Z, the factorization of f(x, y | z) carries over to conditional expectations:
E(XY | Z) = E(X | Z) E(Y | Z)
and
E(Y | X, Z) = E(Y | Z)
Note that E(Y | X, Z) is not affected by X, since E(Y | X, Z) = E(Y | Z) is a function of Z only.
In particular,
E(Y | X, Z = z) = E(Y | Z = z)
is constant in X: the information carried by X drops out.
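As an empirical sanity check of this identity, here is a short Monte Carlo sketch in NumPy; the model, parameters, and sample size are made up for illustration, with X and Y drawn independently given Z so that Y ⊥ X | Z holds by construction.

```python
import numpy as np

# Monte Carlo sketch (a made-up model, for illustration only): draw Z,
# then draw X and Y independently given Z, so Y ⊥ X | Z holds by
# construction. Empirically, E(Y | X = x, Z = z) should be constant in x
# and match E(Y | Z = z): the information carried by X drops out.
rng = np.random.default_rng(0)
n = 500_000
Z = rng.integers(0, 2, size=n)                     # Z ∈ {0, 1}
X = rng.binomial(3, np.where(Z == 1, 0.7, 0.3))    # X | Z ~ Binomial(3, p_Z)
Y = rng.normal(loc=2.0 * Z, scale=1.0)             # Y | Z ~ N(2Z, 1)

for z in (0, 1):
    in_z = Z == z
    ey_z = Y[in_z].mean()                          # estimate of E(Y | Z = z)
    for x in range(4):
        sel = in_z & (X == x)
        print(f"z={z}, x={x}:  E(Y | X=x, Z=z) ≈ {Y[sel].mean():.3f}"
              f"   vs   E(Y | Z=z) ≈ {ey_z:.3f}")
```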