In this post, I’ll be covering the basics of Multivariate Normal Distributions, with special emphasis on deriving the conditional and marginal distributions.
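Given a random variable $y_i$ under the usual Gauss-Markov assumptions, with $E(\epsilon_i) = 0$ and $\mathrm{Var}(\epsilon_i) = \sigma^2$, Normal errors, and $n$ independent samples $y_1, \dots, y_n$, we can define the vector $y = (y_1, \dots, y_n)^T$ with $y \sim N_n(X\beta, \sigma^2 I_n)$. We can see from the covariance structure of the errors, $\sigma^2 I_n$, that all off-diagonal elements are 0, indicating that our samples are independent with equal variances.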
Marginal Distributions
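Now assume that $x \sim N_p(\mu, \Sigma)$, where $x = (x_1^T, x_2^T)^T$ is partitioned into blocks $x_1 \in \mathbb{R}^{p_1}$ and $x_2 \in \mathbb{R}^{p_2}$ with $p = p_1 + p_2$, $\mu = (\mu_1^T, \mu_2^T)^T$ is partitioned the same way, and $\Sigma$ is an arbitrary covariance matrix, where we cannot assume independence. If $\Sigma$ is non-singular, we can decompose $\Sigma$ as
$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$$
and, using the inversion lemmas from Blockwise Matrix Inversion, define its inverse as
$$\Sigma^{-1} = \begin{bmatrix} \Sigma^{11} & \Sigma^{12} \\ \Sigma^{21} & \Sigma^{22} \end{bmatrix}, \qquad \begin{aligned} \Sigma^{22} &= (\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12})^{-1}, \\ \Sigma^{12} &= -\Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22}, \\ \Sigma^{21} &= -\Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1}, \\ \Sigma^{11} &= (\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21})^{-1}. \end{aligned}$$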
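As a quick numerical sanity check of the blockwise inverse, here is a minimal NumPy sketch. The helper name `blockwise_inverse`, the split point `p1`, and the randomly generated positive-definite test matrix are all assumptions made for illustration. The upper-left block is computed in its Sherman-Morrison-Woodbury form, $\Sigma_{11}^{-1} + \Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1}$, which equals $(\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21})^{-1}$.

```python
import numpy as np

def blockwise_inverse(Sigma, p1):
    """Invert a positive-definite matrix from its 2x2 block partition.

    Hypothetical helper: p1 is the size of the upper-left block.
    """
    S11 = Sigma[:p1, :p1]
    S12 = Sigma[:p1, p1:]
    S21 = Sigma[p1:, :p1]
    S22 = Sigma[p1:, p1:]

    S11_inv = np.linalg.inv(S11)
    # Lower-right block of the inverse: inverse of the Schur complement of Sigma_11.
    P22 = np.linalg.inv(S22 - S21 @ S11_inv @ S12)
    # Off-diagonal blocks.
    P12 = -S11_inv @ S12 @ P22
    P21 = -P22 @ S21 @ S11_inv
    # Upper-left block in its Woodbury form.
    P11 = S11_inv + S11_inv @ S12 @ P22 @ S21 @ S11_inv
    return np.block([[P11, P12], [P21, P22]])

# Arbitrary positive-definite covariance matrix for the check (an assumption).
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
Sigma = M @ M.T + 5 * np.eye(5)

Sigma_inv_blocks = blockwise_inverse(Sigma, p1=2)
print(np.allclose(Sigma_inv_blocks, np.linalg.inv(Sigma)))
```

The explicit inverses are used here only to mirror the formulas; in practice a linear solve is usually preferable.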
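From the properties of transformations of Normal random variables, we can define the marginal of $x_1$ as
$$x_1 = Ax$$
where $A = \begin{bmatrix} I_{p_1} & 0 \end{bmatrix}$ is the $p_1 \times p$ matrix that selects the first $p_1$ components of $x$, such that
$$Ax \sim N_{p_1}(A\mu, A\Sigma A^T), \qquad A\mu = \mu_1, \qquad A\Sigma A^T = \Sigma_{11}$$
so that $x_1 \sim N_{p_1}(\mu_1, \Sigma_{11})$.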
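The selection-matrix argument can be checked numerically. The sketch below assumes block sizes `p1 = 2` and `p2 = 3` and a randomly generated mean and covariance, all chosen only for illustration. It confirms that $A\mu$ and $A\Sigma A^T$ pick out the first block of the mean and the upper-left block of the covariance.

```python
import numpy as np

p1, p2 = 2, 3  # assumed block sizes, for illustration only
rng = np.random.default_rng(1)

# Arbitrary mean vector and positive-definite covariance matrix.
mu = rng.standard_normal(p1 + p2)
M = rng.standard_normal((p1 + p2, p1 + p2))
Sigma = M @ M.T + np.eye(p1 + p2)

# A = [I, 0] selects the first p1 components of x.
A = np.hstack([np.eye(p1), np.zeros((p1, p2))])

marginal_mean = A @ mu
marginal_cov = A @ Sigma @ A.T

print(np.allclose(marginal_mean, mu[:p1]))
print(np.allclose(marginal_cov, Sigma[:p1, :p1]))
```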
Conditional Distributions
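Showing the conditional distribution is a bit long-winded, so bear with me. We are interested in finding the distribution of $x_2 \mid x_1$, which we can explicitly represent as
$$f(x_2 \mid x_1) = \frac{f(x_1, x_2)}{f(x_1)}$$
Writing out the joint density for $x = (x_1^T, x_2^T)^T$, we have the following
$$f(x_1, x_2) = (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right\}$$
Partitioning this expression up into the individual terms related to $x_1$ and $x_2$, and writing $\Sigma^{ij}$ for the $(i,j)$ block of $\Sigma^{-1}$, the exponent becomes
$$-\frac{1}{2} \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}^T \begin{bmatrix} \Sigma^{11} & \Sigma^{12} \\ \Sigma^{21} & \Sigma^{22} \end{bmatrix} \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}$$
Expanding this quadratic form out, we see that we end up with
$$-\frac{1}{2} \left[ (x_1 - \mu_1)^T \Sigma^{11} (x_1 - \mu_1) + (x_1 - \mu_1)^T \Sigma^{12} (x_2 - \mu_2) + (x_2 - \mu_2)^T \Sigma^{21} (x_1 - \mu_1) + (x_2 - \mu_2)^T \Sigma^{22} (x_2 - \mu_2) \right]$$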
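Let us, for simplicity, set $z_1 = x_1 - \mu_1$ and $z_2 = x_2 - \mu_2$, and write $Q$ for the quadratic form $(x - \mu)^T \Sigma^{-1} (x - \mu)$, where $\Sigma^{ij}$ denotes the $(i,j)$ block of $\Sigma^{-1}$. Substituting back in our definitions of $\Sigma^{12} = -\Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22}$ and $\Sigma^{21} = -\Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1}$, with $\Sigma^{22} = (\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12})^{-1}$, and using the Sherman-Morrison-Woodbury definition for $\Sigma^{11} = \Sigma_{11}^{-1} + \Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1}$, we have the following
$$Q = z_1^T \left( \Sigma_{11}^{-1} + \Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1} \right) z_1 - z_1^T \Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22} z_2 - z_2^T \Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1} z_1 + z_2^T \Sigma^{22} z_2$$
which, by distributing $z_1^T$ and $z_1$ across the first term and splitting it into its two sums, gives
$$Q = z_1^T \Sigma_{11}^{-1} z_1 + z_1^T \Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1} z_1 - z_1^T \Sigma_{11}^{-1}\Sigma_{12}\Sigma^{22} z_2 - z_2^T \Sigma^{22}\Sigma_{21}\Sigma_{11}^{-1} z_1 + z_2^T \Sigma^{22} z_2$$
In the last four terms we can pull out $\Sigma^{22}(z_2 - \Sigma_{21}\Sigma_{11}^{-1} z_1)$ to the right, leaving $z_2^T - z_1^T\Sigma_{11}^{-1}\Sigma_{12}$ to the left and, after applying a transpose, $(\Sigma_{21}\Sigma_{11}^{-1} z_1)^T = z_1^T \Sigma_{11}^{-1}\Sigma_{12}$, have
$$Q = z_1^T \Sigma_{11}^{-1} z_1 + (z_2 - \Sigma_{21}\Sigma_{11}^{-1} z_1)^T \, \Sigma^{22} \, (z_2 - \Sigma_{21}\Sigma_{11}^{-1} z_1)$$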
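The completed-square identity is easy to get wrong by a sign or a transpose, so a numerical check can be reassuring. The sketch below is a minimal illustration, assuming block sizes `p1 = 2` and `p2 = 3`, a random positive-definite covariance, and a random centered vector `z` standing in for $x - \mu$. It compares the full quadratic form $z^T \Sigma^{-1} z$ with the sum of the marginal piece $z_1^T \Sigma_{11}^{-1} z_1$ and the conditional piece built from the Schur complement $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$.

```python
import numpy as np

p1, p2 = 2, 3  # assumed block sizes, for illustration only
rng = np.random.default_rng(2)

# Arbitrary positive-definite covariance and a centered vector z = x - mu.
M = rng.standard_normal((p1 + p2, p1 + p2))
Sigma = M @ M.T + np.eye(p1 + p2)
z = rng.standard_normal(p1 + p2)
z1, z2 = z[:p1], z[p1:]

S11 = Sigma[:p1, :p1]
S12 = Sigma[:p1, p1:]
S21 = Sigma[p1:, :p1]
S22 = Sigma[p1:, p1:]

# Full quadratic form z^T Sigma^{-1} z, via a linear solve.
full_form = z @ np.linalg.solve(Sigma, z)

# Residual of z2 after removing its linear dependence on z1.
resid = z2 - S21 @ np.linalg.solve(S11, z1)
# Schur complement of Sigma_11; its inverse is the lower-right block of Sigma^{-1}.
schur = S22 - S21 @ np.linalg.solve(S11, S12)

split_form = z1 @ np.linalg.solve(S11, z1) + resid @ np.linalg.solve(schur, resid)

print(np.isclose(full_form, split_form))
```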
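Plugging the above back into our exponential term in our original density function, and using the determinant identity $|\Sigma| = |\Sigma_{11}| \, |\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}|$ to split the normalizing constant, we see that we have a product of two exponential terms
$$(2\pi)^{-p_1/2} |\Sigma_{11}|^{-1/2} \exp\left\{ -\frac{1}{2} (x_1 - \mu_1)^T \Sigma_{11}^{-1} (x_1 - \mu_1) \right\}$$
and
$$(2\pi)^{-p_2/2} |\Sigma_{2 \mid 1}|^{-1/2} \exp\left\{ -\frac{1}{2} (x_2 - \mu_{2 \mid 1})^T \Sigma_{2 \mid 1}^{-1} (x_2 - \mu_{2 \mid 1}) \right\}$$
where the first term is the marginal density of $x_1$ and the second is the conditional density of $x_2 \mid x_1$ with conditional mean $\mu_{2 \mid 1} = \mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)$ and conditional variance $\Sigma_{2 \mid 1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$.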
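The conditional mean and variance translate directly into a few lines of NumPy. The function below is a minimal sketch, not a library routine: the name `condition_on_first_block`, its arguments, and the small two-dimensional example are assumptions made for illustration. It conditions the second block on an observed value of the first block.

```python
import numpy as np

def condition_on_first_block(mu, Sigma, p1, x1):
    """Mean and covariance of x2 | x1 for x = (x1, x2) ~ N(mu, Sigma).

    Hypothetical helper: p1 is the length of the observed block x1.
    """
    mu1, mu2 = mu[:p1], mu[p1:]
    S11 = Sigma[:p1, :p1]
    S12 = Sigma[:p1, p1:]
    S21 = Sigma[p1:, :p1]
    S22 = Sigma[p1:, p1:]

    # Linear solves against Sigma_11 avoid forming its explicit inverse.
    cond_mean = mu2 + S21 @ np.linalg.solve(S11, x1 - mu1)
    cond_cov = S22 - S21 @ np.linalg.solve(S11, S12)
    return cond_mean, cond_cov

# Small worked example: two scalar components with covariance 0.5.
mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

cond_mean, cond_cov = condition_on_first_block(mu, Sigma, p1=1, x1=np.array([1.0]))
print(cond_mean)  # [0.5]
print(cond_cov)   # [[1.75]]
```

In the example, observing $x_1 = 1$ shifts the mean of $x_2$ from 0 to $0.5$ and shrinks its variance from 2 to $2 - 0.5^2 / 1 = 1.75$. The conditional variance does not depend on the observed value.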
While long and drawn out, the formulas show that the conditional distribution of any subset of Normal random variables, given another subset, is also a Normal distribution, with conditional mean and variance defined by functions of the means and covariances of the original random vector.