Inverse-Wishart

Usual parameterisation

Probability density function

The Inverse-Wishart distribution of dimension $K$ is defined over $K \times K$ positive definite matrices. Its parameters are $\boldsymbol{\Psi}$, its scale matrix, and $\nu > K - 1$, its degrees of freedom. Its probability density function is

$$p(\mathbf{X} \mid \boldsymbol{\Psi}, \nu) = \frac{\det(\boldsymbol{\Psi})^{\frac{\nu}{2}}}{2^{\frac{\nu K}{2}}\,\Gamma_K\!\left(\frac{\nu}{2}\right)} \det(\mathbf{X})^{-\frac{\nu + K + 1}{2}} \exp\left(-\frac{1}{2}\mathrm{tr}\left(\boldsymbol{\Psi}\mathbf{X}^{-1}\right)\right),$$

where $\Gamma_K$ is the multivariate gamma function[wiki], $\psi_K$ is the multivariate digamma function[wiki] and $\psi_1$ is the trigamma function[wiki]. The mean exists only for $\nu > K + 1$:

$$\mathrm{E}[\mathbf{X}] = \frac{\boldsymbol{\Psi}}{\nu - K - 1}.$$

The log-determinant has expectation $\mathrm{E}[\ln\det\mathbf{X}] = \ln\det\boldsymbol{\Psi} - K \ln 2 - \psi_K\left(\frac{\nu}{2}\right)$ and variance $\mathrm{V}[\ln\det\mathbf{X}] = \sum_{i=1}^{K} \psi_1\left(\frac{\nu + 1 - i}{2}\right)$.

This distribution always has a mode:

$$\mathrm{Mode}[\mathbf{X}] = \frac{\boldsymbol{\Psi}}{\nu + K + 1}.$$

Maximum likelihood estimators

Let $(\mathbf{S}_n)$ be a set of $N$ observed realisations from an Inverse-Wishart distribution.

$\hat{\boldsymbol{\Psi}} \mid (\mathbf{S}_n), \nu = \nu\left[\overline{\mathbf{S}^{-1}}\right]^{-1}$
$\hat{\nu} \mid (\mathbf{S}_n)$ is the solution of: $K \ln \hat{\nu} - \psi_K\left(\frac{\hat{\nu}}{2}\right) = K \ln 2 + \ln\det\overline{\mathbf{S}^{-1}} + \overline{\ln \det \mathbf{S}}$
$\hat{\boldsymbol{\Psi}} \mid (\mathbf{S}_n) = \hat{\boldsymbol{\Psi}} \mid (\mathbf{S}_n), \hat{\nu}$

where $\overline{\mathbf{S}^{-1}} = \frac{1}{N}\sum_{n=1}^{N} \mathbf{S}_n^{-1}$ and $\overline{\ln\det\mathbf{S}} = \frac{1}{N}\sum_{n=1}^{N} \ln\det\mathbf{S}_n$.

There is no closed-form solution for $\hat{\nu}$, but an approximate solution can be found by numerical optimisation.

I need to check my math for $\nu$

Conjugate prior

We list here the distributions that can be used as conjugate priors for the parameters of the Inverse-Wishart distribution:

$\boldsymbol{\Psi} \mid \nu$: Wishart $\mathcal{W}$

Update equations can be found in the Conjugate prior article.
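For reference, a sketch of the update, assuming the Wishart prior is parameterised as $\mathcal{W}(\mathbf{V}, m)$ with density proportional to $\det(\boldsymbol{\Psi})^{(m - K - 1)/2} \exp\left(-\frac{1}{2}\mathrm{tr}\left(\mathbf{V}^{-1}\boldsymbol{\Psi}\right)\right)$ (my notation, which may differ from that article). Viewed as a function of $\boldsymbol{\Psi}$, the likelihood of $N$ observations $(\mathbf{S}_n)$ is proportional to $\det(\boldsymbol{\Psi})^{N\nu/2} \exp\left(-\frac{1}{2}\mathrm{tr}\left(\boldsymbol{\Psi} \sum_n \mathbf{S}_n^{-1}\right)\right)$, so the posterior is again Wishart with

$$m' = m + N\nu, \qquad \mathbf{V}' = \left(\mathbf{V}^{-1} + \sum_{n=1}^{N} \mathbf{S}_n^{-1}\right)^{-1}.$$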

Kullback-Leibler divergence

The KL-divergence can be written as

$$D_{\mathrm{KL}}(p \,\|\, q) = H(p, q) - H(p)$$

where $H$ is the cross-entropy. We have, for $p = \mathcal{W}^{-1}(\boldsymbol{\Psi}_p, \nu_p)$ and $q = \mathcal{W}^{-1}(\boldsymbol{\Psi}_q, \nu_q)$:

$$D_{\mathrm{KL}}(p \,\|\, q) = \frac{\nu_q}{2}\left(\ln\det\boldsymbol{\Psi}_p - \ln\det\boldsymbol{\Psi}_q\right) + \ln\Gamma_K\!\left(\frac{\nu_q}{2}\right) - \ln\Gamma_K\!\left(\frac{\nu_p}{2}\right) + \frac{\nu_p - \nu_q}{2}\,\psi_K\!\left(\frac{\nu_p}{2}\right) + \frac{\nu_p}{2}\left(\mathrm{tr}\left(\boldsymbol{\Psi}_q\boldsymbol{\Psi}_p^{-1}\right) - K\right),$$

which follows from $\mathrm{E}_p[\ln\det\mathbf{X}] = \ln\det\boldsymbol{\Psi}_p - K\ln 2 - \psi_K\!\left(\frac{\nu_p}{2}\right)$ and $\mathrm{E}_p[\mathbf{X}^{-1}] = \nu_p \boldsymbol{\Psi}_p^{-1}$.

Needs to be checked
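A closed-form KL divergence between two Inverse-Wishart densities can be cross-checked by Monte Carlo. A sketch, where the coded expression is my own derivation (from $\mathrm{E}_p[\ln\det\mathbf{X}]$ and $\mathrm{E}_p[\mathbf{X}^{-1}] = \nu_p \boldsymbol{\Psi}_p^{-1}$) and `kl_invwishart` is a hypothetical helper name:

```python
import numpy as np
from scipy.special import psi, multigammaln
from scipy.stats import invwishart

def multivariate_digamma(a, K):
    """psi_K(a) = sum_{i=1..K} psi(a + (1 - i) / 2)."""
    return sum(psi(a + (1.0 - i) / 2.0) for i in range(1, K + 1))

def kl_invwishart(Psi_p, nu_p, Psi_q, nu_q):
    """KL( IW(Psi_p, nu_p) || IW(Psi_q, nu_q) ), assumed closed form."""
    K = Psi_p.shape[0]
    ld_p = np.linalg.slogdet(Psi_p)[1]
    ld_q = np.linalg.slogdet(Psi_q)[1]
    return (0.5 * nu_q * (ld_p - ld_q)
            + multigammaln(nu_q / 2.0, K) - multigammaln(nu_p / 2.0, K)
            + 0.5 * (nu_p - nu_q) * multivariate_digamma(nu_p / 2.0, K)
            + 0.5 * nu_p * (np.trace(Psi_q @ np.linalg.inv(Psi_p)) - K))

# Monte Carlo cross-check: average log-density ratio under p
K = 3
Psi_p, nu_p = 2.0 * np.eye(K), 8.0
Psi_q, nu_q = np.eye(K), 10.0
p = invwishart(df=nu_p, scale=Psi_p)
q = invwishart(df=nu_q, scale=Psi_q)
X = p.rvs(size=5000, random_state=np.random.default_rng(0))
kl_mc = np.mean([p.logpdf(Xn) - q.logpdf(Xn) for Xn in X])
kl_cf = kl_invwishart(Psi_p, nu_p, Psi_q, nu_q)
print(kl_cf, kl_mc)  # the two values should agree up to Monte Carlo noise
```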

“Normal covariance matrix conjugate” parameterisation

I am not sure that this is the best parameterisation yet.

Another parameterisation, which may feel more natural when using the Inverse-Wishart distribution as a prior for the covariance matrix of a multivariate Gaussian distribution, uses the expected matrix $\boldsymbol{\Sigma} = \mathrm{E}[\mathbf{X}]$ instead of the scale matrix. This parameterisation only makes sense if $\nu > K + 1$:

$$\boldsymbol{\Psi} = (\nu - K - 1)\,\boldsymbol{\Sigma}.$$
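Under this parameterisation, $\boldsymbol{\Sigma}$ is the mean of the distribution, which is easy to check empirically. A sketch, assuming SciPy's `invwishart`, which uses the usual scale-matrix parameterisation, so its scale is set to $(\nu - K - 1)\boldsymbol{\Sigma}$:

```python
import numpy as np
from scipy.stats import invwishart

K, nu = 3, 12.0
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.5, 0.3],
                  [0.0, 0.3, 1.0]])
Psi = (nu - K - 1) * Sigma  # scale matrix of the usual parameterisation

X = invwishart(df=nu, scale=Psi).rvs(size=20000,
                                     random_state=np.random.default_rng(1))
empirical_mean = X.mean(axis=0)  # should approach Sigma as N grows
print(np.abs(empirical_mean - Sigma).max())
```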

This distribution always has a mode:

$$\mathrm{Mode}[\mathbf{X}] = \frac{\nu - K - 1}{\nu + K + 1}\,\boldsymbol{\Sigma}.$$

Maximum likelihood estimators

Let $(\mathbf{S}_n)$ be a set of $N$ observed realisations from an Inverse-Wishart distribution.

$\hat{\boldsymbol\Sigma} \mid (\mathbf{S}_n), \nu = \frac{\nu}{\nu-K-1}\left[\overline{\mathbf{S}^{-1}}\right]^{-1}$
$\hat{\nu} \mid (\mathbf{S}_n)$ is the solution of: $K \ln \hat{\nu} - \psi_K\left(\frac{\hat{\nu}}{2}\right) = K \ln 2 + \ln\det\overline{\mathbf{S}^{-1}} + \overline{\ln \det \mathbf{S}}$ (the same equation as in the usual parameterisation, since maximum likelihood estimators are invariant to reparameterisation)
$\hat{\boldsymbol{\Sigma}} \mid (\mathbf{S}_n) = \hat{\boldsymbol{\Sigma}} \mid (\mathbf{S}_n), \hat{\nu}$

where $\overline{\mathbf{S}^{-1}} = \frac{1}{N}\sum_{n=1}^{N} \mathbf{S}_n^{-1}$ and $\overline{\ln\det\mathbf{S}} = \frac{1}{N}\sum_{n=1}^{N} \ln\det\mathbf{S}_n$.

There is no closed-form solution for $\hat{\nu}$, but an approximate solution can be found by numerical optimisation.

I need to check my math for $\nu$

Kullback-Leibler divergence

The KL divergence becomes the expression obtained for the usual parameterisation, with $\boldsymbol{\Psi}_\bullet = (\nu_\bullet - K - 1)\,\boldsymbol{\Sigma}_\bullet$ substituted for each scale matrix.


Created by Yaël Balbastre on 10 April 2018. Last edited on 10 April 2018.