
Wishart

Usual parameterisation

Probability density function

The Wishart distribution of dimension K is defined over K×K positive definite matrices. Its parameters are V, its scale matrix, and ν > K - 1, its degrees of freedom.

\mathcal{W}_K(\mathbf{A} \mid \mathbf{V}, \nu) = \frac{|\mathbf{A}|^{(\nu-K-1)/2}\exp\left(-\frac{1}{2}\mathrm{Tr}\left(\mathbf{V}^{-1}\mathbf{A}\right)\right)}{2^{\nu K/2}\,|\mathbf{V}|^{\nu/2}\,\Gamma_K\left(\frac{\nu}{2}\right)}

\ln\mathcal{W}_K(\mathbf{A} \mid \mathbf{V}, \nu) = \frac{\nu-K-1}{2}\ln\det\mathbf{A} - \frac{1}{2}\mathrm{Tr}\left(\mathbf{V}^{-1}\mathbf{A}\right) - \frac{\nu K}{2}\ln 2 - \frac{\nu}{2}\ln\det\mathbf{V} - \ln\Gamma_K\left(\frac{\nu}{2}\right)

where \Gamma_K is the multivariate gamma function[wiki], \psi_K is the multivariate digamma function[wiki] and \psi_1 is the trigamma function[wiki].
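As a sketch, the log-density above can be evaluated term by term and cross-checked against `scipy.stats.wishart`; the function and variable names below are illustrative, not a standard API.

```python
import numpy as np
from scipy.special import multigammaln  # log of the multivariate gamma function
from scipy.stats import wishart

def wishart_logpdf(A, V, nu):
    """ln W_K(A | V, nu), written term by term from the log-density above."""
    K = A.shape[0]
    _, logdet_A = np.linalg.slogdet(A)
    _, logdet_V = np.linalg.slogdet(V)
    return ((nu - K - 1) / 2 * logdet_A
            - np.trace(np.linalg.solve(V, A)) / 2   # -1/2 Tr(V^{-1} A)
            - nu * K / 2 * np.log(2)
            - nu / 2 * logdet_V
            - multigammaln(nu / 2, K))

A = np.array([[2.0, 0.3], [0.3, 1.0]])
V = np.array([[1.0, 0.2], [0.2, 0.5]])
print(np.isclose(wishart_logpdf(A, V, nu=4.0),
                 wishart.logpdf(A, df=4.0, scale=V)))
```

Using `slogdet` and `solve` rather than explicit determinants and inverses keeps the evaluation numerically stable for poorly conditioned matrices.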

This distribution has a mode only if \nu \geqslant K + 1:

\mathrm{Mode}\left[\mathbf{A}\right] = (\nu - K - 1)\mathbf{V}

Maximum likelihood estimators

Let (\mathbf{A}_n) be a set of observed realisations from a Wishart distribution.

\hat{\mathbf{V}} \mid (\mathbf{A}_n), \nu = \frac{1}{\nu}\overline{\mathbf{A}}
\hat{\nu} \mid (\mathbf{A}_n) solution of: K \ln \hat{\nu} - \psi_K\left(\frac{\hat{\nu}}{2}\right) = K \ln 2 + \ln\det\overline{\mathbf{A}} - \overline{\ln \det \mathbf{A}}
\hat{\mathbf{V}} \mid (\mathbf{A}_n) = \hat{\mathbf{V}} \mid (\mathbf{A}_n), \hat{\nu}

where \overline{\mathbf{A}} = \frac{1}{N}\sum_{n=1}^N \mathbf{A}_n and \overline{\ln\det\mathbf{A}} = \frac{1}{N}\sum_{n=1}^N \ln\det\mathbf{A}_n.

There is no closed form solution for \hat{\nu}, but an approximate solution can be found by numerical optimisation.
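One way to carry out this numerical optimisation, sketched here with `scipy.optimize.brentq` (the helper names are mine, not a standard API): the left-hand side K \ln\nu - \psi_K(\nu/2) blows up as \nu \to K - 1 and tends to K \ln 2 from above as \nu grows, while Jensen's inequality makes the right-hand side at least K \ln 2, so a sign change can be bracketed.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import psi  # digamma

def multidigamma(a, K):
    # psi_K(a) = sum_{i=1..K} psi(a + (1 - i)/2)
    return sum(psi(a + (1 - i) / 2) for i in range(1, K + 1))

def fit_wishart_dof(As):
    """Solve K ln(nu) - psi_K(nu/2) = K ln 2 + ln det Abar - mean_n ln det A_n."""
    As = np.stack(As)
    K = As.shape[-1]
    Abar = As.mean(axis=0)
    rhs = (K * np.log(2) + np.linalg.slogdet(Abar)[1]
           - np.mean([np.linalg.slogdet(A)[1] for A in As]))
    f = lambda nu: K * np.log(nu) - multidigamma(nu / 2, K) - rhs
    lo, hi = K - 1 + 1e-8, K + 1.0
    while f(hi) > 0:   # expand the bracket; fails only if all A_n are identical
        hi *= 2
    return brentq(f, lo, hi)

As = [np.array([[2.0, 0.3], [0.3, 1.0]]),
      np.array([[1.0, 0.0], [0.0, 1.5]]),
      np.array([[3.0, -0.2], [-0.2, 0.8]])]
print(fit_wishart_dof(As))  # some value > K - 1
```

If all observations coincide, the right-hand side equals K \ln 2 exactly and no finite solution exists; a production implementation should guard against that degenerate case.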

I need to check my math for \nu

Conjugate prior

We list here the distributions that can be used as conjugate priors for the parameters of a Wishart distribution:

\mathbf{V} \mid \nu Inverse-Wishart \mathcal{W}^{-1}

Update equations can be found in the Conjugate prior article.

Kullback-Leibler divergence

The KL-divergence can be written as

\mathrm{KL}\left(p \,\middle\|\, q\right) = H(p, q) - H(p, p)

where H is the cross-entropy. We have

H\left(\mathcal{W}(\mathbf{V}_1, \nu_1), \mathcal{W}(\mathbf{V}_0, \nu_0)\right) = -\frac{\nu_0 - K - 1}{2}\left(\psi_K\left(\frac{\nu_1}{2}\right) + K\ln 2 + \ln\det\mathbf{V}_1\right) + \frac{\nu_1}{2}\mathrm{Tr}\left(\mathbf{V}_0^{-1}\mathbf{V}_1\right) + \frac{\nu_0 K}{2}\ln 2 + \frac{\nu_0}{2}\ln\det\mathbf{V}_0 + \ln\Gamma_K\left(\frac{\nu_0}{2}\right)

so that

\mathrm{KL}\left(\mathcal{W}(\mathbf{V}_1, \nu_1) \,\middle\|\, \mathcal{W}(\mathbf{V}_0, \nu_0)\right) = \frac{\nu_1 - \nu_0}{2}\psi_K\left(\frac{\nu_1}{2}\right) - \frac{\nu_0}{2}\ln\det\left(\mathbf{V}_0^{-1}\mathbf{V}_1\right) + \frac{\nu_1}{2}\left(\mathrm{Tr}\left(\mathbf{V}_0^{-1}\mathbf{V}_1\right) - K\right) + \ln\frac{\Gamma_K(\nu_0/2)}{\Gamma_K(\nu_1/2)}
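The standard closed-form KL divergence between two Wisharts in this parameterisation can be sketched as follows (function and variable names are illustrative):

```python
import numpy as np
from scipy.special import multigammaln, psi

def multidigamma(a, K):
    # psi_K(a) = sum_{i=1..K} psi(a + (1 - i)/2)
    return sum(psi(a + (1 - i) / 2) for i in range(1, K + 1))

def kl_wishart(V1, nu1, V0, nu0):
    """KL( W(V1, nu1) || W(V0, nu0) ) between two K x K Wishart distributions."""
    K = V1.shape[0]
    V0inv_V1 = np.linalg.solve(V0, V1)
    _, logdet = np.linalg.slogdet(V0inv_V1)
    return ((nu1 - nu0) / 2 * multidigamma(nu1 / 2, K)
            - nu0 / 2 * logdet
            + nu1 / 2 * (np.trace(V0inv_V1) - K)
            + multigammaln(nu0 / 2, K) - multigammaln(nu1 / 2, K))

V = np.array([[1.0, 0.2], [0.2, 0.5]])
print(kl_wishart(V, 5.0, V, 5.0))        # identical distributions
print(kl_wishart(2 * V, 6.0, V, 5.0) > 0)
```

The divergence vanishes when both distributions coincide and is strictly positive otherwise, which makes for a cheap sanity check of the implementation.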

Specialisation

Exponential: \mathrm{Exp}(\lambda) = \mathcal{W}\left(\frac{1}{2\lambda}, 2\right)
Chi-squared: \chi^2(\nu) = \mathcal{W}\left(1, \nu\right)
Gamma: \mathcal{G}(\alpha, \beta) = \mathcal{W}\left(\frac{1}{2\beta}, 2\alpha\right)
Power:
Inverse-Wishart: \mathbf{X} \sim \mathcal{W}(\mathbf{V}, \nu) \Rightarrow \mathbf{X}^{-1} \sim \mathcal{W}^{-1}\left(\mathbf{V}^{-1}, \nu\right)
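The one-dimensional identities can be checked numerically against scipy.stats; a minimal sketch, using a hand-written K = 1 log-density (in one dimension the Wishart reduces to a Gamma, so \beta below is a rate parameter):

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import chi2, expon, gamma

def wishart1_logpdf(a, v, nu):
    # ln W_1(a | v, nu): the K = 1 case of the Wishart log-density
    return ((nu - 2) / 2 * np.log(a) - a / (2 * v)
            - nu / 2 * np.log(2 * v) - gammaln(nu / 2))

x = 0.7
lam, alpha, beta, nu = 1.3, 2.5, 0.8, 5.0
# Exp(lam) = W(1/(2 lam), 2)
print(np.isclose(wishart1_logpdf(x, 1 / (2 * lam), 2), expon.logpdf(x, scale=1 / lam)))
# chi^2(nu) = W(1, nu)
print(np.isclose(wishart1_logpdf(x, 1.0, nu), chi2.logpdf(x, df=nu)))
# G(alpha, beta) = W(1/(2 beta), 2 alpha), beta a rate
print(np.isclose(wishart1_logpdf(x, 1 / (2 * beta), 2 * alpha),
                 gamma.logpdf(x, a=alpha, scale=1 / beta)))
```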

“Normal precision matrix conjugate” parameterisation

Another parameterisation, which may feel more natural when using the Wishart distribution as a prior for the precision matrix of a multivariate Gaussian distribution, uses the expected matrix \boldsymbol\Lambda = \mathbb{E}\left[\mathbf{A}\right] = \nu\mathbf{V} instead of the scale matrix:

\mathcal{W}_K(\mathbf{A} \mid \boldsymbol\Lambda, \nu) = \frac{|\mathbf{A}|^{(\nu-K-1)/2}\exp\left(-\frac{\nu}{2}\mathrm{Tr}\left(\boldsymbol\Lambda^{-1}\mathbf{A}\right)\right)}{\left(\frac{2}{\nu}\right)^{\nu K/2}\,|\boldsymbol\Lambda|^{\nu/2}\,\Gamma_K\left(\frac{\nu}{2}\right)}
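Assuming this convention (\boldsymbol\Lambda = \nu\mathbf{V}), the relation \mathbb{E}[\mathbf{A}] = \boldsymbol\Lambda can be checked by Monte-Carlo with `scipy.stats.wishart`, which uses the scale parameterisation; the matrix values below are illustrative.

```python
import numpy as np
from scipy.stats import wishart

nu = 7.0
Lam = np.array([[2.0, 0.5], [0.5, 1.0]])   # expected matrix (illustrative values)
V = Lam / nu                               # scipy expects the scale matrix V
samples = wishart.rvs(df=nu, scale=V, size=100000,
                      random_state=np.random.default_rng(0))
print(samples.mean(axis=0))   # should be close to Lam
```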

This distribution has a mode only if \nu \geqslant K + 1:

\mathrm{Mode}\left[\mathbf{A}\right] = \frac{\nu - K - 1}{\nu}\boldsymbol\Lambda

Maximum likelihood estimators

Let (\mathbf{A}_n) be a set of observed realisations from a Wishart distribution.

\hat{\boldsymbol\Lambda} \mid (\mathbf{A}_n) = \overline{\mathbf{A}}
\hat{\nu} \mid (\mathbf{A}_n) solution of: K \ln \hat{\nu} - \psi_K\left(\frac{\hat{\nu}}{2}\right) = K \ln 2 + \ln\det\overline{\mathbf{A}} - \overline{\ln \det \mathbf{A}}

where \overline{\mathbf{A}} = \frac{1}{N}\sum_{n=1}^N \mathbf{A}_n and \overline{\ln\det\mathbf{A}} = \frac{1}{N}\sum_{n=1}^N \ln\det\mathbf{A}_n.

There is no closed form solution for \hat{\nu}, but an approximate solution can be found by numerical optimisation.

I need to check my math for \nu

Kullback-Leibler divergence

The KL divergence becomes

\mathrm{KL}\left(\mathcal{W}(\boldsymbol\Lambda_1, \nu_1) \,\middle\|\, \mathcal{W}(\boldsymbol\Lambda_0, \nu_0)\right) = \frac{\nu_1 - \nu_0}{2}\psi_K\left(\frac{\nu_1}{2}\right) - \frac{\nu_0 K}{2}\ln\frac{\nu_0}{\nu_1} - \frac{\nu_0}{2}\ln\det\left(\boldsymbol\Lambda_0^{-1}\boldsymbol\Lambda_1\right) + \frac{\nu_0}{2}\mathrm{Tr}\left(\boldsymbol\Lambda_0^{-1}\boldsymbol\Lambda_1\right) - \frac{\nu_1 K}{2} + \ln\frac{\Gamma_K(\nu_0/2)}{\Gamma_K(\nu_1/2)}


Created by Yaël Balbastre on 10 April 2018. Last edited on 10 April 2018.