

\subsection{KL-optimal parameter reduction}

We want to find the constrained GP with parameters $(\hat{\boldsymbol{\alpha}}_{t+1},\hat{\boldsymbol{C}}_{t+1})$, with the last elements all having zero values (see Section~3.3), that minimises the KL-divergence

\begin{equation*}
\begin{split}
2\,\mathrm{KL}\bigl(\hat{\mathcal{GP}}_{t+1}\,\big\Vert\,\mathcal{GP}_{t+1}\bigr)
&= \left(\hat{\boldsymbol{\alpha}}_{t+1}-\boldsymbol{\alpha}_{t+1}\right)^{T}
\left(\boldsymbol{C}_{t+1}+\boldsymbol{K}_{t+1}^{-1}\right)^{-1}
\left(\hat{\boldsymbol{\alpha}}_{t+1}-\boldsymbol{\alpha}_{t+1}\right) \\
&\quad + \mathrm{tr}\!\left[\left(\hat{\boldsymbol{C}}_{t+1}-\boldsymbol{C}_{t+1}\right)
\left(\boldsymbol{C}_{t+1}+\boldsymbol{K}_{t+1}^{-1}\right)^{-1}\right] \\
&\quad - \ln\left\vert\left(\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{K}_{t+1}^{-1}\right)
\left(\boldsymbol{C}_{t+1}+\boldsymbol{K}_{t+1}^{-1}\right)^{-1}\right\vert
\end{split}
\tag{196}
\end{equation*}

and we suppose that the GP parameters $(\boldsymbol{\alpha}_{t+1},\boldsymbol{C}_{t+1})$ and the $\mathcal{BV}$ set are given (we know $\boldsymbol{K}_{t+1}$ and $\boldsymbol{K}_t$). In the following we use $\boldsymbol{K}_{t+1}^{-1}=\boldsymbol{Q}_{t+1}$ and the decomposition of the GP parameters as presented in Chapter~3, with Fig.~3.3 repeated in Fig.~D.1.

\begin{figure}
\centering
\includegraphics{decomp.eps}
\caption{Grouping of the GP parameters (Fig.~3.3 repeated).}
\end{figure}

The differentiation with respect to the parameters $\hat{\alpha}_1,\ldots,\hat{\alpha}_t$ leads to a system of equations that is easily written in matrix form as

\begin{equation*}
\begin{split}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}
\left(\boldsymbol{Q}_{t+1}+\boldsymbol{C}_{t+1}\right)^{-1}
\left(\hat{\boldsymbol{\alpha}}_{t+1}-\boldsymbol{\alpha}_{t+1}\right) &= \boldsymbol{0}_t \\
\begin{bmatrix}\boldsymbol{B} & \boldsymbol{a}\end{bmatrix}
\left(\hat{\boldsymbol{\alpha}}_{t+1}-\boldsymbol{\alpha}_{t+1}\right) &= \boldsymbol{0}_t
\end{split}
\tag{197}
\end{equation*}

where $\boldsymbol{I}_t$ is the identity matrix and $\boldsymbol{0}_t$ is the column vector of length $t$ with zero elements. In the second line the matrix multiplication has been performed and we used the decomposition

\begin{equation*}
\left(\boldsymbol{Q}_{t+1}+\boldsymbol{C}_{t+1}\right)^{-1}
= \begin{bmatrix}\boldsymbol{B} & \boldsymbol{a}\\ \boldsymbol{a}^T & b\end{bmatrix}
\tag{198}
\end{equation*}

Finally, using the decomposition of the vector $\boldsymbol{\alpha}_{t+1}$ from Fig.~3.3, we have

\begin{displaymath}
\hat{\boldsymbol{\alpha}}_{t+1} = \boldsymbol{\alpha}^{(r)} + \alpha^{*}\,\tilde{\boldsymbol{e}}_{t+1}
\qquad\text{with}\qquad
\tilde{\boldsymbol{e}}_{t+1} = \boldsymbol{B}^{-1}\boldsymbol{a}
\end{displaymath}

and $\tilde{\boldsymbol{e}}_{t+1}$ is obtained from the matrix inversion lemma for block matrices, eq.~(182). Applying the same lemma to $\left(\boldsymbol{Q}_{t+1}+\boldsymbol{C}_{t+1}\right)^{-1}$ from eq.~(198) we have:

\begin{equation*}
\boldsymbol{Q}_{t+1}+\boldsymbol{C}_{t+1} =
\begin{bmatrix}
\left(\boldsymbol{B}-\dfrac{\boldsymbol{a}\boldsymbol{a}^T}{b}\right)^{-1} & -\boldsymbol{B}^{-1}\boldsymbol{a}\,\delta \\
-\boldsymbol{a}^T\boldsymbol{B}^{-1}\,\delta & \delta
\end{bmatrix}
\qquad\text{with}\qquad
\delta = \left(b-\boldsymbol{a}^T\boldsymbol{B}^{-1}\boldsymbol{a}\right)^{-1}
\tag{199}
\end{equation*}

and using the correspondence $\delta = q^* + c^*$ and $\boldsymbol{Q}^* + \boldsymbol{C}^* = -\boldsymbol{B}^{-1}\boldsymbol{a}\,\delta$, read from eq.~(199), we have

\begin{equation*}
\tilde{\boldsymbol{e}}_{t+1} = -\frac{1}{q^*+c^*}\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)
\tag{200}
\end{equation*}

and substituting it into the expression for the reduced mean parameters, we have

\begin{equation*}
\hat{\boldsymbol{\alpha}}_{t+1} = \boldsymbol{\alpha}^{(r)} - \frac{\alpha^*}{q^*+c^*}\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)
\tag{201}
\end{equation*}

$\qedsymbol$
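The closed form above can be verified numerically. The following sketch (not part of the derivation; all matrices are random stand-ins for the GP quantities) compares the solution of the system (197)--(198) with the expression in eq.~(201):

```python
import numpy as np

rng = np.random.default_rng(0)
t = 5

# random SPD kernel matrix K_{t+1} and its inverse Q_{t+1}
A = rng.standard_normal((t + 1, t + 1))
K = A @ A.T + (t + 1) * np.eye(t + 1)
Q = np.linalg.inv(K)

# random SPD stand-in for C_{t+1} so that Q + C is invertible
B0 = rng.standard_normal((t + 1, t + 1))
C = B0 @ B0.T
alpha = rng.standard_normal(t + 1)

# decomposition (198): (Q + C)^{-1} = [[B, a], [a^T, b]]
Minv = np.linalg.inv(Q + C)
B, a = Minv[:t, :t], Minv[:t, t]

# solution of the system (197): alpha_hat = alpha^(r) + alpha^* B^{-1} a
ahat_sys = alpha[:t] + alpha[t] * np.linalg.solve(B, a)

# eq. (201), expressed through the last column of Q + C
M = Q + C
ahat_201 = alpha[:t] - alpha[t] / M[t, t] * M[:t, t]

assert np.allclose(ahat_sys, ahat_201)
```

The agreement rests on the block-inverse identity $\tilde{\boldsymbol{e}}_{t+1}=\boldsymbol{B}^{-1}\boldsymbol{a}=-(\boldsymbol{Q}^*+\boldsymbol{C}^*)/(q^*+c^*)$ used in eq.~(200).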

Before differentiating the KL-divergence with respect to $\hat{\boldsymbol{C}}_{t+1}$, we simplify the terms of eq.~(196) that contain $\hat{\boldsymbol{C}}_{t+1}$. Firstly we write the constraint on the last row and column of $\hat{\boldsymbol{C}}_{t+1}$ using the extension matrix $\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}$ as

\begin{equation*}
\hat{\boldsymbol{C}}_{t+1} =
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^{T}
\hat{\boldsymbol{C}}_{t+1}^{(r)}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}
\tag{202}
\end{equation*}

where $\hat{\boldsymbol{C}}_{t+1}^{(r)}$ is a matrix with $t$ rows and columns; in the following we will use $\hat{\boldsymbol{C}}_{t+1}$ instead of $\hat{\boldsymbol{C}}_{t+1}^{(r)}$. Permuting the elements in the trace term of eq.~(196) leads to


\begin{equation*}
\begin{split}
\mathrm{tr}&\left[
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^{T}
\hat{\boldsymbol{C}}_{t+1}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}
\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}\right] \\
&= \mathrm{tr}\left[\hat{\boldsymbol{C}}_{t+1}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}
\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^{T}\right]
\end{split}
\tag{203}
\end{equation*}

where the additive term $-\boldsymbol{C}_{t+1}\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}$ is ignored since it will not contribute to the result of the differentiation. Ignoring also the term in the determinant not depending on $\hat{\boldsymbol{C}}_{t+1}$, and using the replacement of $\hat{\boldsymbol{C}}_{t+1}$ from eq.~(202), we simplify the log-determinant

\begin{equation*}
\begin{split}
\ln&\left\vert
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^{T}
\hat{\boldsymbol{C}}_{t+1}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}
+\boldsymbol{Q}_{t+1}\right\vert
= \ln\left\vert
\begin{bmatrix}
\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t+\dfrac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*} & \boldsymbol{Q}^* \\
\boldsymbol{Q}^{*T} & q^*
\end{bmatrix}\right\vert \\
&= \ln\left\vert \hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t
+\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}
-\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}\right\vert + \ln q^* \\
&= \ln\left\vert \hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t \right\vert + \ln q^*
\end{split}
\tag{204}
\end{equation*}

where we used the block decomposition of $\boldsymbol{Q}_{t+1}$, with $\boldsymbol{Q}^*$ and $q^*$ its last column and last diagonal element (first line), and the expression for the determinant of partitioned matrices from eq.~(184).
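Both identities used in eq.~(204), the block decomposition of $\boldsymbol{Q}_{t+1}$ and the determinant cancellation, can be checked numerically; the sketch below (not part of the derivation) uses random stand-in matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
t = 4

A = rng.standard_normal((t + 1, 2 * (t + 1)))
K = A @ A.T                      # SPD stand-in for K_{t+1}
Q = np.linalg.inv(K)             # Q_{t+1}
Qt = np.linalg.inv(K[:t, :t])    # Q_t
Qs, qs = Q[:t, t], Q[t, t]       # Q^*, q^*

# the upper-left block of Q_{t+1} equals Q_t + Q^* Q^{*T} / q^*
assert np.allclose(Q[:t, :t], Qt + np.outer(Qs, Qs) / qs)

# determinant identity of eq. (204) for an arbitrary symmetric Chat
G = rng.standard_normal((t, t))
Chat = G @ G.T
Ext = np.zeros((t + 1, t + 1))
Ext[:t, :t] = Chat               # [I 0]^T Chat [I 0]
lhs = np.linalg.det(Ext + Q)
rhs = np.linalg.det(Chat + Qt) * qs
assert np.allclose(lhs, rhs)
```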

The derivative of the KL-distance with respect to $\hat{\boldsymbol{C}}_{t+1}$ is the sum of the derivatives of eqs.~(203) and~(204); setting it to zero gives

\begin{equation*}
\left(\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t\right)^{-1}
= \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}
\left(\boldsymbol{Q}_{t+1}+\boldsymbol{C}_{t+1}\right)^{-1}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^{T}
\tag{205}
\end{equation*}

We apply the matrix inversion lemma to the RHS, similarly to the case of eq.~(204); retaining only the upper-left block leads to

\begin{equation*}
\left(\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t\right)^{-1}
= \left(\boldsymbol{C}^{(r)}+\boldsymbol{Q}_t
+\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}
-\frac{\left(\boldsymbol{C}^*+\boldsymbol{Q}^*\right)\left(\boldsymbol{C}^*+\boldsymbol{Q}^*\right)^{T}}{q^*+c^*}\right)^{-1}
\tag{206}
\end{equation*}

and the reduced covariance parameter is

\begin{equation*}
\hat{\boldsymbol{C}}_{t+1} = \boldsymbol{C}^{(r)}
+\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}
-\frac{\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)^T}{q^*+c^*}
\tag{207}
\end{equation*}

$\qedsymbol$
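A quick numerical sketch (random stand-ins, not part of the derivation) confirming that trimming the inverse as in eqs.~(205)--(206) reproduces the closed form of eq.~(207):

```python
import numpy as np

rng = np.random.default_rng(2)
t = 5

A = rng.standard_normal((t + 1, t + 1))
K = A @ A.T + (t + 1) * np.eye(t + 1)   # SPD stand-in for K_{t+1}
Q = np.linalg.inv(K)
B0 = rng.standard_normal((t + 1, t + 1))
C = B0 @ B0.T                           # stand-in for C_{t+1}

Qs, qs = Q[:t, t], Q[t, t]              # Q^*, q^*
Cs, cs = C[:t, t], C[t, t]              # C^*, c^*

# eq. (205): (Chat + Q_t)^{-1} is the upper-left block of (Q + C)^{-1}
Chat_205 = np.linalg.inv(np.linalg.inv(Q + C)[:t, :t]) - np.linalg.inv(K[:t, :t])

# eq. (207): closed form for the reduced covariance parameter
Chat_207 = (C[:t, :t] + np.outer(Qs, Qs) / qs
            - np.outer(Qs + Cs, Qs + Cs) / (qs + cs))

assert np.allclose(Chat_205, Chat_207)
```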



\subsection{Computing the KL-distance}

We assess the error made when pruning the GP by evaluating the KL-divergence of eq.~(196) between the process with parameters $(\boldsymbol{\alpha}_{t+1},\boldsymbol{C}_{t+1})$ and the pruned one with $(\hat{\boldsymbol{\alpha}}_{t+1},\hat{\boldsymbol{C}}_{t+1})$ from the previous section. We start by writing the pruning equations in terms of $(t+1)$-dimensional vectors: in the following we use $\boldsymbol{Q}^* \doteq [\boldsymbol{Q}^{*T}\; q^*]^T$ and $\boldsymbol{C}^* \doteq [\boldsymbol{C}^{*T}\; c^*]^T$, and the pruning equations are

\begin{equation*}
\begin{split}
\hat{\boldsymbol{\alpha}}_{t+1} &= \boldsymbol{\alpha}_{t+1}
- \frac{\alpha^*}{q^*+c^*}\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right) \\
\hat{\boldsymbol{C}}_{t+1} &= \boldsymbol{C}_{t+1}
+ \frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}
- \frac{1}{q^*+c^*}\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)^T
\end{split}
\tag{208}
\end{equation*}

and it is easy to check that these updates result in the last row and column being all zeros. In computing the KL-divergence we will use the following matrix-algebra identities:

\begin{displaymath}
\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}\left(\boldsymbol{C}^*+\boldsymbol{Q}^*\right) = \boldsymbol{e}_{t+1}
\qquad\text{and}\qquad
\boldsymbol{K}_{t+1}\boldsymbol{Q}^* = \boldsymbol{e}_{t+1}
\end{displaymath}

Based on the first identity, the term containing the mean is

\begin{equation*}
\left(\boldsymbol{\alpha}_{t+1}-\hat{\boldsymbol{\alpha}}_{t+1}\right)^{T}
\left(\boldsymbol{C}_{t+1}+\boldsymbol{K}_{t+1}^{-1}\right)^{-1}
\left(\boldsymbol{\alpha}_{t+1}-\hat{\boldsymbol{\alpha}}_{t+1}\right)
= \frac{\alpha^{*2}}{q^*+c^*}
\tag{209}
\end{equation*}

The logarithm of the determinants is transformed using the determinant formula for partitioned matrices, eq.~(184):

\begin{equation*}
\left\vert \hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_{t+1} \right\vert
= \left\vert \boldsymbol{C}^{(r)}+\boldsymbol{Q}_t
+\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}
-\frac{1}{q^*+c^*}\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)^T
\right\vert\, q^*
\tag{210}
\end{equation*}

and using a similar decomposition for the denominator we have

\begin{equation*}
\left\vert \boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1} \right\vert
= \left\vert \boldsymbol{C}^{(r)}+\boldsymbol{Q}_t
+\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}
-\frac{1}{q^*+c^*}\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)^T
\right\vert\, \left(q^*+c^*\right)
\tag{211}
\end{equation*}

and the logarithm of their ratio has the simple expression

\begin{equation*}
\ln\left\vert
\left(\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_{t+1}\right)
\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}
\right\vert
= \ln\frac{q^*}{q^*+c^*}
\tag{212}
\end{equation*}

Finally, using the invariance of the trace of a product under cyclic permutation of its factors, the trace term is:


\begin{equation*}
\begin{split}
\mathrm{tr}&\left[\left(
\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}
-\frac{\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right)^T}{q^*+c^*}
\right)\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}\right] \\
&= \frac{1}{q^*}\boldsymbol{Q}^{*T}\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}\boldsymbol{Q}^*
- \frac{1}{q^*+c^*}\boldsymbol{e}_{t+1}^T\left(\boldsymbol{Q}^*+\boldsymbol{C}^*\right) \\
&= \frac{1}{q^*}\boldsymbol{Q}^{*T}\left[\boldsymbol{K}_{t+1}
- \boldsymbol{K}_{t+1}\left(\boldsymbol{C}_{t+1}^{-1}+\boldsymbol{K}_{t+1}\right)^{-1}\boldsymbol{K}_{t+1}\right]\boldsymbol{Q}^* - 1 \\
&= 1 - \frac{1}{q^*}\boldsymbol{e}_{t+1}^T\left(\boldsymbol{C}_{t+1}^{-1}+\boldsymbol{K}_{t+1}\right)^{-1}\boldsymbol{e}_{t+1} - 1
= -\frac{s^*}{q^*}
\end{split}
\tag{213}
\end{equation*}

where $s^*$ is the last diagonal element of the matrix $\left(\boldsymbol{C}_{t+1}^{-1}+\boldsymbol{K}_{t+1}\right)^{-1}$. Summing up eqs.~(209), (212), and (213), we have the minimum KL-distance

\begin{equation*}
2\,\mathrm{KL}\bigl(\hat{\mathcal{GP}}_{t+1}\,\big\Vert\,\mathcal{GP}_{t+1}\bigr)
= \frac{\alpha^{*2}}{q^*+c^*} - \frac{s^*}{q^*} + \ln\left(1+\frac{c^*}{q^*}\right)
\tag{214}
\end{equation*}
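As a consistency check of the score, the sketch below (random stand-ins, not part of the derivation) computes the KL-divergence directly from the Gaussian moments, assuming the parametrisation of Chapter~3 (posterior mean $\boldsymbol{K}\boldsymbol{\alpha}$ and covariance $\boldsymbol{K}+\boldsymbol{K}\boldsymbol{C}\boldsymbol{K}$ at the $\mathcal{BV}$ points), and compares it with eq.~(214):

```python
import numpy as np

rng = np.random.default_rng(4)
t = 5
d = t + 1

A = rng.standard_normal((d, d))
K = A @ A.T + d * np.eye(d)             # SPD stand-in for K_{t+1}
Q = np.linalg.inv(K)
B0 = rng.standard_normal((d, d))
C = B0 @ B0.T                           # stand-in for C_{t+1}
alpha = rng.standard_normal(d)

Qs1, Cs1 = Q[:, t], C[:, t]
qs, cs = Q[t, t], C[t, t]

# pruned parameters, eq. (208)
alpha_hat = alpha - alpha[t] / (qs + cs) * (Qs1 + Cs1)
C_hat = C + np.outer(Qs1, Qs1) / qs - np.outer(Qs1 + Cs1, Qs1 + Cs1) / (qs + cs)

# direct Gaussian KL with mean K alpha and covariance K + K C K
Sig = K + K @ C @ K
Sig_hat = K + K @ C_hat @ K
Sinv = np.linalg.inv(Sig)
dmu = K @ (alpha_hat - alpha)
twoKL = (np.trace(Sinv @ Sig_hat) - d + dmu @ Sinv @ dmu
         + np.linalg.slogdet(Sig)[1] - np.linalg.slogdet(Sig_hat)[1])

# eq. (214): s^* is the last diagonal element of (C^{-1} + K)^{-1}
s_star = np.linalg.inv(np.linalg.inv(C) + K)[t, t]
score = alpha[t] ** 2 / (qs + cs) - s_star / qs + np.log(1 + cs / qs)

assert np.isclose(twoKL, score)
```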


\subsection{Updates for $\boldsymbol{S}_{t+1} = \left(\boldsymbol{C}_{t+1}^{-1}+\boldsymbol{K}_{t+1}\right)^{-1}$}

Matrix inversion is numerically sensitive and we try to avoid it. Computing the score of a given $\mathcal{BV}$ element in the previous section, eq.~(214), requires the diagonal elements of the matrix $\boldsymbol{S}=\left(\boldsymbol{C}^{-1}+\boldsymbol{K}\right)^{-1}$. In this section we sketch an iterative update rule for the matrix $\boldsymbol{S}$, and an update for when the KL-optimal removal of the last $\mathcal{BV}$ element is performed.

First we establish the update rule for the inverse of the matrix $\boldsymbol{C}_{t+1}$. Using the matrix inversion lemma and the update from eq.~(57), the matrix $\boldsymbol{C}_{t+1}^{-1}$ is

\begin{equation*}
\boldsymbol{C}_{t+1}^{-1} =
\begin{bmatrix}
\boldsymbol{C}_t^{-1} & -\boldsymbol{k}_{t+1} \\
-\boldsymbol{k}_{t+1}^T & \left(r^{(t+1)}\right)^{-1} + \boldsymbol{k}_{t+1}^T\boldsymbol{C}_t\boldsymbol{k}_{t+1}
\end{bmatrix}
\tag{215}
\end{equation*}

then we combine the above relation with the block decomposition of the kernel matrix; observing that the off-diagonal $t\times 1$ block cancels, we have

\begin{displaymath}
\left(\boldsymbol{C}_{t+1}^{-1}+\boldsymbol{K}_{t+1}\right)^{-1}
= \begin{bmatrix}
\left(\boldsymbol{C}_t^{-1}+\boldsymbol{K}_t\right)^{-1} & \boldsymbol{0} \\
\boldsymbol{0}^T & a^{-1}
\end{bmatrix}
\qquad\text{where}\qquad
a = \left(r^{(t+1)}\right)^{-1} + \boldsymbol{k}_{t+1}^T\boldsymbol{C}_t\boldsymbol{k}_{t+1} + k^*
\end{displaymath}

and this shows that the update for the matrix $\boldsymbol{S}_{t+1}$ is particularly simple: we only need to add a value as the last diagonal element.
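The cancellation can be illustrated numerically. The sketch below assumes the rank-one form of the update from eq.~(57), $\boldsymbol{C}_{t+1} = \begin{bmatrix}\boldsymbol{C}_t & \boldsymbol{0}\\ \boldsymbol{0}^T & 0\end{bmatrix} + r^{(t+1)}\boldsymbol{s}\boldsymbol{s}^T$ with $\boldsymbol{s} = [(\boldsymbol{C}_t\boldsymbol{k}_{t+1})^T\;\, 1]^T$ (an assumption about eq.~(57), consistent with eq.~(215)); all quantities are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(5)
t = 4

G = rng.standard_normal((t, t))
Ct = G @ G.T + np.eye(t)              # invertible stand-in for C_t
H = rng.standard_normal((t, t))
Kt = H @ H.T + np.eye(t)              # stand-in for K_t
k = rng.standard_normal(t)            # stand-in for k_{t+1}
kstar = 5.0                           # stand-in for k^*
r = 0.7                               # stand-in for r^{(t+1)}

# assumed rank-one form of the update (57)
s = np.concatenate([Ct @ k, [1.0]])
C1 = np.zeros((t + 1, t + 1))
C1[:t, :t] = Ct
C1 += r * np.outer(s, s)

# eq. (215) for the inverse
C1inv = np.block([
    [np.linalg.inv(Ct), -k[:, None]],
    [-k[None, :], np.array([[1.0 / r + k @ Ct @ k]])],
])
assert np.allclose(np.linalg.inv(C1), C1inv)

# adding K_{t+1} cancels the off-diagonal block ...
K1 = np.block([[Kt, k[:, None]], [k[None, :], np.array([[kstar]])]])
M = C1inv + K1
assert np.allclose(M[:t, t], 0.0)

# ... so S_{t+1} only gains one diagonal element
a = 1.0 / r + k @ Ct @ k + kstar
S1 = np.linalg.inv(M)
assert np.allclose(S1[t, t], 1.0 / a)
assert np.allclose(S1[:t, :t], np.linalg.inv(np.linalg.inv(Ct) + Kt))
```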

When removing a $\mathcal{BV}$ element, however, the resulting matrix is no longer block-diagonal. To obtain an update quadratic in the size of $\boldsymbol{S}$, we use the matrix inversion lemma

\begin{displaymath}
\boldsymbol{S}_{t+1} = \boldsymbol{Q}_{t+1}
- \boldsymbol{Q}_{t+1}\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}\boldsymbol{Q}_{t+1}
\end{displaymath}

and after the pruning we are looking for the $t\times t$ matrix $\hat{\boldsymbol{S}}_{t+1} = \left(\hat{\boldsymbol{C}}_{t+1}^{-1}+\boldsymbol{K}_t\right)^{-1}$. We can obtain this by using eq.~(206): the pruned $\left(\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t\right)^{-1}$ is the matrix obtained by cutting the last row and column from $\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}$. The computation of the updated matrix $\hat{\boldsymbol{S}}_{t+1}$ thus has three steps:

\begin{enumerate}
\item compute
\begin{displaymath}
\left(\boldsymbol{C}_{t+1}+\boldsymbol{Q}_{t+1}\right)^{-1}
= \boldsymbol{K}_{t+1} - \boldsymbol{K}_{t+1}\boldsymbol{S}_{t+1}\boldsymbol{K}_{t+1}
\end{displaymath}
\item obtain the reduced matrix $\left(\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t\right)^{-1}$ by trimming the last row and column, using eq.~(206);
\item compute the updated $\hat{\boldsymbol{S}}_{t+1}$ using
\begin{displaymath}
\hat{\boldsymbol{S}}_{t+1} = \boldsymbol{Q}_t
- \boldsymbol{Q}_t\left(\hat{\boldsymbol{C}}_{t+1}+\boldsymbol{Q}_t\right)^{-1}\boldsymbol{Q}_t
\end{displaymath}
\end{enumerate}
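The three steps above can be sketched as a short routine; \texttt{prune\_S} is a hypothetical helper name and the matrices are random stand-ins, so this is an illustration rather than the thesis implementation:

```python
import numpy as np

def prune_S(S, K):
    """KL-optimal removal of the last BV element from S = (C^{-1} + K)^{-1}."""
    CQ_inv = K - K @ S @ K            # step 1: (C_{t+1} + Q_{t+1})^{-1}
    CQ_inv_r = CQ_inv[:-1, :-1]       # step 2: trim last row/column
    Qt = np.linalg.inv(K[:-1, :-1])   # Q_t = K_t^{-1}
    return Qt - Qt @ CQ_inv_r @ Qt    # step 3: reduced S

rng = np.random.default_rng(6)
t = 5
A = rng.standard_normal((t + 1, t + 1))
K = A @ A.T + (t + 1) * np.eye(t + 1)   # SPD stand-in for K_{t+1}
B0 = rng.standard_normal((t + 1, t + 1))
C = B0 @ B0.T                           # stand-in for C_{t+1}
Q = np.linalg.inv(K)
S = np.linalg.inv(np.linalg.inv(C) + K)

# the matrix-inversion-lemma form of S used in step 1
assert np.allclose(S, Q - Q @ np.linalg.inv(C + Q) @ Q)

# compare against the reduced covariance parameter from eq. (207)
Qs, qs = Q[:t, t], Q[t, t]
Cs, cs = C[:t, t], C[t, t]
Chat = (C[:t, :t] + np.outer(Qs, Qs) / qs
        - np.outer(Qs + Cs, Qs + Cs) / (qs + cs))
Qt = np.linalg.inv(K[:t, :t])
Shat_direct = Qt - Qt @ np.linalg.inv(Chat + Qt) @ Qt

Shat = prune_S(S, K)
assert np.allclose(Shat, Shat_direct)
```

In practice $\boldsymbol{Q}_t$ is already available from the iterative updates of the algorithm, so step 3 does not require a new matrix inversion.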