

Iterative computation of the inverse Gram matrix

In obtaining sparsity in eqs. (78) and (80) we need the inverse Gram matrix of the ${\cal BV}$ set. In the following, the elements of the ${\cal BV}$ set are indexed from 1 to $t$. Using the matrix inversion formula, eq. (182), a new element is added sequentially; this technique is well known and is commonly used, for example, in the Kalman filter algorithm. We place the new element at the end (last row and column) of the matrix ${\boldsymbol K}_{t+1}$, which is decomposed as

\begin{displaymath}
  {\boldsymbol K}_{t+1} =
  \begin{bmatrix}
    {\boldsymbol K}_t & {\boldsymbol k}_{t+1} \\
    {\boldsymbol k}_{t+1}^T & k^*_{t+1}
  \end{bmatrix}
\end{displaymath} (188)

Assuming ${\boldsymbol K}_t^{-1}$ is known and applying the matrix inversion lemma to ${\boldsymbol K}_{t+1}$:

\begin{displaymath}
  {\boldsymbol K}_{t+1}^{-1} =
  \begin{bmatrix}
    {\boldsymbol K}_t^{-1} + \gamma_{t+1}^{-1}\,{\boldsymbol K}_t^{-1}{\boldsymbol k}_{t+1}{\boldsymbol k}_{t+1}^T{\boldsymbol K}_t^{-1} & -\gamma_{t+1}^{-1}\,{\boldsymbol K}_t^{-1}{\boldsymbol k}_{t+1} \\
    -\gamma_{t+1}^{-1}\,{\boldsymbol k}_{t+1}^T{\boldsymbol K}_t^{-1} & \gamma_{t+1}^{-1}
  \end{bmatrix}
\end{displaymath} (189)

where $\gamma_{t+1} = k^*_{t+1} - {\boldsymbol k}_{t+1}^T{\boldsymbol K}_t^{-1}{\boldsymbol k}_{t+1}$ is the squared distance of the last feature vector from the linear span of all previous ones (see Section 3.1). Introducing the notations ${\boldsymbol K}_t^{-1}{\boldsymbol k}_{t+1} = \hat{\boldsymbol e}_{t+1}$ from eq. (62), ${\boldsymbol K}_t^{-1} = {\boldsymbol Q}_t$, and ${\boldsymbol K}_{t+1}^{-1} = {\boldsymbol Q}_{t+1}$, we have the recursion

\begin{displaymath}
  {\boldsymbol Q}_{t+1} =
  \begin{bmatrix}
    {\boldsymbol Q}_t + \gamma_{t+1}^{-1}\hat{\boldsymbol e}_{t+1}\hat{\boldsymbol e}_{t+1}^T & -\gamma_{t+1}^{-1}\hat{\boldsymbol e}_{t+1} \\
    -\gamma_{t+1}^{-1}\hat{\boldsymbol e}_{t+1}^T & \gamma_{t+1}^{-1}
  \end{bmatrix}
\end{displaymath} (190)

and in a more compact matrix notation:

\begin{displaymath}
  {\boldsymbol Q}_{t+1} = {\boldsymbol Q}_t + \gamma_{t+1}^{-1}\left(\hat{\boldsymbol e}_{t+1} - {\boldsymbol e}_{t+1}\right)\left(\hat{\boldsymbol e}_{t+1} - {\boldsymbol e}_{t+1}\right)^T
\end{displaymath} (191)

where ${\boldsymbol e}_{t+1}$ is the $(t+1)$-th unit vector and ${\boldsymbol Q}_t$ and $\hat{\boldsymbol e}_{t+1}$ are extended with zeros to size $t+1$. With this recursion all explicit matrix inversion is eliminated (the result holds for general block matrices; such an implementation, together with an interpretation of the parameters, has also been given in [10]). The rule of adding only elements with $\gamma_{t+1} > 0$ guarantees the non-singularity of the Gram matrix (see Fig. 3.1).
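As an illustration of the recursion, the following NumPy sketch (not part of the thesis; the RBF kernel, the random inputs and all variable names are assumptions made for the example) builds ${\boldsymbol Q}_{t+1}$ from ${\boldsymbol Q}_t$ via eq. (191) and checks it against a direct inversion of ${\boldsymbol K}_{t+1}$.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b):
    # illustrative RBF kernel, unit lengthscale
    return np.exp(-0.5 * np.sum((a - b) ** 2))

X = rng.normal(size=(5, 2))                  # current BV set (t = 5)
x_new = rng.normal(size=2)                   # candidate element t + 1

K_t = np.array([[rbf(a, b) for b in X] for a in X])
k_new = np.array([rbf(a, x_new) for a in X])             # k_{t+1}
k_star = rbf(x_new, x_new)                               # k*_{t+1}

Q_t = np.linalg.inv(K_t)
e_hat = Q_t @ k_new                                      # \hat{e}_{t+1}
gamma = k_star - k_new @ e_hat                           # squared residual

# eq. (191): zero-pad Q_t and \hat{e}_{t+1}, then add the rank-one term
v = np.append(e_hat, -1.0)                               # \hat{e}_{t+1} - e_{t+1}
Q_next = np.zeros((6, 6))
Q_next[:5, :5] = Q_t
Q_next += np.outer(v, v) / gamma

K_next = np.block([[K_t, k_new[:, None]],
                   [k_new[None, :], np.array([[k_star]])]])
assert np.allclose(Q_next, np.linalg.inv(K_next))
\end{verbatim}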


Computing determinants

The block decomposition of the Gram matrix in eq. (188) allows a recursive expression for its determinant. Using eq. (184), we have

\begin{displaymath}
  \left|{\boldsymbol K}_{t+1}\right| = \left|{\boldsymbol K}_t\right|\,\left| k^*_{t+1} - {\boldsymbol k}_{t+1}^T{\boldsymbol K}_t^{-1}{\boldsymbol k}_{t+1}\right| = \left|{\boldsymbol K}_t\right|\,\gamma_{t+1}
\end{displaymath} (192)

where $\gamma_{t+1}$ is the squared distance of the new input from the subspace spanned by all previous inputs.
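A minimal numerical check of eq. (192), with made-up values (the sketch is illustrative and not part of the thesis):

\begin{verbatim}
import numpy as np

K_t = np.array([[1.0, 0.3], [0.3, 1.0]])        # toy Gram matrix
k_new = np.array([0.2, 0.5])
k_star = 1.0

gamma = k_star - k_new @ np.linalg.inv(K_t) @ k_new
K_next = np.block([[K_t, k_new[:, None]],
                   [k_new[None, :], np.array([[k_star]])]])

assert np.isclose(np.linalg.det(K_next), np.linalg.det(K_t) * gamma)
\end{verbatim}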


Updates for the Cholesky factorisation

For numerical stability we can use the Cholesky factorisation of the inverse Gram matrix ${\boldsymbol Q}$. Writing ${\boldsymbol R}$ for the lower-triangular factor with the corresponding time index, and using the identity ${\boldsymbol Q} = {\boldsymbol R}^T{\boldsymbol R}$, the update of the Cholesky factor when a new element is added is

\begin{displaymath}
  {\boldsymbol R}_{t+1} =
  \begin{bmatrix}
    {\boldsymbol R}_t & {\boldsymbol 0} \\
    -\gamma_{t+1}^{-1/2}\hat{\boldsymbol e}_{t+1}^T & \gamma_{t+1}^{-1/2}
  \end{bmatrix}
\end{displaymath} (193)

which is a computationally very inexpensive operation, requiring no additional computation provided the quantities $\gamma_{t+1}$ and $\hat{\boldsymbol e}_{t+1}$ are already available.
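The next sketch (again illustrative only, with an assumed kernel and random data) grows the lower-triangular factor via eq. (193) starting from a single element, and verifies ${\boldsymbol R}_{t+1}^T{\boldsymbol R}_{t+1} = {\boldsymbol Q}_{t+1}$ at every step.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
points = rng.normal(size=(6, 2))

def rbf(a, b):
    return np.exp(-0.5 * np.sum((a - b) ** 2))

bv = [points[0]]
K = np.array([[rbf(points[0], points[0])]])
Q = np.linalg.inv(K)
R = np.array([[1.0 / np.sqrt(K[0, 0])]])        # R^T R = Q for t = 1

for x in points[1:]:
    k_new = np.array([rbf(b, x) for b in bv])
    k_star = rbf(x, x)
    e_hat = Q @ k_new
    gamma = k_star - k_new @ e_hat
    t = len(bv)

    # eq. (193): append one row and one column to the triangular factor
    R = np.block([[R, np.zeros((t, 1))],
                  [-(e_hat / np.sqrt(gamma))[None, :],
                   np.array([[1.0 / np.sqrt(gamma)]])]])

    # eq. (191): matching rank-one update of Q, used for the check below
    v = np.append(e_hat, -1.0)
    Q = np.block([[Q, np.zeros((t, 1))],
                  [np.zeros((1, t)), np.zeros((1, 1))]]) + np.outer(v, v) / gamma

    bv.append(x)
    K = np.block([[K, k_new[:, None]],
                  [k_new[None, :], np.array([[k_star]])]])
    assert np.allclose(R.T @ R, Q) and np.allclose(Q, np.linalg.inv(K))
\end{verbatim}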

In Chapter 3 the diagonal elements of the inverse Gram matrix are used in establishing the score of an element of the ${\cal BV}$ set, and the columns of the same inverse Gram matrix are used in updating the GP elements. Fixing the column index to $l$, the $l$-th diagonal element and the $l$-th column are expressed as

\begin{displaymath}
  \begin{split}
    \gamma_{l} &= (r^*)^{2} + \tilde{\boldsymbol r}_{t-l}^T\tilde{\boldsymbol r}_{t-l} \\
    {\boldsymbol q}_{l} &=
    \begin{bmatrix}
      r^*\,{\boldsymbol r}_{l} + \tilde{\boldsymbol Q}_{t-l}^T\tilde{\boldsymbol r}_{t-l} \\
      (r^*)^{2} + \tilde{\boldsymbol r}_{t-l}^T\tilde{\boldsymbol r}_{t-l} \\
      \tilde{\boldsymbol R}_{t-l}^T\tilde{\boldsymbol r}_{t-l}
    \end{bmatrix}
  \end{split}
\end{displaymath} (194)

where we used the decomposition of ${\boldsymbol R}_{t+1}$ along the $l$-th row and column

\begin{displaymath}
  {\boldsymbol R}_{t+1} =
  \begin{bmatrix}
    {\boldsymbol R}_{l} & {\boldsymbol 0}_{l} & {\boldsymbol 0} \\
    {\boldsymbol r}_{l}^T & r^* & {\boldsymbol 0}^T \\
    \tilde{\boldsymbol Q}_{t-l} & \tilde{\boldsymbol r}_{t-l} & \tilde{\boldsymbol R}_{t-l}
  \end{bmatrix}
\end{displaymath}

with ${\boldsymbol R}_{l}$ and $\tilde{\boldsymbol R}_{t-l}$ lower triangular,

and the update of the Cholesky factor when the $l$-th element is removed is written as

\begin{displaymath}
  {\boldsymbol R}_{t\setminus l} =
  \begin{bmatrix}
    {\boldsymbol R}_{l} & {\boldsymbol 0} \\
    {\boldsymbol U}^{-1}\!\left(\tilde{\boldsymbol Q}_{t-l} - \dfrac{\tilde{\boldsymbol r}_{t-l}{\boldsymbol r}_{l}^T}{r^*}\right) & {\boldsymbol U}^{-1}\tilde{\boldsymbol R}_{t-l}
  \end{bmatrix}
\end{displaymath} (195)

with ${\boldsymbol U}$ the lower-triangular Cholesky factor of ${\boldsymbol I}_{t-l} + \tilde{\boldsymbol r}_{t-l}\tilde{\boldsymbol r}_{t-l}^T/(r^*)^{2}$. This matrix is always positive definite and ${\boldsymbol U}$ can be obtained cheaply by a rank-one Cholesky update (in MATLAB, for example, with the function cholupdate applied to the identity matrix and the vector $\tilde{\boldsymbol r}_{t-l}/r^*$, noting that cholupdate returns the upper-triangular factor).
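To make the last two formulae concrete, the following sketch (an illustration under the reconstruction above; the data, the 0-based index $l$ and all variable names are assumptions) reads the $l$-th diagonal element and column of ${\boldsymbol Q}$ off the factor as in eq. (194), and performs the removal of eq. (195), checking the result against the inverse of the reduced Gram matrix.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 2))

def rbf(a, b):
    return np.exp(-0.5 * np.sum((a - b) ** 2))

K = np.array([[rbf(a, b) for b in X] for a in X])
Q = np.linalg.inv(K)

# lower-triangular R with R^T R = Q ("reverse" Cholesky via index flipping)
R = np.linalg.cholesky(Q[::-1, ::-1]).T[::-1, ::-1]

l = 2                                            # element to inspect / remove
R_l, r_l, r_star = R[:l, :l], R[l, :l], R[l, l]
S, r_tilde, R_tilde = R[l+1:, :l], R[l+1:, l], R[l+1:, l+1:]   # S = \tilde{Q}_{t-l}

# eq. (194): l-th diagonal element and l-th column of Q from the factor
gamma_l = r_star**2 + r_tilde @ r_tilde
q_l = np.concatenate([r_star * r_l + S.T @ r_tilde, [gamma_l], R_tilde.T @ r_tilde])
assert np.isclose(gamma_l, Q[l, l]) and np.allclose(q_l, Q[:, l])

# eq. (195): Cholesky factor after removing element l
m = len(r_tilde)
U = np.linalg.cholesky(np.eye(m) + np.outer(r_tilde, r_tilde) / r_star**2)
R_removed = np.block([
    [R_l, np.zeros((l, m))],
    [np.linalg.solve(U, S - np.outer(r_tilde, r_l) / r_star),
     np.linalg.solve(U, R_tilde)],
])

K_reduced = np.delete(np.delete(K, l, axis=0), l, axis=1)
assert np.allclose(R_removed.T @ R_removed, np.linalg.inv(K_reduced))
\end{verbatim}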