
Linear algebra

This module contains a function marginalized_log_likelihood which computes the log-likelihood of a microlensing model marginalized over the linear flux parameters \(\boldsymbol\beta\equiv(F_s,F_b)^\intercal\). This function is intended to be used as a replacement for the standard likelihood function when optimizing the likelihood or running MCMC.

Marginalization is standard practice when fitting microlensing events because it reduces the dimensionality of the fit by the number of linear parameters. For light curves consisting of multiple independent observations (for example, those obtained by different observatories) this is a necessary step because there can be dozens of linear parameters in the model. Traditionally this is done in microlensing by solving for the linear parameters conditional on fixed values of the nonlinear parameters using linear least squares. This procedure is justified in some circumstances, but it is not valid if the data covariance matrix is dense (for example, if we are using a Gaussian Process to model correlated noise in the light curves or stellar variability of the source star). Below, I show how to analytically marginalize over the linear parameters in the general case.

The observed flux \(\mathbf f\) can be written as a linear model \begin{equation} \mathbf f = \mathbf M\,\boldsymbol\beta + \boldsymbol\epsilon \end{equation} where \(\boldsymbol\epsilon\) is Gaussian noise with covariance matrix \(\mathbf C\), and \(\mathbf M\) is the design matrix, which depends on the nonlinear parameters \(\boldsymbol\theta\):

\[\begin{equation} \mathbf{M}\equiv \begin{pmatrix} \tilde{A}(t_1;\boldsymbol\theta) & 1 \\ \tilde{A}(t_2;\boldsymbol\theta) & 1 \\ \vdots & \vdots \\ \tilde{A}(t_{N};\boldsymbol\theta) & 1 \end{pmatrix} \end{equation}\]
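As a concrete sketch, the design matrix takes a couple of lines of numpy to assemble; the magnification function below is a hypothetical stand-in for \(\tilde A(t;\boldsymbol\theta)\), here the standard point-source point-lens formula:

```python
import numpy as np

def magnification(t, t0=0.0, tE=10.0, u0=0.1):
    # Point-source point-lens magnification, used here only as a
    # placeholder for the model magnification A~(t; theta).
    u = np.sqrt(u0**2 + ((t - t0) / tE) ** 2)
    return (u**2 + 2.0) / (u * np.sqrt(u**2 + 4.0))

t = np.linspace(-20.0, 20.0, 200)           # observation times t_1..t_N
A = magnification(t)                        # magnification column
M = np.column_stack([A, np.ones_like(A)])   # design matrix, shape (N, 2)
```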

Assuming that we place a Gaussian prior on the linear parameters \(\boldsymbol\beta\) with mean \(\boldsymbol\mu\) and covariance \(\boldsymbol\Lambda\), one can show that the marginal log-likelihood \(\mathrm{LL}(\boldsymbol\theta)\equiv\ln\int p(\mathbf f|\boldsymbol\beta,\boldsymbol\theta)\,p(\boldsymbol\beta)\,\mathrm{d}\boldsymbol\beta\) is given, up to an additive constant, by

\[\begin{equation} \mathrm{LL}(\boldsymbol\theta)=-\frac{1}{2}\left( \mathbf{f}-\mathbf{M}\boldsymbol{\mu}\right)^\intercal\left(\mathbf{C} + \mathbf{M}\boldsymbol{\Lambda}\mathbf{M}^\intercal\right)^{-1} \left( \mathbf{f}-\mathbf{M}\boldsymbol{\mu}\right) - \frac{1}{2}\ln\left| \mathbf{C} + \mathbf{M}\boldsymbol{\Lambda}\mathbf{M}^\intercal\right| \end{equation}\]
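As a sanity check, here is a minimal numpy sketch that evaluates this expression directly, with the same additive constant dropped; f, M, C, mu, and Lam are the quantities defined above, passed in as arrays:

```python
import numpy as np

def marginal_ll_direct(f, M, C, mu, Lam):
    # Covariance of the marginal likelihood: C + M Lam M^T, shape (N, N).
    K = C + M @ Lam @ M.T
    r = f - M @ mu
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * r @ np.linalg.solve(K, r) - 0.5 * logdet
```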

To compute the inverse and the determinant of the covariance matrix of the marginalized likelihood we can use the matrix inversion lemma (see Appendix A.3 of Rasmussen & Williams):

\[\begin{align} & \left(\mathbf{C}+\mathbf{M} \boldsymbol{\Lambda} \mathbf{M}^{\intercal}\right)^{-1}=\mathbf{C}^{-1}-\mathbf{C}^{-1} \mathbf{M}\left(\boldsymbol{\Lambda}^{-1}+\mathbf{M}^{\intercal} \mathbf{C}^{-1} \mathbf{M}\right)^{-1} \mathbf{M}^{\intercal} \mathbf{C}^{-1} \\ & \ln\left|\mathbf{C}+\mathbf{M} \mathbf{\Lambda} \mathbf{M}^{\intercal}\right|=\ln|\mathbf{C}| +\ln|\boldsymbol{\Lambda}| + \ln\left|\boldsymbol{\Lambda}^{-1}+\mathbf{M}^{\intercal} \mathbf{C}^{-1} \mathbf{M}\right| \end{align}\]
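A sketch of the same computation using the lemma: it only requires solves against \(\mathbf C\) (trivial when \(\mathbf C\) is diagonal, stored here as a vector of inverse variances) plus a small \(2\times2\) solve. It should agree with the direct evaluation above up to floating-point error:

```python
import numpy as np

def marginal_ll_lemma(f, M, C_inv_diag, mu, Lam_inv):
    # Marginal log-likelihood via the matrix inversion and determinant
    # lemmas, assuming a diagonal data covariance given as inverse variances.
    r = f - M @ mu
    Cinv_r = C_inv_diag * r                          # C^{-1} (f - M mu)
    S = Lam_inv + M.T @ (C_inv_diag[:, None] * M)    # Lam^{-1} + M^T C^{-1} M
    tmp = M.T @ Cinv_r
    quad = r @ Cinv_r - tmp @ np.linalg.solve(S, tmp)
    # ln|C + M Lam M^T| = ln|C| + ln|Lam| + ln|S|
    logdet = (-np.sum(np.log(C_inv_diag))
              - np.linalg.slogdet(Lam_inv)[1]
              + np.linalg.slogdet(S)[1])
    return -0.5 * quad - 0.5 * logdet
```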

However, if we marginalize over the linear parameters, how can we recover their values in order to, for example, plot the model fit? The answer is that the distribution over \(\boldsymbol\beta\) conditional on particular values of the nonlinear parameters \(\boldsymbol\theta\) is a Gaussian \(\mathcal{N}(\boldsymbol\beta;\mathbf a,\mathbf A)\) where

\[\begin{align} &\mathbf{A}^{-1}=\boldsymbol\Lambda^{-1}+\mathbf{M}^{\intercal}\mathbf{C}^{-1}\mathbf{M} \\ &\mathbf a = \mathbf A\left(\boldsymbol\Lambda^{-1}\boldsymbol\mu+\mathbf{M}^{\intercal}\mathbf{C}^{-1}\mathbf{f}\right) \end{align}\]

If we had posterior samples of the nonlinear parameters \(\boldsymbol\theta\), we could use this distribution to generate samples of the linear parameters.
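For instance, a sketch of drawing one \(\boldsymbol\beta\) sample conditional on a given \(\boldsymbol\theta\), again assuming a diagonal data covariance stored as inverse variances:

```python
import numpy as np

def sample_beta(rng, f, M, C_inv_diag, mu, Lam_inv):
    # Posterior precision A^{-1} and mean a of beta conditional on theta.
    A_inv = Lam_inv + M.T @ (C_inv_diag[:, None] * M)
    A = np.linalg.inv(A_inv)
    a = A @ (Lam_inv @ mu + M.T @ (C_inv_diag * f))
    return rng.multivariate_normal(a, A)

# Usage: one draw of (F_s, F_b) per posterior sample of theta.
# rng = np.random.default_rng(0)
# beta = sample_beta(rng, f, M, C_inv_diag, mu, Lam_inv)
```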

In marginalized_log_likelihood I use the standard least-squares solution for \(\boldsymbol\beta\) if the data covariance matrix is diagonal because it is faster than the matrix operations in \(\mathrm{LL}(\boldsymbol\theta)\). Otherwise I use the full analytic marginalization.
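For reference, the diagonal-covariance least-squares solve amounts to the \(2\times2\) normal equations \(\left(\mathbf M^\intercal\mathbf C^{-1}\mathbf M\right)\boldsymbol\beta=\mathbf M^\intercal\mathbf C^{-1}\mathbf f\). A sketch (this is the flat-prior limit \(\boldsymbol\Lambda^{-1}\to 0\) of the conditional mean above, not necessarily the exact library implementation):

```python
import numpy as np

def weighted_least_squares(f, M, C_inv_diag):
    # Solve (M^T C^{-1} M) beta = M^T C^{-1} f for beta = (F_s, F_b).
    MTCinv = M.T * C_inv_diag        # rows of M^T scaled by inverse variances
    lhs = MTCinv @ M                 # (2, 2)
    rhs = MTCinv @ f                 # (2,)
    return np.linalg.solve(lhs, rhs)
```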

caustics.linalg

marginalized_log_likelihood(A_list, fobs_list, C_inv_list, dense_covariance=False, Lam_sd=10000.0)

Compute the log-likelihood for a microlensing light curve marginalized over the linear flux parameters \(\boldsymbol\beta\equiv(F_s, F_b)^\intercal\). The function takes a list of magnification vectors, a list of observed fluxes, and a list of inverse data covariance matrices as input; each element of a list represents an independent set of observations, for instance light curves from different observatories. The total log-likelihood is the sum of the log-likelihoods for each light curve.

If dense_covariance is False, the inverse data covariance matrices are assumed to be 1D vectors containing the elements on the diagonal. In that case, the most efficient way to marginalize over the linear parameters is to solve the linear least-squares problem conditional on fixed values of the nonlinear parameters (which determine the magnification). If dense_covariance is True, the inverse covariance matrices are assumed to be dense, and we have to compute the full marginal likelihood, which increases the computational cost of each likelihood evaluation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| A_list | list[array_like] | List of 1D magnification arrays, one per independent observation. | required |
| fobs_list | list[array_like] | List of 1D observed flux arrays. | required |
| C_inv_list | list[array_like] | List of inverse data covariance matrices. If dense_covariance is False, each element is a 1D vector containing the diagonal elements of the covariance matrix. If dense_covariance is True, each element is a dense matrix. | required |
| dense_covariance | bool | Flag indicating whether C_inv is dense or diagonal. Defaults to False. | False |
| Lam_sd | float | Standard deviation of the Gaussian prior on the linear flux parameters. The prior is a zero-mean independent Gaussian with standard deviation Lam_sd for both linear parameters. Defaults to 1e04. | 10000.0 |

Returns:

| Type | Description |
| --- | --- |
| tuple | If dense_covariance is False, returns a tuple \((\boldsymbol\beta, \mathrm{LL}(\boldsymbol\theta))\) containing the least-squares solution for the linear parameters and the log-likelihood. Otherwise, the first element of the tuple is 0. |
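A usage sketch for the diagonal-covariance case, using synthetic data and the documented return convention; in practice the magnification arrays would come from evaluating a caustics model at the observed times:

```python
import numpy as np
from caustics.linalg import marginalized_log_likelihood

rng = np.random.default_rng(0)

# Two synthetic light curves with F_s = 0.8, F_b = 0.2 and diagonal noise.
A1 = 1.0 + rng.random(100)          # model magnifications, light curve 1
A2 = 1.0 + rng.random(80)           # model magnifications, light curve 2
sig1 = 0.01 * np.ones(100)          # per-point flux uncertainties
sig2 = 0.02 * np.ones(80)
f1 = 0.8 * A1 + 0.2 + sig1 * rng.standard_normal(100)
f2 = 0.8 * A2 + 0.2 + sig2 * rng.standard_normal(80)

# Diagonal case: pass the inverse variances as 1D vectors.
beta, ll = marginalized_log_likelihood(
    [A1, A2], [f1, f2], [1.0 / sig1**2, 1.0 / sig2**2]
)
```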