Linear algebra
This module contains a function marginalized_log_likelihood
which computes
the log likelihood of a microlensing model marginalized over the linear flux
parameters \(\boldsymbol\beta\equiv(F_s,F_b)^\intercal\). This function is intended to be
used as a replacement for the standard likelihood function when optimizing
the likelihood or doing MCMC.
Marginalization is standard practice when fitting microlensing events because it reduces the dimensionality of the problem by the number of linear parameters. For light curves consisting of multiple independent observations (for example, those obtained by different observatories) this step is essential because the model can contain dozens of linear parameters. The traditional approach in microlensing is to solve for the linear parameters conditional on fixed values of the nonlinear parameters using linear least squares. This procedure is justified in some circumstances, but it is not valid if the data covariance matrix is dense (for example, if we are using a Gaussian Process to model correlated noise in the light curves or stellar variability of the source star). Below, I show how to analytically marginalize over the linear parameters in the general case.
The observed flux \(\mathbf f\) can be written as a linear model \begin{equation} \mathbf f = \mathbf M\,\boldsymbol\beta \end{equation} where \(\mathbf M\) is the design matrix, which depends on the nonlinear parameters \(\boldsymbol\theta\) through the magnification: \begin{equation} \mathbf M = \begin{pmatrix} A(t_1;\boldsymbol\theta) & 1\\ \vdots & \vdots\\ A(t_N;\boldsymbol\theta) & 1 \end{pmatrix}. \end{equation}
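As a concrete illustration of how the design matrix for a single light curve can be assembled, here is a minimal numpy sketch. The helper name `design_matrix` is made up for this example and is not part of caustics:

```python
import numpy as np

def design_matrix(A):
    """Design matrix M for the linear model f = F_s * A(t; theta) + F_b.

    A : 1D array of magnifications evaluated at the observation times.
    Returns an (N, 2) array whose columns are A and a vector of ones.
    """
    A = np.asarray(A)
    return np.column_stack([A, np.ones_like(A)])
```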
Assuming that we place a Gaussian prior on the linear parameters \(\boldsymbol\beta\) with mean \(\boldsymbol\mu\) and covariance \(\boldsymbol\Lambda\), one can show that the marginal log likelihood \(\mathrm{LL}(\boldsymbol\theta)\equiv\ln\int p(\mathbf f|\boldsymbol\beta,\boldsymbol\theta)\,p(\boldsymbol\beta)\,\mathrm d\boldsymbol\beta\) is given by \begin{equation} \mathrm{LL}(\boldsymbol\theta) = -\frac12(\mathbf f-\mathbf M\boldsymbol\mu)^\intercal\left(\mathbf C+\mathbf M\boldsymbol\Lambda\mathbf M^\intercal\right)^{-1}(\mathbf f-\mathbf M\boldsymbol\mu) -\frac12\ln\det\left(\mathbf C+\mathbf M\boldsymbol\Lambda\mathbf M^\intercal\right) -\frac N2\ln 2\pi \end{equation} where \(\mathbf C\) is the data covariance matrix and \(N\) is the number of data points.
To compute the inverse and the determinant of the covariance matrix of the marginalized likelihood we can use the matrix inversion and determinant lemmas (see Appendix A.3 of Rasmussen & Williams): \begin{align} \left(\mathbf C+\mathbf M\boldsymbol\Lambda\mathbf M^\intercal\right)^{-1} &= \mathbf C^{-1}-\mathbf C^{-1}\mathbf M\left(\boldsymbol\Lambda^{-1}+\mathbf M^\intercal\mathbf C^{-1}\mathbf M\right)^{-1}\mathbf M^\intercal\mathbf C^{-1},\\ \det\left(\mathbf C+\mathbf M\boldsymbol\Lambda\mathbf M^\intercal\right) &= \det\left(\boldsymbol\Lambda^{-1}+\mathbf M^\intercal\mathbf C^{-1}\mathbf M\right)\det(\boldsymbol\Lambda)\det(\mathbf C). \end{align}
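For reference, here is a minimal numpy sketch of the marginal log-likelihood evaluated with these identities. The function `marginal_ll` is a hypothetical helper written for illustration only; it is not the implementation in caustics.linalg:

```python
import numpy as np

def marginal_ll(M, f, C_inv, mu, Lam):
    """Marginal log-likelihood ln N(f; M mu, C + M Lam M^T), evaluated with the
    matrix inversion and determinant lemmas, assuming C_inv = C^{-1} is available."""
    N = f.size
    r = f - M @ mu
    # B = Lam^{-1} + M^T C^{-1} M is a small k x k matrix (k = number of linear parameters)
    B = np.linalg.inv(Lam) + M.T @ C_inv @ M
    # Quadratic form with (C + M Lam M^T)^{-1} expanded via the inversion lemma
    Cinv_r = C_inv @ r
    quad = r @ Cinv_r - (Cinv_r @ M) @ np.linalg.solve(B, M.T @ Cinv_r)
    # ln det(C + M Lam M^T) = ln det(B) + ln det(Lam) - ln det(C^{-1})
    _, logdet_B = np.linalg.slogdet(B)
    _, logdet_Lam = np.linalg.slogdet(Lam)
    _, logdet_Cinv = np.linalg.slogdet(C_inv)
    logdet = logdet_B + logdet_Lam - logdet_Cinv
    return -0.5 * (quad + logdet + N * np.log(2.0 * np.pi))
```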
However, if we marginalize over the linear parameters, how can we obtain their values in order to, for example, produce a plot of the model fit? The answer, from the paper linked above, is that the distribution of \(\boldsymbol\beta\) conditional on particular values of the nonlinear parameters \(\boldsymbol\theta\) is a Gaussian \(\mathcal{N}(\boldsymbol\beta;\mathbf a,\mathbf A)\) where \begin{align} \mathbf A &= \left(\boldsymbol\Lambda^{-1}+\mathbf M^\intercal\mathbf C^{-1}\mathbf M\right)^{-1},\\ \mathbf a &= \mathbf A\left(\mathbf M^\intercal\mathbf C^{-1}\mathbf f+\boldsymbol\Lambda^{-1}\boldsymbol\mu\right). \end{align}
If we had posterior samples of the nonlinear parameters \(\boldsymbol\theta\), we could use this distribution to generate samples of the linear parameters.
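As a sketch of that procedure (again with a hypothetical helper name, using numpy for illustration), one could draw \(\boldsymbol\beta\) for each posterior sample of \(\boldsymbol\theta\) as follows:

```python
import numpy as np

def sample_linear_params(M, f, C_inv, mu, Lam, rng):
    """Draw one sample of beta ~ N(a, A), conditional on the nonlinear
    parameters, which enter only through the design matrix M."""
    A_cov = np.linalg.inv(np.linalg.inv(Lam) + M.T @ C_inv @ M)  # A
    a = A_cov @ (M.T @ C_inv @ f + np.linalg.solve(Lam, mu))     # a
    return rng.multivariate_normal(a, A_cov)

# e.g. beta_sample = sample_linear_params(M, f, C_inv, mu, Lam, np.random.default_rng(0))
```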
In marginalized_log_likelihood
I use the standard least squares solution for \(\boldsymbol\beta\)
if the data covariance matrix is diagonal because it is faster than the matrix
operations needed to evaluate \(\mathrm{LL}(\boldsymbol\theta)\). Otherwise I use the full analytic marginalization.
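In the diagonal case, the conditional least-squares solution reduces to a small 2x2 solve per light curve. A minimal sketch under that assumption (the helper name is made up and this is not the library code):

```python
import numpy as np

def least_squares_beta(A, fobs, c_inv_diag):
    """Weighted least-squares solution for beta = (F_s, F_b) when the inverse
    data covariance is diagonal and given as the 1D vector c_inv_diag."""
    M = np.column_stack([A, np.ones_like(A)])
    MtC = M.T * c_inv_diag            # M^T C^{-1}, using the diagonal directly
    return np.linalg.solve(MtC @ M, MtC @ fobs)
```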
caustics.linalg
marginalized_log_likelihood(A_list, fobs_list, C_inv_list, dense_covariance=False, Lam_sd=10000.0)
Compute the log-likelihood for a microlensing light curve marginalized over the linear flux parameters \(\boldsymbol\beta\equiv(F_s, F_b)^\intercal\). The function takes a list of magnification vectors, a list of observed fluxes, and a list of inverse data covariance matrices as input; each element of a list represents an independent set of observations, for instance light curves from different observatories. The total log-likelihood is then a sum of the log-likelihoods for each light curve.
If dense_covariance
is False, the inverse data covariance matrices
are assumed to be 1D vectors containing the elements on the diagonal. In
that case, the most efficient way to marginalize over the linear parameters
in the likelihood is to solve the linear least squares problem conditional
on fixed values of the nonlinear parameters (which determine the
magnification). If dense_covariance
is True, the inverse covariance matrices
are assumed to be dense matrices. In that case we have to compute the full
marginal likelihood. This increases the computational cost of the likelihood
evaluation.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `A_list` | `list[array_like]` | List of 1D magnification arrays, one per independent observation. | *required* |
| `fobs_list` | `list[array_like]` | List of 1D observed flux arrays. | *required* |
| `C_inv_list` | `list[array_like]` | List of inverse data covariance matrices. If `dense_covariance` is `False`, each element is a 1D vector containing the diagonal elements; otherwise each element is a dense matrix. | *required* |
| `dense_covariance` | `bool` | Flag indicating whether the inverse data covariance matrices are dense. | `False` |
| `Lam_sd` | `float` | Standard deviation of the Gaussian prior on the linear flux parameters. The prior is assumed to be a zero-mean independent Gaussian with standard deviation `Lam_sd`. | `10000.0` |
Returns:

| Type | Description |
|---|---|
| `tuple` | If `dense_covariance` is `False`, a tuple \((\boldsymbol\beta, \mathrm{LL}(\boldsymbol\theta))\) containing the least-squares solution for the linear parameters and the log-likelihood. Otherwise, the first element of the tuple is 0. |
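To illustrate how one might call the function, here is a short sketch with made-up data for two light curves observed with diagonal (inverse) covariances; it assumes the module is importable as caustics.linalg and that plain numpy arrays are acceptable array_like inputs:

```python
import numpy as np
from caustics.linalg import marginalized_log_likelihood

# Two independent light curves (e.g. two observatories), diagonal noise.
A_list = [np.array([1.2, 3.5, 2.1]), np.array([1.1, 1.8])]          # magnifications A(t; theta)
fobs_list = [np.array([13.0, 36.2, 22.4]), np.array([12.1, 19.0])]  # observed fluxes
C_inv_list = [np.full(3, 1.0 / 0.1**2), np.full(2, 1.0 / 0.2**2)]   # 1/sigma^2 for each point

beta, ll = marginalized_log_likelihood(A_list, fobs_list, C_inv_list)
```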