# 10/23/14 Quadratic Estimators I

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

<latex> \documentclass[11pt]{article}

\def\hf{\frac12} \def\dcube#1{{d^{3}\mathbf{#1}}} \def\ddtau#1{{\frac{d#1} {d\tau}}} \def\bra#1{\langle #1|} \def\ket#1{|#1\rangle} \def\expval#1{\langle #1 \rangle} \def\rhor{\rho(\mathbf{r})}

\def\deltar{\delta(\mathbf{r})} \def\deltarp{\delta(\mathbf{r}^{\prime})} \def\deltaru#1{\delta_{#1}(\mathbf{r})}

\def\deltak{\delta(\mathbf{k})} \def\deltaku#1{\delta_{#1}(\mathbf{k})}

\def\fdeltak{\tilde{\delta}(\mathbf{k})} \def\fdeltakp{\tilde{\delta}(\mathbf{k}^{\prime})} \def\fdeltaku#1{\tilde{\delta}_{#1}(\mathbf{k})} \def\vec#1{\mathbf{#1}}

\newcount\colveccount \newcommand*\colvec{

       \global\colveccount#1
\begin{pmatrix}
\colvecnext


} \def\colvecnext#1{

       #1
\ifnum\colveccount>0
\\
\expandafter\colvecnext
\else
\end{pmatrix}
\fi


}

\usepackage{fullpage} \usepackage{amsmath} \usepackage{commath} \usepackage{amsfonts} \usepackage[margin=1in]{geometry} \usepackage{graphicx,rotating}

\begin{document}

\title{Quadratic Estimators: Part I}

\section{preamble} We have learned, in the previous lectures, about the power spectrum. By definitiion we have that, in general,

\begin{equation}

   \expval{\tilde{T}^{\*}_{b}(\mathbf{k}) \tilde{T}^{\*}_{b}(\mathbf{k}^{\prime})} =


(2\pi)^{3}\delta^{D}(\mathbf{k} - \mathbf{k}^{\prime})P(\mathbf{k}). \end{equation}

And for the CMB, that is for a 2 dimensional shell, we can describe the power spectrum by $C_{l}$ from from a spherical harmonic decompostion

\begin{equation}

   \expval{a_{lm}a_{l^{\prime}m^{\prime}}} = C_{l}\delta_{ll^{\prime}}\delta_{mm^{\prime}}


\end{equation}

We will now develop some formalism for quadratic estimators. This is especially useful for the power spectrum, a quadratic quantity.

\section{Quadratic Estimator Formalism} Let us take our data vector $\mathbf{x}$, given by,

\begin{equation}

   \mathbf{x} = \colvec{5}{\mathbf{x_{1}}}{\mathbf{x_{2}}}{.}{.}{.},


\end{equation}

where each ${\mathbf{x_{i}}}$ can be temperature data (${\mathbf{T_{i}}}$), visibility data (${\mathbf{V_{i}}}$), or some other quantity. The quadratic estimator is defined as

   \hat{p}_{\alpha} \propto \vec{x}^{\dagger}E^{\alpha}\vec{x} - b^{\alpha}.


\end{equation}

In equation \ref{eqn:quadest}, $\hat{}$ means the quantity hatted is the estimator, $p_{\alpha} \equiv P({k_{\alpha}})$, $\vec{x}$ is the data input vector, $E^{\alpha}$ is some matrix which is the data analysts choise, and $b^{\alpha}$ is the bias correction.

\section{What's with the ${E^{\alpha}?}$} Now that we have defined the quadratic estimator in equation \ref{eqn:quadest}, we can discuss what a sensible choise for $E^{\alpha}$ may be. The choice for $E^{\alpha}$ depends on the choice of basis $\vec{x}$ is expressed in. Note that $E^{\alpha}$ is a family of matrices, one for each $\alpha$. Therefore, to measure the the power at some $k$-mode $k_{i}$, we must use the $E^{\alpha}$ matrix that corresponds to that $k$-mode.

\subsection{example 1: $T(\vec{r_{i}})$}

   Suppose that the data vector $\vec{x} = ( T(\vec{r}_{1}), T(\vec{r}_{2}),  ...)$. Let us guess that

\begin{equation}

   E^{\alpha}_{ij} \sim e^{i\vec{k}_{\alpha}(\vec{r}_{i} - \vec{r}_{j})}.


\end{equation}

Why is this a good guess? Well we want to make an estimate of the power spectrum from temperature data. Back from the power spectrum lectures, we know that we need to fourier transform temperature data (a function of $\vec{r}$) in order to go into $k$-space. Hence $E^{\alpha}$ should look something like a fourier transform.

In order to check this, let us compute the estimator, \begin{align}

   \vec{x}^{\dagger} E^{\alpha} \vec{x} &= \sum_{ij}{x_{i}E_{ij}^{\alpha}x_{j}}\\
&= \sum_{ij}{T(\vec{r}_{i})e^{i\vec{k}_{\alpha}(\vec{r}_{i} - \vec{r}_{j})}T(\vec{r}_{j})}\\
&= ( \sum_{i}{e^{i\vec{k}_{\alpha}\vec{r}_{i}}T(\vec{r}_{i})} ) ( \sum_{j}{e^{i\vec{k}_{\alpha}\vec{r}_{j}}T(\vec{r}_{j})} )\\
&= \abs{\sum_{j}{e^{-i\vec{k} _{\alpha}\vec{r}_{j}}T(\vec{r}_{j})}}^{2}.


\end{align}

Hence, the choice of $E^{\alpha}$ above gives us a fourier tranform-ish result. This result would be a true fourier transform if the data were on a regular grid.

\subsection{example 2: $\tilde{T}(\vec{k})$}

   In this case $\vec{x} = ( \tilde{T}(\vec{k}_{1}), \tilde{T}(\vec{k}_{2}),  ...)$. We can then choose an $E^{\alpha}$ that extracts particular ${k}$-modes. Remember that $\alpha$ is a particular $k$-mode. Specifically, we have that $E^{alpha=k} = \delta{k}\delta_{i}\delta{j}$, so that the matrix is equal to 1 only when $i=j=k$.

\section{The general form of $E$}

Suppose that we have a power spectrum that was binned up. That is we model $P(k)$ as piecewise constant. The value of $P(k_{\alpha}) \equiv \alpha^{th}$ band power $\equiv P_{\alpha}$. Now to relate this to the data, consider the covariance matrix of the data: $C \equiv \expval{\vec{x}\vec{x}^{\dagger}}$; $C_{ij} = \expval{x_{i}x^{*}_{j}}$. The covariance matrix $C$ consists of two parts:

\begin{equation}

   C = N + \sum_{\alpha}{p_{\alpha}\frac{\partial{C}}{\partial{p_{\alpha}}}}
\equiv N  + \sum_{\alpha}{p_{\alpha}C_{,\alpha}},


\end{equation}

where the first term $N$ is the noise term and consists of instrumental noise, systematics, etc.... The second term, the partial derivative of the covariance matrix with respect to $\alpha$ is the crucial link between theory and data. This term is related to $E^{\alpha}$.

We note that the unvierse is not random and hence the power spectrum ($p_{\alpha}$) is not random. However, the quadratic estimator $\hat{p}_{\alpha}$ is random due to our instrument. Because of this we can measure the variance in our estimate of the power spectrum. Lets do that... The variance of the power spectrum in the $\alpha$-th bin is given by

\begin{align}

   \expval{\hat{p}_{\alpha}} &= \expval{\vec{x}^{\dagger}E^{\alpha}\vec{x}} -


b_{\alpha}\\

   &= \sum_{ij}\expval{x^{*}_{i}E_{ij}^{\alpha}x_{j}} - b_{\alpha}\\
&= \sum_{ij}E_{ij}^{\alpha}\expval{x_{j}x^{*}_{i}} - b_{\alpha}\\
&= \sum_{ij}E^{\alpha}_{ij}C_{ij} - b_{\alpha}\\
&= tr(E^{\alpha}C) - b_{\alpha}.


\end{align}

Plugging in for the covariance matrix, we get that \begin{align}

   \expval{\hat{p}_{\alpha}} &= tr[E^{\alpha}\sum_{\beta}p_{\beta}C_{,\beta}] +


tr[E^{\alpha}N] - b_{\alpha}\\

                             &= \sum_{\beta}tr[E^{\alpha}C_{,\beta}]p_{\beta} + tr[E^{\alpha}N] - b_{\alpha}.


\end{align}

Note that the second term, $tr(E^{\alpha}N)$ is the additive (noise) bias term. If we let $b_{\alpha} = tr[E^{\alpha}N]$ then we can correct fo this additive term. Hence, we are left with $\expval{\hat{p_{\alpha}}} = \sum{w_{\alpha\beta}p_{\beta}}$ where $w_{\alpha\beta} = tr[E^{\alpha}C_{,\beta}]$. This $w_{\alpha\beta}$ matrix is the window function matrix (or taper function matrix). The $\alpha$-th row of $w_{\alpha\beta}$ gives linear cominations of $P(k_{\beta})$ probed by $\hat{p}_{\alpha}$. This gives us horizontal errorbrs.

\section{Vertical Error Bars} The window (or taper) functions above gave us the horizontal error bars, which is to say for each power spectrum measurement in the $\alpha$-th bin, the amount other bins contributed to that measurement.

Now, we discuss how to get the vertical error bars in the power spectrum estimate, which is basically the covariance of $\hat{p}_{\alpha}$. Let us calculate this,

\begin{align}

   V_{\alpha\beta} &= \expval{\hat{p}_{\alpha}\hat{p}_{\beta}} -
\expval{\hat{p}_{\alpha}}\expval{\hat{p}_{\beta}}\\
&=


\sum_{ijkl}{\expval{x_{i}E_{ij}^{\alpha}x_{j}x_{k}E^{\beta}_{kl}x_{l}}} - \sum_{ijkl}{\expval{x_{i}x_{j}}E_{ij}^{\alpha}} \sum_{ijkl}{\expval{x_{k}x_{l}}E_{kl}^{\beta}}, \end{align}

where terms $\expval{x_{i}x_{j}}$ are $C_{ij}$. If we assume that x is Gaussian distributed, we can use wicks theorem (or Issirelli's Theorem; see wikipedia) to calculate the 4 point function above in terms of two point functions. This gives us that the vertical error bars are

\begin{equation}

   V_{\alpha\beta} = 2tr[CE^{\alpha}CE^{\beta}]


\end{equation}

\section{A Concrete Example : The 1-D Universe and a noiseless instrument} Consider a 1-D universe from $\frac{-L}{2}$ to $\frac{L}{2}$ with some pixelation (meaning discrete bins of size $\Delta{r}$). Then a measurement in this universe, $x_{i}$, is given by \begin{equation}

   x_{i} = \int{dr T(r) \Phi_{i}(r)}


\end{equation} where $\Phi_{i}$ is some pixelization function. We take it to be 1 inside the $i$-th cell and 0 outside. Let us now estimate $p_{\alpha}$ in this universe.

$$ \hat{p}_{\alpha} = \vec{x}^{\dagger}E^{\alpha}\vec{x} $$ Note that we are not including the noise bias $b_{\alpha}$ here and therefore are not considering noise.

A good choice for $E_{\alpha}$ in this situation would be one that is proportional to $C_{,\alpha}$. Previously we had that the expectaion value of the estimator is

\begin{align}

   \expval{\hat{p}} &= \sum_{ij}{E^{\alpha}_{ij}C_{ji}} \\
&\propto \sum_{ij}{(C_{,\alpha})_{ij}C_{ij}}


\end{align}

Now, lets compute the error bars. Lets start by first going into fourier space. Then, \begin{align}

   \vec{x}_{i} &= \int{dr\Phi_{i}(r)}\int{\frac{dk}{2\pi}\tilde{T}(k)e^{ikr}}\\
&= \int{\frac{dk}{2\pi}\tilde{T}(k)}\int_{r_{i} -


\frac{\Delta{r}}{2}}^{r_{i}+\frac{\Delta{r}}{2}}{e^{ikr}}\\

               &=


\int{\frac{dk}{2\pi}\tilde{T}(k)\Delta{r}e^{ikr_{i}}sinc(k\frac{\Delta{r}}{2})}, \end{align} by noting that $sinc(nx) = \frac{1}{2n}\int{e^{ixt}dt}$.

Then the covariance ($C_{ij} = \expval{x_{i}x_{j}^{*}}$) is given by,

\begin{align*}

   C_{ij} &=


\int{\frac{dk}{2\pi}\frac{dk^{'}}{2\pi}\expval{\tilde{T}(k)\tilde{T}^{*}(k^{'})}(\Delta{r})^{2}e^{ikr_{i}}e^{-ik^{'}r_{j}}sinc(k\frac{\Delta{r}}{2})sinc(k^{'}\frac{\Delta{r}}{2})}\\

          &=


\int{\frac{dkdk^{'}}{2\pi}\delta(k-k^{'})P(k)(\Delta{r})^{2}e^{ikr_{i}}e^{-ik^{'}r_{j}}sinc(k\frac{\Delta{r}}{2})sinc(k^{'}\frac{\Delta{r}}{2})}\\

          &=


\int{\frac{dk}{2\pi}P(k)(\Delta{r})^{2}e^{ik(r_{i}-r_{j})}[sinc(k\frac{\Delta{r}}{2})]^{2}}\\

          & \text{Then assuming $\Delta{k}$ is very small, and $P(k)$ is


piecewise and pulling $p_{\alpha}$ out of the integral,}\\

          &\approx


\sum_{\alpha}{p_{\alpha}\frac{\Delta{k}}{2\pi}(\Delta{r})^{2}e^{ik_{\alpha}(r_{i}-r_{j})}[sinc(k_{\alpha}\frac{\Delta{r}}{2})]^{2}} \end{align*}

Therefore, $C_{,\alpha} = \frac{\partial{C}}{\partial{p_{\alpha}}} \propto e^{ik_{\alpha}(r_{i} - r_{j}}[sinc(k_{\alpha}\frac{\Delta{r}}{2}]^{2}$. Plugging into the equation for the taper functions we get,

\begin{align*}

   w_{\alpha\beta} &\propto tr[C_{,\alpha}C_{,\beta}] \\
&\propto \abs{\sum_{\alpha\beta}{e^{i(k_{\alpha} -


k_{\beta})r_{i}}[sinc(k_{\alpha}\frac{\Delta{r}}{2}]^{2}][sinc(k_{\beta}\frac{\Delta{r}}{2}]^{2}}} \end{align*}

Remember, that the $\alpha$-th row tells us what linear combination of $k$ calue are probed by $\hat{p_{\alpha}}$. So, we want the $w_{\alpha\beta}$ large if $k_{\alpha} \approx k_{\beta}$ and that $w_{\alpha\beta}$ small if $k_{\alpha}$ is very different from $k_{\beta}$. This works out with the above window (taper) functions above. We then normalize $w_{\alpha\beta}$ so that each row sums to 1.

\end{document}