Random Sampling


Reference Materials

* [https://www.amazon.com/Applied-Computational-Physics-Joseph-Boudreau/dp/0198708645 Applied Computational Physics (Boudreau and Swanson)]

<latex>

\documentclass[12pt]{article}

\usepackage{amsmath} \usepackage{amssymb}

\begin{document}

\section{Random Sampling}\label{sec:rand_samp}

% Applied Comp. Physics

Random variables can be discrete or continuous. The former are represented by some discrete probability distribution, with a probability assigned to each value $x_i$ the random variable can take on. The latter are characterized by a probability density function (PDF), $\rho(x)$. Integrating $\rho(x) dx$ between some $x_1$ and $x_2$ gives the probability of the random variable taking on a value between $x_1$ and $x_2$. Both the distribution and the PDF are normalized -- the former obeys:

\begin{equation}\label{eq:normalization_disc} \sum_i P(x_i) = 1 \end{equation}

\noindent while the latter obeys the integral equation:

\begin{equation}\label{eq:normalization_cont} \int \rho(x)dx = 1 \end{equation}

\noindent For a multivariate distribution, the values $x, x_i \in \mathbb{R}$ map to some $\mathbf{x}, \mathbf{x}_i \in \mathbb{R}^n$, and the normalization condition of Eq.~\ref{eq:normalization_cont} becomes a multidimensional integral. Here we discuss the sampling of random variables given some underlying distribution. We will focus on continuous random variables.

\subsection{Uniform Distributions}

The most basic distribution is one that is uniform in $x$. We will see later that random numbers drawn from uniform distributions can be used to sample from distributions with more complicated forms. Most languages have some standard implementation of a uniform random number generator -- the C++ implementation is discussed below.

Random number generators are often not truly random; instead, they generate a completely deterministic sequence of numbers set by an integer (or set of integers) called a seed, which is supplied upon initialization of the generator. Generators of this type are called pseudorandom. This reproducibility is sometimes desired, but two generators initialized with the same seed produce identical sequences, so the programmer must change the seed whenever a statistically independent sample is required.

% Matsumoto 1998 https://dl.acm.org/doi/10.1145/272991.272995
The most widely used uniform pseudorandom number generator is the Mersenne twister, first described by Makoto Matsumoto and Takuji Nishimura in 1998 (Matsumoto and Nishimura, 1998). Its C++11 standard library implementation is $\texttt{std::mt19937}$. For every pseudorandom number generator, the number sequence begins to repeat after a certain number of values has been generated, called the cycle length. All widely used pseudorandom number generators have exceedingly long cycle lengths -- the cycle length of $\texttt{std::mt19937}$ is $2^{19937} - 1$.

Truly random numbers may be generated through measurements of stochastic natural processes, such as thermal or atmospheric noise or radioactive decay. The C++ $\texttt{std::random\_device}$ class implements a genuine random number generator, provided that a suitable non-deterministic source (e.g.\ environmental noise in device drivers) is available. This comes at the cost of some efficiency, however, so $\texttt{std::random\_device}$ is often used only to seed a pseudorandom number generator, which then produces the bulk of the random numbers.
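
As a minimal sketch of these facilities (using only the C++11 standard $\texttt{<random>}$ header), the following seeds a Mersenne twister from $\texttt{std::random\_device}$ and draws uniform values on $[0, 1)$:

\begin{verbatim}
#include <iostream>
#include <random>

int main() {
    // Seed the pseudorandom generator from a (possibly)
    // non-deterministic source.
    std::random_device rd;
    std::mt19937 gen(rd());

    // Uniform distribution on [0, 1).
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    for (int i = 0; i < 5; ++i)
        std::cout << uniform(gen) << "\n";
}
\end{verbatim}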

\subsection{Rejection Sampling}

One widely applicable method for sampling from a non-uniform distribution is the rejection method. This method is straightforward and works for any distribution in $\mathbb{R}^n$ with a PDF, but it can be highly inefficient for some classes of PDF. Making use of the method requires the following components:

\begin{enumerate} \item A target distribution with PDF $\rho(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^n$. \item A proposal distribution with PDF $r(\mathbf{x})$ that can be sampled from, for which there exists a constant $C \in \mathbb{R}$ such that $C r(\mathbf{x}) \ge \rho(\mathbf{x})$ for all $\mathbf{x}$. \item Some means of generating uniform (pseudo)random numbers. \end{enumerate}

\noindent Once one has these components, the target PDF can be sampled from as follows:

\begin{enumerate} \item Sample an $\mathbf{x}$ value from the proposal distribution. \item Uniformly sample on the interval $[0, C r(\mathbf{x})]$, and call the sampled value $y$. \item If $y < \rho(\mathbf{x})$, return $\mathbf{x}$. Otherwise, reject $\mathbf{x}$ and return to Step 1. \end{enumerate}

\noindent One can imagine the 1D version of this algorithm in action by envisioning a graph of the target distribution's PDF in the $x$--$y$ plane. If we assume a uniform proposal distribution, the target PDF will fit under some horizontal line representing $C r(x)$. Now uniformly generate a collection of points in the area under the horizontal line (rather like throwing darts at a rectangular dart board). The points that fall under the target PDF are accepted -- their $x$-values are sampled from the target distribution, with a higher rejection rate in regions of lower probability density.

The number of rejections will be roughly proportional to the area between the target and proposal PDFs (with the latter scaled by the constant). If this area is large, sampling will be very inefficient. Non-uniform proposal distributions that minimize this area can improve the overall efficiency of the algorithm by sharply reducing the number of rejections needed before a valid sample is obtained. Note that the PDFs here need not be normalized, so the rejection method can be used to sample from distributions for which a normalization constant cannot be easily obtained.
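
Below is a minimal 1D sketch of this procedure, assuming a uniform proposal on $[a, b]$ (so $r(x) = 1/(b-a)$) and an illustrative Gaussian-shaped target; the names $\texttt{target\_pdf}$ and $\texttt{rejection\_sample}$ are our own.

\begin{verbatim}
#include <cmath>
#include <random>

// Illustrative unnormalized target PDF on [0, 1];
// any bounded rho(x) works.
double target_pdf(double x) {
    return std::exp(-10.0 * (x - 0.5) * (x - 0.5));
}

// Rejection sampling with a uniform proposal r(x) = 1/(b - a)
// on [a, b]. C must satisfy C * r(x) >= rho(x) for all x.
double rejection_sample(std::mt19937& gen, double a, double b, double C) {
    std::uniform_real_distribution<double> propose(a, b);
    std::uniform_real_distribution<double> height(0.0, 1.0);
    while (true) {
        double x = propose(gen);               // Step 1: draw from proposal
        double y = height(gen) * C / (b - a);  // Step 2: uniform on [0, C r(x)]
        if (y < target_pdf(x)) return x;       // Step 3: accept, else retry
    }
}
\end{verbatim}

\noindent For this target, $\texttt{rejection\_sample(gen, 0.0, 1.0, 1.0)}$ suffices, since the envelope $C r(x) = 1$ bounds the peak value $\rho(0.5) = 1$.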

\subsection{Direct Sampling}

Direct sampling, also known as inversion sampling or the cumulative distribution function (CDF) method, is an alternative to rejection sampling. It is often the more efficient method, but it places stricter requirements on the target distribution: as shown below, its CDF must be available and invertible.

For the single variable case, take $\xi$ to be a uniform random variable on the interval $[0, 1]$, and $x$ to be the random variable we want to sample (with PDF $\rho(x)$). The goal of direct sampling is to use the uniform random variable $\xi$, which can be easily sampled, to sample from the distribution underlying $x$. To do so, we need to find a transformation $T$ to apply to $\xi$ so that the resulting PDF of $x = T(\xi)$ matches $\rho(x)$.

To describe our requirement another way, consider some set of points drawn from the distribution underlying $\xi$ that lie in an infinitesimal interval $[\xi', \xi' + d\xi']$. Then applying $T$ to $\xi$ should redistribute these points in the interval $[x', x' + dx']$, where the probability of finding $x$ in that new interval (equal to the chance of finding $\xi$ in the old interval) is $\rho(x') dx'$. Since $\xi$ is uniform on $[0, 1]$, the probability of it falling within the original interval is $d\xi'$. Setting this equal to $\rho(x') dx'$:

\begin{equation} d\xi' = \rho(x') dx' \rightarrow \int_0^\xi d\xi' = \int_{x_{min}}^x \rho(x') dx' \end{equation}

\noindent where $x_{min}$ is the minimum value $x$ may take on. That gives:

\begin{equation} \xi = F(x) \end{equation}

\noindent or:

\begin{equation} x = F^{-1}(\xi) \end{equation}

\noindent where $F(x) = \int_{x_{min}}^x \rho(x') dx'$ is the CDF of $x$.

\vspace{0.75 cm}

\noindent {\bf Example:} Consider the power law distribution:

\begin{equation} \rho(\gamma) = A \gamma^{-p} \end{equation}

\noindent with $p > 1$, $\gamma_{min} \le \gamma < \infty$, and $A = (p - 1) / \gamma_{min}^{1 - p}$. Applying $\xi = F(\gamma)$ for some uniformly sampled $\xi$ value:

\begin{align} \xi &= A \left(\frac{\gamma^{1-p}}{1-p} - \frac{\gamma_{min}^{1-p}}{1-p}\right)\\ &= 1 - \left(\frac{\gamma}{\gamma_{min}}\right)^{1-p} \end{align}

\noindent giving for $\gamma$:

\begin{equation} \gamma = \gamma_{min} (1 - \xi)^{\frac{1}{1-p}} \end{equation}
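
A minimal sketch of this sampler (the function name $\texttt{sample\_power\_law}$ is our own):

\begin{verbatim}
#include <cmath>
#include <random>

// Sample gamma from rho(gamma) = A * gamma^(-p) with
// gamma >= gamma_min and p > 1, via the inverted CDF above.
double sample_power_law(std::mt19937& gen, double gamma_min, double p) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double xi = uniform(gen);
    return gamma_min * std::pow(1.0 - xi, 1.0 / (1.0 - p));
}
\end{verbatim}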

To extend to the multivariable case, we replace the CDF in $\xi = F(x)$ with the marginal CDF of the variable we wish to sample. The marginal CDF is obtained by promoting the integral $\int_{x_{min}}^x \rho(x') dx'$ to a multidimensional integral, in which every variable except the variable of interest is integrated over its full range. This is demonstrated in the example below.

\vspace{0.75 cm}

\noindent {\bf Example:} Consider emission in a hemispherical dome with direction characterized by the PDF:

\begin{equation} \rho(\theta, \phi) = \frac{\cos \theta}{\pi} \end{equation}

\noindent with $0 \le \theta \le \pi/2$, $0 \le \phi \le 2 \pi$. To sample $\theta$ from this distribution, we compute the multidimensional integral:

\begin{align} \int_{S_\theta} \rho(\theta', \phi') d\Omega &= \frac{1}{\pi} \int_0^{2\pi} \int_{0}^{\theta} \cos \theta' \sin \theta' \, d\theta' d\phi'\\ &= 1 - \cos^2 \theta \end{align}

\noindent If we set this equal to a uniformly generated $\xi_{\theta}$, we get for $\theta$:

\begin{equation} \theta = \cos^{-1}\left(\sqrt{1 - \xi_{\theta}}\right) \end{equation}

\noindent We can then do the same for $\phi$:

\begin{align} \int_{S_\phi} \rho(\theta', \phi') d\Omega &= \frac{1}{\pi} \int_0^{\phi} \int_{0}^{\frac{\pi}{2}} \cos \theta' \sin \theta' \, d\theta' d\phi'\\ &= \frac{\phi}{2 \pi} \end{align}

\noindent and so for some uniform $\xi_\phi$: \begin{equation} \phi = 2 \pi \xi_\phi \end{equation}
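
\noindent A minimal sketch of this two-variable sampler, using the two inverted marginal CDFs (the function name $\texttt{sample\_hemisphere}$ is our own):

\begin{verbatim}
#include <cmath>
#include <random>

// Sample an emission direction from rho(theta, phi) = cos(theta)/pi
// over the hemisphere, via the inverted marginal CDFs derived above.
void sample_hemisphere(std::mt19937& gen, double& theta, double& phi) {
    const double pi = 3.14159265358979323846;
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    theta = std::acos(std::sqrt(1.0 - uniform(gen))); // acos(sqrt(1 - xi_theta))
    phi   = 2.0 * pi * uniform(gen);                  // 2 pi xi_phi
}
\end{verbatim}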

\noindent This example concludes our discussion of sampling methods.

\end{document}

</latex>