===Short Topical Videos===

* [https://www.youtube.com/watch?v=M9hjdX97Z3s Monte Carlo Random Sampling (iman)]

===Reference Materials===

* [https://en.wikipedia.org/wiki/Monte_Carlo_method Monte Carlo Method (Wikipedia)]
* [https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo Markov Chain Monte Carlo (Wikipedia)]
<latex>

\documentclass[12pt]{article}

\usepackage{amsmath} \usepackage{amssymb}

\begin{document}

\section{ Monte Carlo Methods }

Monte Carlo methods are algorithms that use repeated random sampling to obtain numerical results. In this section, we will discuss how to do this sampling manually in Python and touch upon the more complicated subject of Markov Chain Monte Carlo algorithms.

\subsection*{ Monte Carlo Sampling }

Monte Carlo (MC) sampling is generally used when one wants to simulate a population of a continuous random variable that is described by some known probability distribution. Sure, there exist Python packages that make this easy (see: Random Sampling with Numpy). But what if you are dealing with a probability distribution that isn't supported by a Python package, or if you aren't using Python at all? Because of situations like these, it is useful to know how to do the sampling manually.
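
For reference, here is roughly what the package route looks like with NumPy's random generator (a minimal sketch; see NumPy's documentation for the full list of supported distributions):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng()              # NumPy's random generator
u = rng.uniform(0, 100, size=int(1e4))     # uniform samples on [0, 100)
g = rng.normal(0.0, 1.0, size=int(1e4))    # standard Gaussian samples
\end{verbatim}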

To do MC sampling manually, you must first compute the cumulative distribution corresponding to your probability distribution. The cumulative distribution function of a continuous random variable $X$ is given by:

$$ F_X(x) = A \int_{-\infty}^x f_X(t) dt $$

where $f_X(t)$ is the probability density function and $A$ is a normalization factor. It is necessary to normalize the probability density function because we want the maximum value of $F_X(x)$ (i.e., the sum of the probabilities of all possible outcomes) to be 1. Once you determine $A$ for your $f_X(t)$, you can rearrange the equation to solve for $x$ in terms of $F_X(x)$. Finally, you can simulate a population of $X$ by randomly drawing values of $F_X(x)$ between 0 and 1 and computing the corresponding values of $x$ (a procedure known as inverse transform sampling). Let's walk through a few examples to clarify.

\\[12pt]

$\textbf{Example \, 1:}$

First, we will imagine a scenario where our continuous random variable is uniformly distributed between 0 and 100. In this case, the unnormalized probability density function is simply:

$$ f(t) = 1 .$$

Solving for the normalization factor $A$, we get:

$$ A = \frac{1}{\int_{0}^{100} f(t) dt} = \frac{1}{100} .$$

Last, we can solve for $x$ using the equation above:

$$ F(x) = \frac{1}{100} \int_{0}^x f(t) dt = \frac{x}{100} \rightarrow x = 100 \, F(x) .$$

By randomly selecting multiple values of $F(x)$ between 0 and 1, we will obtain a population distributed uniformly between 0 and 100, just like the parent probability distribution.

\begin{verbatim}

import numpy as np
import matplotlib.pyplot as plt

# PDF and CDF
x_dist = np.linspace(0, 100, int(1e4))
f_dist = np.full_like(x_dist, 1)
F_dist = np.cumsum(f_dist)
F_dist /= max(F_dist)

# sampling procedure
def sample(F):
    x = 100*F
    return x

N = int(1e4)
F_samples = np.random.rand(N)
x_samples = sample(F_samples)

# plotting routine
plt.figure(figsize=(15,4))
ax1, ax2, ax3 = plt.subplot(131), plt.subplot(132), plt.subplot(133)
ax1.plot(x_dist, f_dist)
ax1.set_xlabel('t', fontsize=12)
ax1.set_ylabel('f(t)', fontsize=12)
ax2.plot(x_dist, F_dist)
ax2.set_xlabel('x', fontsize=12)
ax2.set_ylabel('F(x)', fontsize=12)
ax3.hist(x_samples, bins=np.linspace(0,100,25), edgecolor='black')
ax3.set_xlabel('x', fontsize=12)
ax3.set_ylabel('frequency', fontsize=12)
plt.tight_layout()
plt.show()

\end{verbatim}

\begin{center} MC1.png \end{center}

The plot on the left shows the (uniform) probability distribution. The plot in the center shows the cumulative distribution. The histogram on the right shows the population simulated with our sampling procedure for $10^4$ samples.

\\[12pt]

$\textbf{Example \, 2:}$

Next, let's do a more physical example. Say you want to simulate a cloud of hydrogen atoms by sampling their velocities. If we assume the atoms are collisional, we can say that the probability density function of their velocities is Maxwellian:

$$ f(v) = v^2 e^{-\frac{m v^2}{2 k T}} .$$

The normalization factor is thus:

$$ A = \frac{1}{\int_{0}^{\infty} f(v) dv} = \frac{(\frac{m}{k T})^{3/2}}{(\pi/2)^{1/2}} .$$

And so our sampling equation is:

$$ \int_{0}^{x} f(v) dv = \frac{(\pi/2)^{1/2}}{(\frac{m}{k T})^{3/2}} \, F(x) .$$

We cannot solve for $x$ directly here, but we can use some coding tricks to complete the sampling procedure.
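
In particular, integrating by parts gives the closed form

$$ \int_{0}^{x} v^2 e^{-\frac{m v^2}{2 k T}} dv = \left(\frac{\pi}{2}\right)^{1/2} \left(\frac{k T}{m}\right)^{3/2} \operatorname{erf}\left(x \sqrt{\frac{m}{2 k T}}\right) - \frac{k T \, x}{m} e^{-\frac{m x^2}{2 k T}} ,$$

which still cannot be inverted analytically. Instead, we can evaluate this expression on a dense grid of candidate $x$ values and, for each random draw of $F(x)$, select the $x$ whose left-hand side best matches the right-hand side: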

\begin{verbatim}

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import erf

# constants
m = 1e-24   # g
k = 1e-16   # erg/K
T = 1e2     # K

# PDF and CDF
v_dist = np.linspace(0, int(5e5), int(5e5))
f_dist = v_dist**2 * np.exp(-(m*v_dist**2)/(2*k*T))
F_dist = np.cumsum(f_dist)
F_dist /= max(F_dist)

# sampling procedure
# calculate LHS of the equation for many possible values of x,
# then select the x that gives LHS = RHS
LHS = ((np.pi/2)**0.5 * (k*T/m)**1.5 * erf(v_dist * (m/(2*k*T))**0.5)
       - (k*T*v_dist/m)*np.exp(-(m*v_dist**2)/(2*k*T)))

def sample(F):
    RHS = F * (np.pi/2)**0.5 * (m/(k*T))**(-1.5)
    idx = np.argmin(abs(LHS - RHS))
    x = v_dist[idx]
    return x

N = int(1e4)
F_samples = np.random.rand(N)
x_samples = np.zeros_like(F_samples)
for i in range(len(F_samples)):
    x_samples[i] = sample(F_samples[i])

# plotting routine
plt.figure(figsize=(15,4))
ax1, ax2, ax3 = plt.subplot(131), plt.subplot(132), plt.subplot(133)
ax1.plot(v_dist, f_dist)
ax1.set_xlabel('v (cm/s)', fontsize=12)
ax1.set_ylabel('f(v)', fontsize=12)
ax2.plot(v_dist, F_dist)
ax2.set_xlabel('x (cm/s)', fontsize=12)
ax2.set_ylabel('F(x)', fontsize=12)
ax3.hist(x_samples, bins=np.linspace(0,5e5,25), edgecolor='black')
ax3.set_xlabel('x (cm/s)', fontsize=12)
ax3.set_ylabel('frequency', fontsize=12)
plt.tight_layout()
plt.show()

\end{verbatim}

\begin{center} MC2.png \end{center}

The plot on the left shows the (Maxwellian) probability distribution. The plot in the center shows the cumulative distribution. The histogram on the right shows the population simulated with our sampling procedure for $10^4$ samples.

\subsection*{ Markov Chain Monte Carlo }

Markov Chain Monte Carlo (MCMC) algorithms are used to obtain random samples from probability distributions for which direct sampling is difficult (e.g., high-dimensional distributions). Briefly, MCMC algorithms work by placing ``walkers'' in the parameter space. At each iteration, every walker takes a random step, with the steps chosen such that the sampled distribution of each parameter converges to the target distribution for that parameter. One common application of MCMC is parameter estimation, which can be done by considering the mean and variance of each parameter's sample distribution after convergence is achieved. You will not need to use MCMC for C207, but it may come in handy in the future.
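
To make the walker picture concrete, below is a minimal sketch of the Metropolis algorithm, one of the simplest MCMC variants, using a single walker to sample a one-dimensional Gaussian (real applications use many walkers, more dimensions, and a longer burn-in period):

\begin{verbatim}
import numpy as np

# minimal Metropolis sampler: a single walker sampling a 1D standard
# Gaussian target, p(x) ~ exp(-x^2 / 2)
def log_prob(x):
    return -0.5 * x**2

rng = np.random.default_rng(0)
n_steps = int(1e5)
step_size = 1.0                 # width of the Gaussian proposal
chain = np.zeros(n_steps)
x = 0.0                         # starting position of the walker
for i in range(n_steps):
    x_new = x + step_size * rng.normal()    # propose a random step
    # accept with probability min(1, p(x_new)/p(x))
    if np.log(rng.uniform()) < log_prob(x_new) - log_prob(x):
        x = x_new
    chain[i] = x

# after discarding burn-in, the chain samples the target distribution
print(np.mean(chain[1000:]), np.std(chain[1000:]))   # ~0 and ~1
\end{verbatim}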

If you wish to learn more about how to implement MCMC in Python, check out PyMC3 and emcee.
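
As a taste of what that looks like, here is a minimal sketch based on emcee's documented interface (version 3), sampling a 2D Gaussian that stands in for a real posterior; consult the emcee documentation before using this in practice:

\begin{verbatim}
import numpy as np
import emcee

# log-probability of the target distribution: here a 2D standard
# Gaussian stands in for a real posterior
def log_prob(theta):
    return -0.5 * np.sum(theta**2)

ndim, nwalkers = 2, 32
p0 = np.random.randn(nwalkers, ndim)    # random starting positions
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 5000)

# discard burn-in and flatten the chains of all walkers
samples = sampler.get_chain(discard=500, flat=True)
print(samples.mean(axis=0), samples.std(axis=0))
\end{verbatim}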

\end{document}

</latex>