Short topical video
- Efron, Bradley; Tibshirani, Robert J. (1993). An introduction to the bootstrap
- Bootstrapping (wikipedia)
Bootstrap resampling is a statistical technique for estimating the error on a statistic that has been computed from a sample of a population. It is a simple yet powerful method that relies heavily on computational power. The basic premise is that instead of using a theoretical or mathematical model for the parent distribution from which our observed samples were drawn, we can use the distribution of the observed samples as an approximation to the parent distribution. This allows us to estimate the standard error and confidence intervals of an estimator without any knowledge of the shape of the parent distribution, which proves especially useful when the theoretical distribution of the statistic is complex or unknown.
Let’s say we observe $N$ data samples, denoted as $x = (x_1, x_2, \ldots, x_N)$, and we want to compute a statistic $\theta$. This statistic could be the mean or median of our samples, but could also be something much more complex. In measuring $\hat{\theta}$ from our data, we want to know how close our estimator is to the true value of $\theta$, so we need to compute an error estimate for $\hat{\theta}$. This can be done using the following bootstrap resampling algorithm:
- Make a bootstrap sample $x^*$ by sampling with replacement from the original $N$ data samples. This bootstrap sample should also be of length $N$ and may contain repetitions of the same data sample (since we sampled with replacement).
- Repeat this process to create $B$ bootstrap samples. Generally, $B$ is chosen to be large in order to reduce the amount of random scatter in the measurement of the bootstrap error.
- Compute the same desired statistic for each of the bootstrap samples, $\hat{\theta}^*_b$, where $b$ ranges from 1 to $B$. We will call the quantities $\hat{\theta}^*_b$ our bootstrap replications.
- From the bootstrap replications, compute the bootstrap variance of the measured value $\hat{\theta}$ as

$$\hat{\sigma}_B^2 = \frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^*_b - \bar{\theta}^* \right)^2,$$

where the mean of the bootstrap replications is given by

$$\bar{\theta}^* = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*_b.$$
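The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names `bootstrap_replications` and `bootstrap_std_error`, the toy Gaussian data, and the choice of 1000 resamples are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_replications(data, statistic, n_boot=1000, seed=None):
    """Return n_boot bootstrap replications of statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    replications = []
    for _ in range(n_boot):
        # Draw N values with replacement from the original sample.
        resample = rng.choices(data, k=n)
        replications.append(statistic(resample))
    return replications


def bootstrap_std_error(replications):
    """Square root of the bootstrap variance (1/(B-1) normalization)."""
    return statistics.stdev(replications)


# Toy data (an assumption for illustration): 200 draws from a unit normal.
data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(200)]

reps = bootstrap_replications(data, statistics.median, n_boot=1000, seed=0)
sigma_b = bootstrap_std_error(reps)
print(f"median = {statistics.median(data):.3f} +/- {sigma_b:.3f}")
```

Any function that maps a list of values to a number can be passed as `statistic`, which is the main appeal of the method: the error estimate for a median costs no more thought than the one for a mean.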
Sampling with replacement works because we are approximating the true parent distribution of our sample with the actual sample values. However, the method essentially swaps statistically independent samples for values that are correlated with each other (i.e., the same value can appear more than once). Thus, the resampling algorithm cannot reproduce the noise in the original data exactly, but in the large-$N$ limit, the resampled error estimate $\hat{\sigma}_B$ will asymptote to its true value. It is also important to note that sampling with replacement relies on the assumption that the $N$ samples are statistically independent and are each drawn from the same parent distribution. In the case of dependent data, more sophisticated resampling techniques are needed.
In the limit as $N \to \infty$, the distribution of the bootstrap replications will asymptote to a normal (Gaussian) distribution. So, in the limit of large $N$, the standard deviation measured from the distribution of bootstrap replications, $\hat{\sigma}_B$, can be treated as the standard deviation of a normal distribution. In particular, $\hat{\theta} \pm \hat{\sigma}_B$ will mark the 68.3% confidence interval on the measurement of $\hat{\theta}$, as is usual for confidence intervals of a normal distribution. However, in the limit of small $N$, the distribution of the bootstrap replications does not have to resemble a normal distribution, and it likely will not. In this case, $\hat{\sigma}_B$ cannot be interpreted as marking the 68.3% confidence interval. Instead, the cumulative distribution function (CDF) of the bootstrap replications can be used to measure confidence intervals. For example, for a 90% confidence interval, the CDF can be used to find the bounds $\theta_{\rm lo}$ and $\theta_{\rm hi}$ such that 5% of the bootstrap replications are below $\theta_{\rm lo}$ and 5% are greater than $\theta_{\rm hi}$.

This method of computing confidence intervals using bootstrap resampling is the most basic. It has been shown that, on average, these simple confidence intervals will be slightly too narrow. Newer and better methods have been developed that produce correct results over a broader range of problems. These methods are beyond the scope of this description, but for those interested, one of the main algorithms used is known as the bias-corrected and accelerated (BCa) algorithm.
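The basic CDF-based interval described here can be sketched as follows. This is an illustrative example under stated assumptions, not a definitive implementation: the helper name `percentile_interval`, the small exponential toy sample, and the 2000 resamples are choices made for the example, and the bounds are read off with the standard library's `statistics.quantiles` rather than any bias-corrected method.

```python
import random
import statistics


def percentile_interval(replications, lower_pct=5, upper_pct=95):
    """Read confidence bounds off the empirical CDF of the replications.

    lower_pct and upper_pct are integer percentiles between 1 and 99.
    """
    # quantiles(..., n=100) returns the 99 cut points at 1%, 2%, ..., 99%.
    cuts = statistics.quantiles(replications, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


rng = random.Random(1)
# Small, skewed toy sample (an assumption): 15 draws from an exponential.
data = [rng.expovariate(1.0) for _ in range(15)]

# Bootstrap replications of the mean from 2000 resamples.
reps = [statistics.mean(rng.choices(data, k=len(data))) for _ in range(2000)]

lo, hi = percentile_interval(reps)
print(f"mean = {statistics.mean(data):.3f}, 90% interval = [{lo:.3f}, {hi:.3f}]")
```

For a skewed sample like this one, the two bounds generally do not sit at equal distances from the measured mean, which is exactly the information a single symmetric error bar would hide.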
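Assessing the error in $\hat{\sigma}_B$
In general, there will be two sources of error associated with the measurement of $\hat{\sigma}_B$ using the bootstrap resampling algorithm. The first source is sampling variability, caused by the fact that we are sampling the parent distribution with a finite number of samples, denoted by $N$. The second source is resampling variability, caused by the fact that we only take $B$ resamples and not an infinite number. It can be shown that the variance of $\hat{\sigma}_B$ will have two terms,

$$\mathrm{var}(\hat{\sigma}_B) \simeq \frac{c_1}{N^2} + \frac{c_2}{N B},$$

where $c_1$ and $c_2$ are constants that depend on the parent distribution but not on $N$ or $B$. The first term reflects the sampling variability and the second the resampling variability.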
In the limit of small sample size $N$, the variance of the bootstrap error will be larger, due simply to sampling variability. We can reduce the resampling variability by choosing a large $B$. In practice, $B$ should be as large as time and computational constraints allow.
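The effect of the number of resamples can be checked numerically. The sketch below is an illustration under stated assumptions: it holds one toy data set fixed, repeats the whole bootstrap 30 times for a small and a large number of resamples, and compares the scatter of the resulting error estimates. The helper name `bootstrap_sigma`, the sample size of 50, and the values 20 and 500 are arbitrary choices for the example. Because the data are held fixed, it probes only the resampling variability, not the sampling variability.

```python
import random
import statistics


def bootstrap_sigma(data, n_boot, rng):
    """Bootstrap standard error of the mean from n_boot resamples."""
    reps = [
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_boot)
    ]
    return statistics.stdev(reps)


rng = random.Random(7)
# Fixed toy sample (an assumption): 50 draws from a unit normal.
data = [rng.gauss(0.0, 1.0) for _ in range(50)]

scatter = {}
for n_boot in (20, 500):
    # Repeat the whole bootstrap 30 times on the same data.
    sigmas = [bootstrap_sigma(data, n_boot, rng) for _ in range(30)]
    scatter[n_boot] = statistics.stdev(sigmas)
    print(f"resamples = {n_boot:4d}: scatter in error estimate = {scatter[n_boot]:.4f}")
```

The scatter should be visibly smaller for the larger number of resamples, while no number of resamples can remove the uncertainty that comes from the finite size of the original sample.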