Short topical video
- Efron, Bradley; Tibshirani, Robert J. (1993). An introduction to the bootstrap
- Bootstrapping (Wikipedia)
Bootstrap resampling is a statistical technique for estimating the error in a statistic that has been computed from a sample of a population. It is a simple yet powerful method that relies heavily on computational power. The basic premise is that instead of using a theoretical or mathematical model for the parent distribution from which our observed samples were drawn, we can use the distribution of the observed samples themselves as an approximation to the parent distribution.
Let’s say we observe $N$ data samples, denoted as $\mathbf{x} = (x_1, x_2, \ldots, x_N)$, and we want to compute a statistic $s$. This statistic could be the mean or median of our samples, but could also be something much more complex. In measuring $\hat{s} = s(\mathbf{x})$ from our data, we want to know how close our estimator is to the true value of $s$, so we need to compute an error estimate for $\hat{s}$. This can be done using the following bootstrap resampling algorithm:
- Make a bootstrap sample $\mathbf{x}^*$ by sampling with replacement from the original data samples. This bootstrap sample should also be of length $N$ and may contain repetitions of the same data sample (since we sampled with replacement).
- Repeat this process and create $B$ bootstrap samples. Generally, $B$ is chosen to be large, in order to reduce the amount of random scatter in the measurement of the bootstrap error.
- Compute the same desired statistic for each of the bootstrap samples, $\hat{s}^*_b = s(\mathbf{x}^*_b)$, where $b$ ranges from 1 to $B$. We will call the quantities $\hat{s}^*_b$ our bootstrap replications.
- From the bootstrap replications, compute the standard deviation as

  $$\sigma_{\mathrm{boot}} = \sqrt{\frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{s}^*_b - \bar{s}^* \right)^2}$$

  where the mean of the bootstrap replications is given by

  $$\bar{s}^* = \frac{1}{B} \sum_{b=1}^{B} \hat{s}^*_b$$
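The steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than a reference implementation: the function name `bootstrap_error`, its parameters, and the default of 1000 bootstrap samples are assumptions made for this example and do not come from the text above.

```python
import numpy as np


def bootstrap_error(data, statistic=np.mean, n_boot=1000, seed=None):
    """Estimate the error of `statistic` on `data` by bootstrap resampling.

    Hypothetical helper for illustration. `n_boot` is the number of
    bootstrap samples; the default of 1000 is an arbitrary choice.
    Returns the bootstrap standard deviation and the replications.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]

    replications = np.empty(n_boot)
    for b in range(n_boot):
        # Draw n indices with replacement, so the bootstrap sample has
        # the same length as the data and may repeat entries.
        idx = rng.integers(0, n, size=n)
        # Compute the same statistic on the bootstrap sample.
        replications[b] = statistic(data[idx])

    # Sample standard deviation of the replications (1 / (n_boot - 1)).
    return replications.std(ddof=1), replications


# Usage: bootstrap error on the median of some synthetic data.
example_rng = np.random.default_rng(0)
samples = example_rng.normal(loc=5.0, scale=2.0, size=200)
median_error, median_reps = bootstrap_error(samples, statistic=np.median, seed=1)
print(f"median = {np.median(samples):.3f} +/- {median_error:.3f}")
```

For the sample mean, the result should land close to the familiar analytic estimate, the sample standard deviation divided by the square root of the number of samples, which makes a convenient sanity check before applying the same function to statistics that have no simple error formula.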