# Central Limit Theorem

Updated: Jan 16

## Introduction: What is the Central Limit Theorem?

The Central Limit Theorem (CLT) is a statistical theorem that states that for a large enough sample size, the distribution of the sample mean will be approximately normally distributed, regardless of the underlying distribution of the population from which the sample is drawn.

This means that as the sample size increases, the distribution of the sample mean gets closer and closer to a normal distribution, even if the population itself is not normally distributed.

The theorem is important because it allows statisticians to make predictions and inferences about a population based on a sample, even if the population is not normally distributed.

This is useful in many fields, such as finance, engineering, and social sciences, where data is often not normally distributed.

The Central Limit Theorem rests on a few assumptions: the sample size should be large enough, the observations should be independent and identically distributed (the i.i.d. assumption), and the population should have a finite mean and a finite variance.

In short, the Central Limit Theorem is a fundamental principle of statistics that allows analysts to make predictions and inferences about a population based on a sample, even if the population is not normally distributed, as long as the sample size is large enough and the samples are independent and identically distributed.

## Central Limit Theorem Explained with Example

An example of the Central Limit Theorem (CLT) in action is as follows:

Suppose we have a population of 1000 randomly generated numbers, uniformly distributed between 0 and 1, and we want to know its mean and standard deviation. Imagine that we cannot examine the whole population at once, so instead we take a sample of size 30 and calculate the mean and standard deviation of that sample. We repeat this process 10,000 times and plot the distribution of the sample means.

As we repeat the process, we would expect to see that the sample means are approximately normally distributed, with a mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size. For a uniform distribution between 0 and 1, the mean is 0.5 and the standard deviation is about 0.289, so with a sample size of 30 the sample means should cluster around 0.5 with a standard deviation of roughly 0.053. Furthermore, as we increase the sample size, the distribution of the sample mean will become more and more like a normal distribution, even though the population itself is not normally distributed.
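The experiment described above can be sketched in a few lines of Python. This is a minimal sketch that assumes NumPy is available; the seed, the variable names, and the choice to sample with replacement (so that the draws are independent) are assumptions made for illustration and are not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

# Population: 1000 numbers drawn uniformly from [0, 1)
population = rng.uniform(0.0, 1.0, size=1000)
pop_mean = population.mean()
pop_std = population.std()  # population standard deviation (ddof=0)

sample_size = 30
n_repeats = 10_000

# Each row is one sample of size 30, drawn with replacement
samples = rng.choice(population, size=(n_repeats, sample_size), replace=True)
sample_means = samples.mean(axis=1)

# Standard deviation of the sample mean predicted by sigma / sqrt(n)
predicted_se = pop_std / np.sqrt(sample_size)

print(f"population mean      : {pop_mean:.4f}")
print(f"mean of sample means : {sample_means.mean():.4f}")
print(f"predicted std error  : {predicted_se:.4f}")
print(f"std of sample means  : {sample_means.std():.4f}")
```

The first two printed values should be close to each other, and so should the last two. Plotting a histogram of `sample_means` (for example with matplotlib) should show the familiar bell shape.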

This is an example of how the Central Limit Theorem allows us to make predictions and inferences about a population based on a sample, even if the population is not normally distributed, as long as the sample size is large enough.

It's worth noting that in practice, the population parameters (mean, standard deviation) are often unknown and need to be estimated from the sample.

## Central Limit Theorem Formula

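The Central Limit Theorem (CLT) is a statement about convergence rather than a formula for computing a quantity. It says that the standardized sample mean, that is, (sample mean - population mean) / (population standard deviation / square root of sample size), approaches a standard normal distribution as the sample size increases.

The formula most often used alongside the CLT is the standard deviation of the sample mean, also known as the standard error of the mean. It follows from the independence of the observations and is given by: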
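For readers who prefer symbols, the classical (Lindeberg-Lévy) form of the theorem can be written compactly. The notation here is introduced for illustration only: X_1, ..., X_n are i.i.d. observations with mean μ and finite variance σ² > 0, and the arrow denotes convergence in distribution.

```latex
\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i,
\qquad
\frac{\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)}{\sigma}
\;\xrightarrow{\;d\;}\;
\mathcal{N}(0, 1)
\quad \text{as } n \to \infty .
```

Informally, for large n the sample mean behaves approximately like a normal random variable with mean μ and standard deviation σ / √n.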

Standard error of the mean = (standard deviation of population) / (square root of sample size)

This formula assumes that the population standard deviation is known, but in practice it is often unknown and needs to be estimated from the sample. In this case, the sample standard deviation is used as an estimate.
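When the population standard deviation is unknown, the estimate described above can be computed as follows. This is a small sketch assuming NumPy; the function name `standard_error` and the measurement values are hypothetical and chosen only for illustration.

```python
import numpy as np

def standard_error(sample):
    """Estimate the standard error of the mean from a single sample."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # ddof=1 gives the sample standard deviation (n - 1 in the denominator)
    sample_sd = sample.std(ddof=1)
    return sample_sd / np.sqrt(n)

# Hypothetical measurements, for illustration only
measurements = [4.1, 5.3, 4.8, 5.0, 4.6, 5.2, 4.9, 5.1, 4.7, 5.3]
print(f"sample mean    : {np.mean(measurements):.3f}")
print(f"standard error : {standard_error(measurements):.3f}")
```

Using `ddof=1` is the usual choice when the standard deviation is estimated from a sample rather than computed over a full population.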

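Additionally, it is important to note that the CLT is a limiting result, so the normal approximation improves as the sample size grows. A sample size of 30 or more is a common rule of thumb for the approximation to be adequate, but strongly skewed or heavy-tailed populations may need considerably larger samples.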
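One way to see how the sample size matters is to track the skewness of the sample means for a skewed population; a normal distribution has skewness 0. The sketch below assumes NumPy and uses an exponential population, which is a choice made here for illustration and is not taken from the text above. The helper `skewness` is likewise a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for reproducibility

def skewness(x):
    """Third standardized moment of the values in x."""
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

n_repeats = 20_000
skew_by_n = {}
for n in (5, 30, 200):
    # Each row is one sample of size n from an exponential population
    means = rng.exponential(scale=1.0, size=(n_repeats, n)).mean(axis=1)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:>3}: skewness of sample means = {skew_by_n[n]:.3f}")
```

For this population the skewness of the sample mean is 2 / √n in theory, so it shrinks toward 0 as n grows but is still noticeable at n = 30.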

In short, the Central Limit Theorem describes the behavior of the sample mean rather than providing a formula for direct calculation. Together with the standard error of the mean, it is a useful tool for making predictions and inferences about a population based on a sample.

## Central Limit Theorem Proof

As stated in the introduction, the Central Limit Theorem (CLT) says that for a large enough sample size, the distribution of the sample mean is approximately normal, regardless of the underlying distribution of the population.

Proofs of the CLT rely on tools from probability theory, most commonly characteristic functions and moment-generating functions.

One way to prove the CLT is to use the characteristic function, which is a function that uniquely determines the probability distribution of a random variable. Because the observations are independent, the characteristic function of their sum is the nth power of the characteristic function of a single observation, where n is the sample size; rescaling the argument then gives the characteristic function of the standardized sample mean. If the population has a finite mean and a finite variance, this characteristic function converges to that of the standard normal distribution as the sample size increases, which implies convergence in distribution.
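The key step of the characteristic-function argument can be sketched as follows. The symbols are introduced here for illustration: X_1, ..., X_n are i.i.d. with mean μ and finite variance σ² > 0, Y_i is the standardized observation, and φ denotes a characteristic function. This is an outline, not a complete proof.

```latex
\begin{aligned}
Z_n &= \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma}
     = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Y_i,
     \qquad Y_i = \frac{X_i - \mu}{\sigma}, \\
\varphi_{Z_n}(t)
    &= \left[ \varphi_Y\!\left( \frac{t}{\sqrt{n}} \right) \right]^{n}
     = \left[ 1 - \frac{t^{2}}{2n} + o\!\left( \frac{1}{n} \right) \right]^{n}
     \longrightarrow e^{-t^{2}/2}
     \quad (n \to \infty).
\end{aligned}
```

The expansion uses E[Y] = 0 and E[Y²] = 1, and the limit is the characteristic function of the standard normal distribution. Lévy's continuity theorem then turns this pointwise convergence into convergence in distribution.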

Another way to prove the CLT is by using the moment-generating function, which is the expected value of e raised to t times the random variable, and whose derivatives at zero give the moments of the distribution. As with characteristic functions, the moment-generating function of a sum of n independent observations is the nth power of the moment-generating function of a single observation, where n is the sample size. If the moment-generating function exists in a neighborhood of zero (a stronger requirement than a finite mean and a finite variance), the moment-generating function of the standardized sample mean converges to that of the standard normal distribution as the sample size increases.

These proofs are quite technical and require familiarity with advanced concepts such as characteristic functions and moment-generating functions.

In summary, the Central Limit Theorem states that as the sample size increases, the distribution of the sample mean becomes more and more like a normal distribution, even if the population itself is not normally distributed. The theorem is important because it allows statisticians to make predictions and inferences about a population based on a sample, even if the population is not normally distributed. The proof relies on the properties of characteristic functions or moment-generating functions: if the population has a finite mean and a finite variance, the characteristic function of the standardized sample mean converges to that of a normal distribution as the sample size increases.