Roman Klimenko

Central limit theorem

June 14, 2018

Tags: statistics, probability, normal-distribution, data

In probability theory, the central limit theorem (CLT) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a “bell curve”) even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applicable to many problems involving other types of distributions.

Wikipedia

To grok this, let’s generate a sample of 1,000,000 random numbers from 0 to 100 and draw a chart where the x-axis shows the random number and the y-axis shows how many times that number occurs in our sample:

[Figure "clt 0": occurrence counts for single random numbers]

The numbers look more or less uniformly distributed: each value occurs about equally often.
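Here is a minimal sketch of how such a sample could be generated and counted. It is not the code behind the chart above: it assumes that random() returns integers from 0 to 100 inclusive, and the helper name uniform_counts and the fixed seed are made up for illustration.

```python
import random
from collections import Counter

def uniform_counts(n_samples=1_000_000, low=0, high=100, seed=42):
    """Count how often each integer in [low, high] occurs among n_samples draws."""
    rng = random.Random(seed)  # seeded only so that the run is reproducible
    return Counter(rng.randint(low, high) for _ in range(n_samples))

counts = uniform_counts()
expected = 1_000_000 / 101  # 101 possible values, each equally likely

print(f"expected count per value: {expected:.0f}")
print(f"smallest count: {min(counts.values())}, largest count: {max(counts.values())}")
```

If the distribution is uniform, the smallest and largest counts should both sit close to the expected count per value, which is the flat line in the chart.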

Now let’s generate another sample with the same random() function, but this time each number is the sum of two random results, random() + random():

[Figure "clt 1": occurrence counts for random() + random()]
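The same experiment for sums can be sketched like this. Again, this is an illustration under assumptions rather than the original code: random() is taken to return integers from 0 to 100, the helper sum_counts is hypothetical, and the sample is smaller (200,000) to keep the run short.

```python
import random
from collections import Counter

def sum_counts(k, n_samples=200_000, low=0, high=100, seed=42):
    """Count occurrences of the sum of k independent uniform integers."""
    rng = random.Random(seed)  # seeded only so that the run is reproducible
    return Counter(
        sum(rng.randint(low, high) for _ in range(k))
        for _ in range(n_samples)
    )

two = sum_counts(2)

# A sum near the middle (100) can be formed by many pairs,
# while the extremes (0 and 200) can each be formed by only one pair.
for value in (0, 50, 100, 150, 200):
    print(value, two[value])
```

The counts should rise toward 100 and fall off on both sides, which is the shape in the chart. Passing a larger k, for example sum_counts(3) or sum_counts(5), adds more random numbers to each sum.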

This looks like an angle: a triangle with its peak in the middle. Let’s add one more random number to each sum, random() + random() + random():

[Figure "clt 2": occurrence counts for random() + random() + random()]

Now let’s sum five random numbers:

[Figure "clt 3": occurrence counts for the sum of five random numbers]
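One rough way to check how "bell-shaped" the sum of five numbers already is: for a normal distribution, about 68% of the values fall within one standard deviation of the mean. The sketch below measures that share for the simulated sums. It assumes random() returns integers from 0 to 100, and the helper sums_of_k is a hypothetical name used only here.

```python
import math
import random

def sums_of_k(k, n_samples=200_000, low=0, high=100, seed=42):
    """Return n_samples sums, each built from k independent uniform integers."""
    rng = random.Random(seed)  # seeded only so that the run is reproducible
    return [sum(rng.randint(low, high) for _ in range(k)) for _ in range(n_samples)]

sums = sums_of_k(5)

# Sample mean and (population) standard deviation of the sums.
mean = sum(sums) / len(sums)
stdev = math.sqrt(sum((s - mean) ** 2 for s in sums) / len(sums))

# Share of sums that lie within one standard deviation of the mean.
within_one = sum(abs(s - mean) <= stdev for s in sums) / len(sums)

print(f"mean={mean:.1f}, stdev={stdev:.1f}, within one stdev={within_one:.3f}")
```

The mean should land near 250 (five times the average of 50), and the share within one standard deviation should be close to, though not exactly, the 68% of a true normal distribution.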