Testing the central limit theorem
I wanted to test that the mean over a large number of samples can be modeled adequately with a normal distribution.
Further, I wanted to see if the theorem held up when the original distribution was heavily skewed.
To investigate:
means <- numeric(10000)  # somewhere to store the means
# Draw 10000 samples of 10000 binomial values each and store each sample's mean.
# Note I use prob=.1, which makes the underlying distribution heavily skewed.
for (i in 1:10000) means[i] <- mean(rbinom(n=10000, prob=.1, size=5))
h <- hist(means)  # plot the distribution of the means
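As a rough numeric check alongside the histogram, the simulated mean and standard deviation of the sample means can be compared with what the central limit theorem predicts: a mean of size*prob = 0.5 and a standard deviation of sqrt(size*prob*(1-prob)/n), about 0.0067 here. The sketch below is only illustrative. `clt_check` is a hypothetical helper name, the seed is arbitrary, and the example call uses 1000 replications instead of 10000 purely to keep it quick.

```r
# Sketch: compare simulated moments of the sample means with the CLT prediction.
# clt_check is a hypothetical helper, not part of base R.
clt_check <- function(n_reps, n_obs, size, prob) {
  # each replicate draws n_obs binomial values and keeps their mean
  means <- replicate(n_reps, mean(rbinom(n = n_obs, size = size, prob = prob)))
  list(
    sim_mean    = mean(means),
    sim_sd      = sd(means),
    theory_mean = size * prob,                               # binomial mean
    theory_sd   = sqrt(size * prob * (1 - prob) / n_obs)     # sd of the mean of n_obs draws
  )
}

set.seed(1)  # assumed seed, only for reproducibility
res <- clt_check(n_reps = 1000, n_obs = 10000, size = 5, prob = 0.1)
print(unlist(res))
```

If the simulated and theoretical values are close, the centre and spread of the histogram are behaving as the theorem says they should.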
Now let's overlay the idealised normal curve.
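x <- seq(min(means), max(means), length.out=500)  # x values spanning range(means)
binwidth <- diff(h$breaks)[1]  # bin width taken from the histogram object h (.005 in my run)
y <- dnorm(x, mean=mean(means), sd=sd(means)) * binwidth * length(means)  # scale the density by the bin width and the number of means
lines(x, y)  # draw the curve on top of the histogram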
Normal approximation looks pretty good.
And the probability plot looks like this:
plot(sort(means), qnorm(ppoints(10000)))  # ppoints() avoids a probability of exactly 1, where qnorm() returns Inf
Again looks very good.
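To put rough numbers on "looks very good", one option is to compare the skewness of the raw binomial draws with the skewness of the means, and to compute the correlation between the sorted means and the normal quantiles used in the probability plot. This is a sketch under my own assumptions: `skewness` is a hypothetical helper defined here (base R has none), the seed is arbitrary, and 2000 replications are used instead of 10000 to keep it fast.

```r
set.seed(42)  # assumed seed, only for reproducibility

# Sample skewness: third central moment over the 1.5 power of the second.
# Hypothetical helper; not part of base R.
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Raw draws from the skewed parent distribution.
raw <- rbinom(n = 100000, size = 5, prob = 0.1)

# Means of 10000 draws each (fewer replications than in the post, for speed).
means <- replicate(2000, mean(rbinom(n = 10000, size = 5, prob = 0.1)))

skew_raw   <- skewness(raw)    # theoretical value is (1 - 2p) / sqrt(np(1 - p)), about 1.19
skew_means <- skewness(means)  # expected to be close to 0

# Correlation between the ordered means and the matching normal quantiles;
# values near 1 correspond to a nearly straight probability plot.
ppcc <- cor(sort(means), qnorm(ppoints(length(means))))

print(c(skew_raw = skew_raw, skew_means = skew_means, ppcc = ppcc))
```

The raw draws should show clear positive skew while the means show almost none, which is the effect the plots above illustrate.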