Testing the law of large numbers in R
I've been taking an online course in R programming. The long-term hope is that learning R will make me more competitive in the job market than my peers. In the short term, though, I've been learning a lot. One of the homework assignments was to test the law of large numbers with a program I coded myself. It worked, so I thought I'd share my results.
But first:
What is the law of large numbers?
The law of large numbers is a fairly simple idea. When you take a sample from a large group, there are two means to think about: the sample mean, calculated from your survey or sample, and the true mean of the whole population. As you increase the sample size, the sample mean tends to get closer to the true population mean.
In other words, adding more people or data to your sample makes it more likely that your result is close to the truth (assuming the people are randomly selected).
Formula - Picture from Wikipedia
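The picture itself isn't reproduced here. As a stand-in, and assuming it showed the usual statement, the law is commonly written like this, where X_1, ..., X_n are independent random draws with true mean \mu (the symbols are my choice and may not match the image exactly):

```latex
\overline{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\overline{X}_n \to \mu \quad \text{as } n \to \infty
```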
But the challenge I was given was testing this idea in R.
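The way I accomplished it was by using two variables, a for loop, and an if statement. The program draws N random numbers from a standard normal distribution and works out what share of them land between -1 and 1.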
I used the following code:
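N <- 10                      # sample size
counter <- 0                 # number of draws that fall between -1 and 1
for (x in rnorm(N)) {        # draw N values from a standard normal
  if (x > -1 && x < 1) {     # && is the scalar "and", the right one inside if()
    counter <- counter + 1
  }
}
answer <- counter / N        # share of draws within one standard deviation of 0
answer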
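The same count can also be written without an explicit loop. This is only a minimal sketch, assuming a vectorized comparison is acceptable for the assignment. The set.seed call and the variable name draws are my additions, there so the run is repeatable.

```r
set.seed(42)            # assumption: fixed seed so the result is repeatable
N <- 10
draws <- rnorm(N)       # N draws from a standard normal

# abs(draws) < 1 is TRUE when a draw lies strictly between -1 and 1;
# mean() of a logical vector gives the proportion of TRUE values
answer <- mean(abs(draws) < 1)
answer
```

It should give the same number as the loop for the same draws, since both just count how many values land between -1 and 1 and divide by N.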
Then all you have to do is change the value of N to increase or decrease the sample size.
Here are the results for various values of N. The smaller the value of N, the greater the variation. Since rnorm draws random numbers, your answers will differ slightly from mine. Try it in your own RStudio.
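If changing N by hand gets tedious, the experiment can be wrapped in a function and run over several sample sizes at once. This is a rough sketch rather than what the assignment asked for: the function name proportion_within_one_sd, the seed, and the list of sample sizes are all my own choices.

```r
# Hypothetical helper (not part of the original assignment): the share of
# n standard normal draws that fall strictly between -1 and 1
proportion_within_one_sd <- function(n) {
  mean(abs(rnorm(n)) < 1)
}

set.seed(1)  # assumption: any seed works; it only makes the run repeatable
sample_sizes <- c(10, 100, 1000, 10000, 1000000)
results <- sapply(sample_sizes, proportion_within_one_sd)

# One row per sample size
data.frame(N = sample_sizes, answer = results)
```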
N is 10, answer is 0.5
N is 100, answer is 0.63
N is 1000, answer is 0.704
N is 10000, answer is 0.684
N is 1000000, answer is 0.683169
N is 10000000, answer is 0.682751
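So, as we can see, the answer settles down to about 0.683 as N grows. That is the true value here: the probability that a standard normal draw lands within one standard deviation of the mean. The answer is really a sample mean too, since it averages a 1 for every draw inside the range and a 0 for every draw outside it.
We can continue to increase the value of N, but I think we see the point here. It was a fun little program to write!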
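You can also check the target value directly instead of reading it off the simulation. A small sketch, assuming the draws come from rnorm with its default mean 0 and standard deviation 1 (the names true_value and estimate are mine):

```r
# Exact probability that a standard normal value lies between -1 and 1
true_value <- pnorm(1) - pnorm(-1)
true_value   # about 0.6827

set.seed(123)  # assumption: fixed seed for a repeatable run
estimate <- mean(abs(rnorm(1000000)) < 1)

# Gap between the simulated share and the exact probability
abs(estimate - true_value)
```

With a million draws the gap should be tiny, which is the law of large numbers doing its job.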
You aren't in my statistics course, are you? :p
Just talked about this at university today. Our group leader recommended using R too. I hadn't thought about the possibility of having a small advantage in the job market if I know how to use it. Should look into this, thx :)
(for my own interest: more statistics articles please! :D will follow you in hope :p)
haha, I'm not sure! I might be! :D
R does a great job at stats and data analysis. General-purpose programming in it seems to be a little difficult. But I've heard that the best combination of skills is Python, MySQL, and R for stat heads/economics people. That's more anecdotal, though.
You should really look into learning a programming language, even Stata, as jobs will certainly be asking you about it.
And I plan to! I've really enjoyed it so far.
Actually, I'm studying computer science (combined with psychology) , so I already speak some computer languages :p
Cool thing, good job :) keep it up