  • Monte Carlo Simulation

    Hi all,

    I want to run a Monte Carlo simulation for an econometrics assignment. From what I can tell, I need to design an experiment with sample sizes n ∈ {10, 100, 1000, 10000}, each with R = 1000 replications. For each sample size, I need to calculate the 25th, 50th, and 75th percentile values. I have some Stata code for this experiment, listed below. My question is: how can I add a loop to this program so that the number of observations n changes?

    clear

    local mc = 1000
    set seed 368
    set obs `mc'
    gen data_store_x = .
    gen data_store_cons = .
    quietly {
    forvalues i = 1(1) `mc' {
    if floor((`i'-1)/100) == (`i' -1)/100 {
    noisily display "Working on `i' out of `mc at $S_TIME'"
    }
    preserve

    clear

    set obs = 1000

    gen x = rnormal() *3 + 6

    gen e = runiform() - 0.5

    gen y = 3 + 4*x + e

    reg y x, robust

    local xcoef = _b[x]
    local const = _b[_cons]

    restore

    replace data_store_x = `xcoef' in `i'
    replace data_store_cons = `const' in `i'
    }
    }
    summ data_store_x data_store_cons

  • #2
    You're really almost there. The modifications needed are minor:
    Code:
    clear
    
    local mc = 1000
    set seed 368
    set obs `mc'
    gen data_n = .
    gen data_store_x = .
    gen data_store_cons = .
    foreach n of numlist 10 100 1000 10000 {
        quietly {
            forvalues i = 1(1) `mc' {
                if floor((`i'-1)/100) == (`i' -1)/100 {
                    noisily display "Working on `i' out of `mc' at $S_TIME" // CHANGE HERE IS CORRECTION OF ERROR IN ORIGINAL, NOT MODIFICATION TO ITERATE N.
                }
                preserve

                clear

                set obs `n'

                gen x = rnormal() *3 + 6

                gen e = runiform() - 0.5

                gen y = 3 + 4*x + e

                reg y x, robust

                local xcoef = _b[x]
                local const = _b[_cons]

                restore

                replace data_n = `n' in `i'
                replace data_store_x = `xcoef' in `i'
                replace data_store_cons = `const' in `i'
            }
        }
        summ data_n data_store_x data_store_cons
    }
    In the future when posting code here, please use code delimiters, so that it displays readably. It took me longer to fix up the formatting of your code so I could read what you are doing than to write and run the solution to your question. If you are not familiar with code delimiters, see Forum FAQ #12.
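
    The original question also asks for the 25th, 50th, and 75th percentiles, which -summ- without options does not report. A minimal sketch of one way to get them, assuming -tabstat- is an acceptable replacement for the -summ- line (the -tabstat- call and the -display- header are illustrative additions, not part of the code above; the progress messages are dropped for brevity):

    ```stata
    clear
    set seed 368
    local mc = 1000
    set obs `mc'
    gen data_store_x = .
    gen data_store_cons = .

    foreach n of numlist 10 100 1000 10000 {
        quietly {
            forvalues i = 1/`mc' {
                preserve
                clear
                set obs `n'
                gen x = rnormal()*3 + 6
                gen e = runiform() - 0.5
                gen y = 3 + 4*x + e
                reg y x, robust
                local xcoef = _b[x]
                local const = _b[_cons]
                restore
                replace data_store_x = `xcoef' in `i'
                replace data_store_cons = `const' in `i'
            }
        }
        * Report the quartiles of the 1000 stored estimates for this sample size
        display _newline "Sample size n = `n'"
        tabstat data_store_x data_store_cons, statistics(p25 p50 p75) columns(statistics)
    }
    ```

    -summarize, detail- or -centile, centile(25 50 75)- would report the same percentiles; -tabstat- is used here only because its output is compact.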
    Last edited by Clyde Schechter; 29 Sep 2022, 18:42.


    • #3
      Thank you Clyde, I will have to look at the FAQ, as I am new to Stata and this forum. Just a follow-up question, if I may: does the "foreach" command make the interior loop run once for each of the listed sample sizes, substituting each value for n in turn?


      • #4
        Yes, that's exactly what -foreach- does. Do read -help foreach- and the manual section linked (in blue near the top) there. -foreach- is one of the bedrock commands in Stata, a must-know to do anything non-trivial.
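
        A stripped-down sketch of that substitution, with the regression removed so that only the looping is visible (the -assert- and -display- lines are illustrative additions):

        ```stata
        foreach n of numlist 10 100 1000 10000 {
            clear
            * `n' is replaced by 10, then 100, then 1000, then 10000
            quietly set obs `n'
            * Stops with an error if the data set does not have exactly `n' observations
            assert _N == `n'
            display "`n' observations now in memory"
        }
        ```

        On each pass, Stata substitutes the current list element wherever `n' appears in the loop body before executing it.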


        • #5
          I have another follow-up question: when I run the code, I see that the number of observations doesn't iterate from 10 to 100 to 1000 to 10000 as I am intending to do. Is this because `mc' is set to 1000 outside of the loop? My intention is to run a Monte Carlo simulation with 1000 replications over a random set of normally distributed observations that start at 10 and end at 10000, all with the intention to prove the law of large numbers.


          • #6
            "I see that the number of observations doesn't iterate from 10 to 100 to 1000 to 10000 as I am intending to do. Is this because `mc' is set to 1000 outside of the loop?"
            Yes, the number of observations does iterate from 10 up to 10,000. I think you are misreading the output.

            `mc' has nothing to do with the number of observations: `mc' is the number of replications in the Monte Carlo simulation. The number of observations in each regression is given by data_n, which comes from `n'. See the output I get from this code:

            Code:
            Working on 1 out of 1000 at 11:39:37
            Working on 101 out of 1000 at 11:39:39
            Working on 201 out of 1000 at 11:39:40
            Working on 301 out of 1000 at 11:39:41
            Working on 401 out of 1000 at 11:39:42
            Working on 501 out of 1000 at 11:39:43
            Working on 601 out of 1000 at 11:39:44
            Working on 701 out of 1000 at 11:39:45
            Working on 801 out of 1000 at 11:39:46
            Working on 901 out of 1000 at 11:39:47
            
                Variable |        Obs        Mean    Std. dev.       Min        Max
            -------------+---------------------------------------------------------
                  data_n |      1,000          10           0         10         10
            data_store_x |      1,000    4.000318    .0354492    3.83407   4.140077
            data_store~s |      1,000    3.003661    .2356695   2.079484   4.029168
            Working on 1 out of 1000 at 11:39:48
            Working on 101 out of 1000 at 11:39:50
            Working on 201 out of 1000 at 11:39:51
            Working on 301 out of 1000 at 11:39:52
            Working on 401 out of 1000 at 11:39:53
            Working on 501 out of 1000 at 11:39:55
            Working on 601 out of 1000 at 11:39:56
            Working on 701 out of 1000 at 11:39:57
            Working on 801 out of 1000 at 11:39:58
            Working on 901 out of 1000 at 11:39:59
            
                Variable |        Obs        Mean    Std. dev.       Min        Max
            -------------+---------------------------------------------------------
                  data_n |      1,000         100           0        100        100
            data_store_x |      1,000    3.999688    .0099713   3.964951   4.032044
            data_store~s |      1,000    3.002022    .0668583   2.742136   3.217459
            Working on 1 out of 1000 at 11:40:01
            Working on 101 out of 1000 at 11:40:02
            Working on 201 out of 1000 at 11:40:04
            Working on 301 out of 1000 at 11:40:05
            Working on 401 out of 1000 at 11:40:06
            Working on 501 out of 1000 at 11:40:08
            Working on 601 out of 1000 at 11:40:09
            Working on 701 out of 1000 at 11:40:10
            Working on 801 out of 1000 at 11:40:12
            Working on 901 out of 1000 at 11:40:13
            
                Variable |        Obs        Mean    Std. dev.       Min        Max
            -------------+---------------------------------------------------------
                  data_n |      1,000        1000           0       1000       1000
            data_store_x |      1,000     3.99988    .0030899    3.99082   4.010262
            data_store~s |      1,000    3.000811    .0211018    2.92852   3.061063
            Working on 1 out of 1000 at 11:40:15
            Working on 101 out of 1000 at 11:40:16
            Working on 201 out of 1000 at 11:40:18
            Working on 301 out of 1000 at 11:40:20
            Working on 401 out of 1000 at 11:40:22
            Working on 501 out of 1000 at 11:40:24
            Working on 601 out of 1000 at 11:40:26
            Working on 701 out of 1000 at 11:40:27
            Working on 801 out of 1000 at 11:40:29
            Working on 901 out of 1000 at 11:40:31
            
                Variable |        Obs        Mean    Std. dev.       Min        Max
            -------------+---------------------------------------------------------
                  data_n |      1,000       10000           0      10000      10000
            data_store_x |      1,000           4    .0009808   3.997231   4.003786
            data_store~s |      1,000    2.999952    .0066836   2.978674   3.020268
            "...all with the intention to prove the law of large numbers."
            Simulations can't prove anything. They can illustrate how the law of large numbers works in practice. And these results do show how the range of sampled estimates of data_store_x and data_store_cons gets narrower as the sample size (data_n) increases (though with markedly diminishing returns, as expected).
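
            A rough check on the diminishing returns, under the textbook large-sample approximation for the OLS slope (an assumption added here, not something computed in this thread): with Var(e) = 1/12 for runiform() - 0.5 and Var(x) = 9 for rnormal()*3 + 6, the standard deviation of the slope estimates should be about sqrt((1/12)/(9*n)), so it shrinks by a factor of sqrt(10), roughly 3.16, for each tenfold increase in n. The sketch below just evaluates that expression:

            ```stata
            foreach n of numlist 10 100 1000 10000 {
                * Approximate sd of the slope: sqrt( Var(e) / (n * Var(x)) )
                display "n = `n': approximate sd of slope = " %9.7f sqrt((1/12)/(9*`n'))
            }
            ```

            These come to about 0.0304, 0.0096, 0.0030, and 0.00096, against simulated values of 0.0354, 0.0100, 0.0031, and 0.00098 in the output above. The approximation is loosest at n = 10, as one would expect of a large-sample formula.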

            Added: As an aside, also notice that the amount of time it takes to run 1000 replications of the regression increases as data_n goes up.
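
            To measure that directly rather than reading it off the progress messages, one option is Stata's -timer- command. A minimal sketch, assuming one timer per sample size and only 100 replications to keep it quick (both choices are illustrative, not part of the code above):

            ```stata
            clear
            set seed 368
            * Fewer replications than in the thread, so the sketch runs quickly
            local mc = 100
            local t = 0
            foreach n of numlist 10 100 1000 10000 {
                local ++t
                timer clear `t'
                timer on `t'
                quietly {
                    forvalues i = 1/`mc' {
                        clear
                        set obs `n'
                        gen x = rnormal()*3 + 6
                        gen e = runiform() - 0.5
                        gen y = 3 + 4*x + e
                        reg y x, robust
                    }
                }
                timer off `t'
            }
            * Timers 1 to 4 correspond to n = 10, 100, 1000, 10000
            timer list
            ```

            -timer list- reports the elapsed seconds for each timer, so the four sample sizes can be compared side by side.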
            Last edited by Clyde Schechter; 30 Sep 2022, 12:49.
