  • Create a variable from means.

    Hi all. I am using Stata 17. I am a novice at coding.
    This is just an academic exercise. I am trying to demonstrate the central limit theorem by simulation and calculate the standard error. I use this code:
    Code:
    clear
    set obs 1000
    gen var = runiform()
    save data
    bsample 50
    save sample_data
    clear
    forvalues i = 1/1000 {
        use data
        bsample 50
        collapse var
        save sample_`i'
    }
    clear
    forvalues i = 1/1000 {
        append using sample_`i'
    }
    save SAMPLE
    I can feel that this is an inefficient way to solve the problem. Here's what I intend to do.
    1. Create a uniformly distributed variable with 1000 observations.
    2. Take 1000 samples with replacement of size 50.
    3. Calculate the mean of each sample in step 2.
    4. Store the means in a dataset.

    I am looking for more efficient code to accomplish this.

  • #2
    The disk overhead of saving a million files, each holding only one number, might be excessive. With only 1,000 files, it probably isn't worth any concern. However, if you do want to scale up the problem, you could use -file open- and -file write- to write only the means to a single text file, and then -insheet- to read the result. Or were you worried about some other inefficiency?

    Code:
    file open foo using foo.raw, write
    ...
    file write foo (var) _n
    ...
    file close foo
    ...
    insheet using foo
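    Filled in against the code in #1, the outline above might look something like the sketch below. The handle foo and the file foo.raw come from the outline; the seed, the -replace- options, and the variable name mean_value are my own assumptions for illustration.
    Code:
    clear
    set obs 1000
    set seed 1234                  // assumed seed, only for reproducibility
    gen var = runiform()
    save data, replace

    // open one text file that will receive one mean per line
    file open foo using foo.raw, write replace
    forvalues i = 1/1000 {
        use data, clear            // reload the full data set each time
        bsample 50                 // sample of 50, with replacement
        collapse var               // one observation: the sample mean
        file write foo (var[1]) _n
    }
    file close foo                 // -file close- needs the handle name

    // read the 1,000 means back in as a single variable
    insheet mean_value using foo.raw, clear
    summarize mean_value
    This still reloads data.dta on every pass, but it writes one small text file instead of 1,000 data sets.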


    • #3
      With the code in #1, I need to save 1,000 datasets (each with just one observation) in my directory and then append them into a single data file. What I want is to do this without creating those datasets. For example, I could save the mean of each bsample as a local:
      Code:
       
      forvalues i = 1/1000 {
          use data, clear
          bsample 50
          sum var
          local mean_`i' = r(mean)
      }
      However, I don't know how to make a variable from these 1,000 locals.


      • #4
        There is no need to save any data sets along the way at all, and working with 1,000 local macros will be unwieldy. You can do this very efficiently and easily using -frame-s, assuming you are running version 16 or later.
        Code:
        clear*
        
        set obs 1000
        set seed 1234
        
        gen var = runiform()
        save data, replace
        
        frame create sample_means int rep float mean_value
        
        local nreps 1000
        forvalues i = 1/`nreps' {
            use data, clear
            bsample 50
            summ var, meanonly
            frame post sample_means (`i') (`r(mean)')
        }
        
        frame change sample_means
        histogram mean_value
        summ mean_value
        If, when you are done, you decide you want to save a single file containing all 1,000 sample means, you can do that by just adding -save all_the_means, replace- to the end of this code.
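        Since the stated goal in #1 includes the standard error, here is one possible check, offered as a sketch rather than as part of the solution above. A uniform(0,1) variable has variance 1/12, so the theoretical standard error of a mean of 50 draws is sqrt(1/(12*50)), about 0.0408. The lines below assume the code above has just been run, so that the frame sample_means and the variable mean_value exist.
        Code:
        frame change sample_means
        quietly summarize mean_value
        // standard deviation of the simulated sample means
        display "SD of the sample means:   " %6.4f r(sd)
        // theoretical standard error for n = 50 draws from uniform(0,1)
        display "Theoretical SE (n = 50):  " %6.4f sqrt(1/(12*50))
        The two numbers should be close but not identical: the samples are drawn from the fixed 1,000 observations in data.dta, so the spread of the means follows the standard deviation of var in that data set rather than exactly sqrt(1/12).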


        • #5
          Thank you, Clyde Schechter.
          This was a great help. I still need to read about "frame" to understand the code, but for now it does exactly what I needed.
