  • 50th Percentile Matrix not storing values

    I am trying to store my 50th percentile values in a matrix, inside a loop over n = 10, 100, 1000, and 10000 observations. I have tried moving the matrix around a few times, thinking that it might have been in the wrong place, but the code still doesn't run. I have highlighted the relevant sections for ease of understanding. Any insights are greatly appreciated.
    Code:
    clear
    
    local mc = 1000
    set seed 368
    set obs `mc'
    gen data_n = .
    gen data_store_x = .
    gen data_store_cons = .
    gen data_store_std_err = .
    gen data_percentile_fifty = .
    
    mata: beta_10 = J(1000,1,.)
    mata: beta_100 = J(1000,1,.)
    mata: beta_1000 = J(1000,1,.)
    mata: beta_10000 = J(1000,1,.)
    
    mata: std_err_10 = J(1000,1,.)
    mata: std_err_100 = J(1000,1,.)
    mata: std_err_1000 = J(1000,1,.)
    mata: std_err_10000 = J(1000,1,.)
    
    foreach n of numlist 10 100 1000 10000 {
        quietly {
            forvalues i = 1(1) `mc' {
                if floor((`i'-1)/100) == (`i' -1)/100 {
                    noisily display "Working on `i' out of `mc' at $S_TIME"
                }
                preserve
    
                clear
    
              set obs `n'
    
                gen x = rnormal() *3 + 6
    
                gen e = rnormal() - 0.5
    
                gen y = 3 + 4*x + e
                
                reg y x, vce(robust)
    
                local xcoef = _b[x]
                local const = _b[_cons]
                local std_err = _se[x]
                local percentile_fifty = r(p50)
                
                mata: beta_`n'[`i',1] = `xcoef'
                mata: std_err_`n'[`i', 1] = `std_err'
               
                restore
             
                
                replace data_n = `n' in `i'
                replace data_store_x = `xcoef' in `i'
                replace data_store_cons = `const' in `i'
                replace data_store_std_err = `std_err' in `i'
                replace data_percentile_fifty = `percentile_fifty' in `i'
                    }
        
            }
    summ data_n data_store_x data_store_std_err data_store_cons data_percentile_fifty, detail
    mata: data_percentile_fifty_`n'[`i',1] = `percentile_fifty'
    }

  • #2
    Fiftieth percentile of what? A regression doesn't have a fiftieth percentile (or any percentile). It produces coefficients, standard errors, some fit statistics, a sample size, and some test statistics, but no percentiles of anything.

    So when you run -local percentile_fifty = r(p50)-, r(p50) does not exist. Stata evaluates a nonexistent r() result as missing, so local macro percentile_fifty will just contain a missing value (.), not a percentile.

    Perhaps you are confused about what you are trying to do. While each regression produces no percentiles, the ensemble of 1000 regressions can be summarized to produce 50th percentiles of the coefficients, or of the standard errors, or of the constant terms. Which of those do you want? You can get that with -summarize, detail- but it will produce only a single number, not a vector with 1,000 rows, and the place to do that is inside the -foreach n- loop but after the close of the -forvalues i- loop.
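
    A minimal sketch of that placement, assuming the quantity wanted is the 50th percentile of the 1,000 slope estimates. It collects the replications with -postfile- rather than -preserve-/-restore-, which is a substitution for illustration and not the original poster's method; the handle `results', the tempfile `sims', and the variable names b_x, se_x, and b_cons are illustrative.

    ```stata
    clear all
    set seed 368
    local mc = 1000

    foreach n of numlist 10 100 1000 10000 {
        * One row of results per replication, written to a temporary file
        tempname results
        tempfile sims
        postfile `results' b_x se_x b_cons using `sims', replace

        quietly forvalues i = 1/`mc' {
            clear
            set obs `n'
            gen x = rnormal()*3 + 6
            gen e = rnormal() - 0.5
            gen y = 3 + 4*x + e
            regress y x, vce(robust)
            post `results' (_b[x]) (_se[x]) (_b[_cons])
        }
        postclose `results'

        * After the close of the -forvalues i- loop, still inside -foreach n-:
        * summarize the ensemble of `mc' estimates for this sample size
        use `sims', clear
        quietly summarize b_x, detail
        display "n = `n': median of the `mc' slope estimates = " r(p50)
    }
    ```

    Replacing b_x with se_x or b_cons in the -summarize- line gives the corresponding percentile of the standard errors or the constant terms. Each pass yields a single number per sample size, so no 1,000-row vector is needed for the percentile itself.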



    • #3
      What I want to do is this: across the replications, calculate the squared difference between the empirical percentile and the theoretical percentile of the standard normal distribution at the 25th, 50th, and 75th percentiles. Perhaps I do not need a matrix, though I do have another question: my code is establishing an empirical percentile and not a theoretical percentile, correct?



      • #4
        calculate the squared difference between the empirical percentile and the theoretical percentile of the standard normal distribution at the 25th, 50th, and 75th percentile.
        Empirical percentile of what? The only regression output I can think of that would be predicted to have a standard normal distribution is the z-statistic, and that only when the null hypothesis is true.

        And yes, the simulation establishes empirical statistics; only theory can establish theoretical statistics.
        Last edited by Clyde Schechter; 03 Oct 2022, 19:52.



        • #5
          Originally posted by Clyde Schechter
          Empirical percentile of what?
          That would be of the empirical and theoretical Cumulative Distribution Function.



          • #6
            I'm sorry, but I don't know how to make this clearer so that you can provide an answer. The empirical and theoretical cumulative distribution of what?



            • #7
              If you want the 25th, 50th, and 75th percentiles of the empirical cumulative distribution of a variable v, you can get them with -summ v, detail-, and the results will be in r(p25), r(p50), and r(p75), respectively. You can then save those in scalars or local macros and use them in calculations (such as comparison to the values predicted by some theoretical distribution) or store them in variables.

              But you still have to figure out what variable v to use for this. That's what I've been asking about, and you have not responded.

              If the theoretical distribution you are trying to compare to is the standard normal distribution, then the corresponding percentiles are invnorm(0.25), invnorm(0.50), and invnorm(0.75), respectively.

              But, as I say, none of the variables you have created so far is predicted to have a standard normal distribution, so I remain puzzled as to what variable (presumably, one you have yet to create) is involved here.
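
              To make the mechanics concrete, here is a minimal sketch under a loudly stated assumption: that v is the z-statistic for the slope, evaluated at its true value of 4. The original poster never confirmed this choice, and with robust standard errors and a finite sample the statistic is only approximately standard normal. The names `results', `sims', z, emp, and theo are illustrative, and the sample size of 100 is arbitrary. The sketch uses invnormal(), the currently documented name of the function written as invnorm() above.

              ```stata
              clear all
              set seed 368
              local mc = 1000
              local n  = 100

              * Collect one z-statistic per replication
              tempname results
              tempfile sims
              postfile `results' z using `sims', replace
              quietly forvalues i = 1/`mc' {
                  clear
                  set obs `n'
                  gen x = rnormal()*3 + 6
                  gen y = 3 + 4*x + rnormal() - 0.5
                  regress y x, vce(robust)
                  * Assumed variable v: estimate minus true slope (4), over its standard error
                  post `results' ((_b[x] - 4)/_se[x])
              }
              postclose `results'

              * Empirical percentiles from -summarize, detail-, theoretical ones from invnormal()
              use `sims', clear
              quietly summarize z, detail
              foreach p of numlist 25 50 75 {
                  scalar emp  = r(p`p')
                  scalar theo = invnormal(`p'/100)
                  display "p`p': empirical = " %7.4f emp "  theoretical = " %7.4f theo "  squared difference = " %9.6f (emp - theo)^2
              }
              ```

              The same comparison works for any other variable v once it is decided which quantity is supposed to follow the standard normal distribution.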
