Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to simulation standard deviation and confidence intervals for residual?

    Hello, I am trying to use simulation to verify random assignment. For instance, I want to know whether gender composition is random across classes within school-grade. The idea is as follows.

    I conduct Monte Carlo simulations to check whether the observed within school-grade deviations in female proportion are consistent with variation stemming from a random process. For each school-grade, I randomly designate a student as female in each class using a binomial distribution function with p equal to the average proportion of female in the school-grade across all classes.
    I then compute the within school-grade standard deviation, using the residuals from regressions of female proportion on school-grade fixed effects. I repeat this process 1,000 times to obtain a 95% empirical confidence interval for the standard deviations in female proportion for each school.

    My code is as follows but I am not sure if I am doing the right thing to obtain residual and don’t know how to obtain 95% empirical CI for the standard deviations with program and simulate command.

    Any suggestions are deeply appreciated.


    Code:
    program binomial, rclass
        drop _all
        use “simulation_gender” //this is the data where I have schoolID, classID, school_grade(indicator for each school-grade group), p is the average proportion of female in the school-grade across all classes.
        gen z = rbinomial(100, p) // randomly designate a student as female using a binomial distribution function with p
            bys classID: egen z_count = total(z==1)
            bys classID: egen z_classize=total(z~=.)
            gen z_ratio = z_count/(z_classize-1) if z==0 //z_ratio is the leave-one-out female proportion within class
            replace z_ratio = (z_count-1)/(z_classize-1) if z==1
            replace z_ratio = . if z==.
        qui areg z_ratio, a(sch_grade)  //compute the within school-grade standard deviation
        predict resid, residuals
        sum resid
        return scalar mean = r(mean)
        return scalar sd = r(sd)
    end
    
    simulate mean=r(mean) sd=r(sd), reps(1000)  seed(1234): binomial
    
    summarize
    Data looks like this
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(classID gradeID schoolID school_grade) float(school_grade p)
    1 5 1 1 .501
    1 4 1 2 .546
    2 4 1 3 .523
    2 4 1 3 .523
    end
    Last edited by Yi Luo; 14 Apr 2020, 02:39.

  • #2
    Any idea, please?

    Comment

    Working...
    X