Calculating power using Stata simulations

austin ryan

Join Date: Jan 2016

Posts: 10
#1

Calculating power using Stata simulations

10 Aug 2017, 12:44

A reference to what I'd like to accomplish:

http://www.stata.com/support/faqs/st...by-simulation/

I think the easiest way to do this is to give a brief description of what I hope to achieve:

I want to test whether or not a policy change at an institution has an impact on outcomes for people in an institution. Let us assume there are N possible institutions in our universe. Some portion of those N institutions, say αN, α∈(0,1), implement this policy change of interest. Assume these institutions implement this policy change at (randomly) staggered intervals (and implementation is i.i.d between institutions).

How could I simulate data to determine whether or not this policy change significantly impacts outcomes for people working in these αN institutions using a generalized DiD approach?

Last edited by austin ryan; 10 Aug 2017, 12:54. Reason: added tags
Tags: difference-in-difference, power analysis, simulation

Joseph Coveney

Join Date: Apr 2014
Posts: 4421

11 Aug 2017, 00:52

I don't know what a generalized DiD approach is, but you could start with something like that below, at least as far as generating the datasets as you describe.

Code:

version 15.0

clear *
set seed 1405861

program define simem // , rclass
    version 15.0
    syntax , [n(integer 50) DIvvyup(real 0.5) Epochs(integer 12) ///
        DElta(real 0) Residual(real 1)]

    // Add input validation to taste

    drop _all
    set obs `n'
    generate int pid = _n
    generate double u = rnormal()
    generate byte trt = runiform() > `divvyup'

    // At least one pre- and one posttreatment observation in treated participants
    generate int onset = cond(trt, runiformint(2, `epochs' - 1), .)

    expand `epochs'
    bysort pid: generate byte time = _n
    replace trt = time >= onset

    generate double response = u + trt * `delta' + runiform(0, `residual')

    xtset pid time
    xtreg response i.time##i.trt, re
    test 1.trt
end

foreach delta of numlist 0 0.2 0.25 0.3 {
    simulate p = r(p), reps(100) nolegend nodots: simem , de(`delta')
    generate byte pos = p < 0.05
    summarize pos, meanonly
    display in smcl as text "Delta = " %4.2f `delta' " Power = " %4.2f r(mean)
}

exit

I've varied the assumed delta, holding the sample size fixed, because rumor has it that that's the circumstance under which most grant applications (and clinical protocols) are written. But you could do the converse.

Last edited by Joseph Coveney; 11 Aug 2017, 01:05.

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#3

11 Aug 2017, 08:40

Joseph Coveney gives elegant and clean code for this problem.

I would just point out that -reps(100)- in the -simulate- command may be too few to give you sufficiently precise power estimates. For example, if you come up with a power of 0.8 based on 100 replications, the 95% CI around that is 0.70 to 0.87. That may leave you up in the air as to whether that particular configuration is satisfactory or not. For many purposes a power of 0.87 would be considered adequate, but 0.70 would be considered insufficient. So you might want to use a larger number of replications in that command. It will take longer to run, but your results will be more precise. FWIW, when I do this kind of thing I typically use 500 to 1000 reps depending on the circumstances.
Comment
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#4

11 Aug 2017, 19:15

Agreed. I left number of replications small so that the user could see the illustration without waiting too long.

I usually do power analysis in two stages. First with 100 to 300 replications per sample size proposal in order to get into the ballpark. And then with 1000 (continuous response variable / smooth power plot) to 3000 (binomial response variable / saw-tooth power plot) for the record.

Reverse the inequality on the treatment group assignment:

Code:

generate byte trt = `divvyup' > runiform()
1 like
Comment
Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#5

12 Aug 2017, 01:38

And:

Code:

generate double response = u + trt * `delta' + rnormal(0, `residual')

Sheesh.
Comment
Yulia Marchenko (StataCorp)

StataCorp Employee

Join Date: Mar 2014

Posts: 35
#6

18 Aug 2017, 15:25

We can take this a step further and turn Joseph's simulation code into its own program. The advantage of this is that we can then use this program within Stata's power command to produce tables and graphs automatically.

For example, we can copy Joseph's code inside the foreach loop (with a few modifications) and put it in a program.

Code:

program power_cmd_mysimem, rclass version 15 syntax , Reps(integer) n(integer) /// [ DElta(real 0) Alpha(real 0.05) * ] // compute power by simulation qui simulate p = r(p), reps(`reps') nolegend nodots: /// simem , n(`n') de(`delta') `options' qui generate byte pos = p < `alpha' qui summarize pos, meanonly // store scalars in r() as expected by power return scalar power = r(mean) return scalar alpha = `alpha' return scalar N = `n' return scalar delta = `delta' end

power requires that the program follow the naming convention power_cmd_methodname. You can use anything for methodname, except names of official power methods. I suggest to avoid picking "nice" names that may later be used by Stata's official methods. In the above, methodname is mysimem.

power also requires that the method program support power's common options such as n(), alpha(), and so on and return basic power results in specific scalars such as sample size in r(N), power in r(power), significance level in r(alpha), and so on.

We can now type

Code:

. set seed 123 . power mysimem, reps(10) n(50) delta(0)

and obtain the following results:

Code:

Estimated power Two-sided test +-------------------------+ | alpha power N | |-------------------------| | .05 0 50 | +-------------------------+

I used 10 replications for demonstration only.

With a little more effort, we can get even more out of power. We can instruct power that option delta() should allow a numlist. We can also request that column delta is displayed in a table. This is done by creating an initializer program for the mysimem method.

Code:

program power_cmd_mysimem_init, sclass version 15 sreturn clear sreturn local pss_numopts "DElta" sreturn local pss_colnames "delta" end

To allow the specification of a numlist in an option, we list the option name as it appears in the syntax diagram in s() local pss_numopts. To add the delta column to the default table, we list the name of the stored scalar containing the value of delta in s() local pss_colnames.

We can now type

Code:

. set seed 123 . power mysimem, reps(10) n(50) delta(0 0.2 0.25 0.3)

and obtain the following table:

Code:

Estimated power Two-sided test +---------------------------------+ | alpha power N delta | |---------------------------------| | .05 0 50 0 | | .05 .6 50 .2 | | .05 .9 50 .25 | | .05 .9 50 .3 | +---------------------------------+

We can even plot the estimated power by merely specifying power's graph option.

Code:

. set seed 123 . power mysimem, reps(10) n(50) delta(0 0.2 0.25 0.3) graph

You can read more about adding your own methods to power and see more examples in [PSS] power usermethod in Stata 15 (http://www.stata.com/manuals/psspowerusermethod.pdf). Also see http://www.stata.com/new-in-stata/ad...-size-methods/.

(Stata 14 users, see help power userwritten.)
1 like
Comment

Announcement

Calculating power using Stata simulations

Comment

Comment

Comment

Comment

Comment