Placebo/random number generation and set seed

Felix Noth

Join Date: Aug 2018

Posts: 2
#1

17 Aug 2018, 02:36

Dear all:

What I would like to do is run placebo regressions for a difference-in-differences regression (y = b0 + b1*event + b2*treatment + b3*(treatment*event) + e, for which I get a highly significant b3 coefficient). In these regressions I want to assign the treatment variable randomly across individuals, to show that the actual distribution of treatment status is driving my results and not some background process that happens to coincide with the treatment.

To do this I use the mean of the treatment variable

sum treatment, meanonly

and then generate random numbers based on this mean - I generate 1000 treatment variables:

set seed 123456789
forvalues i = 1/1000 {
    quietly generate treat`i' = runiform() <= `r(mean)'
}

Then I run 1000 regressions, each using one of the randomly generated treatment variables, and count how many estimates of the difference-in-differences effect b3 are significant at the 5% level. Done as above, I get 63 out of 1000, which is a bit above the expected 5%. That might call into question whether it is really the actual treatment variable that generates the results, rather than some other factor not captured in the regression.
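A minimal sketch of the counting step described above. The data here are simulated stand-ins (an assumption, not the poster's panel), the local names `pmean`, `reps` and `nsig` are made up for illustration, and only 200 placebo draws are used to keep the run short:

```stata
* Illustrative sketch: simulated data stand in for the real panel (assumption)
clear
set seed 123456789              // seed set once, before any random draws
set obs 2000
generate byte event     = runiform() < 0.5
generate byte treatment = runiform() < 0.3
generate y = 1 + 0.5*event + 0.2*treatment + rnormal()

summarize treatment, meanonly
local pmean = r(mean)           // keep the mean in a local so later r-class commands cannot overwrite it

local reps = 200                // the post uses 1000
local nsig = 0
forvalues i = 1/`reps' {
    * draw a placebo treatment with the same expected share of treated observations
    quietly generate byte placebo = runiform() <= `pmean'
    quietly regress y i.event##i.placebo
    * Wald test of the placebo interaction coefficient (the b3 analogue)
    quietly test 1.event#1.placebo
    if r(p) < 0.05 local ++nsig
    drop placebo
}
display "significant at the 5% level: `nsig' of `reps'"
```

Generating and dropping one placebo variable per iteration avoids carrying 1000 extra variables in the dataset; the sequence of random draws is the same as when all variables are generated up front after a single `set seed`.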

Now come the questions!

When I do exactly the same but reset the seed with every simulation run

local run 123456788
forvalues i = 1/1000 {
    local ++run
    set seed `run'
    quietly generate treat`i' = runiform() <= `r(mean)'
}

and then run the 1000 placebo regressions, I get only 51 out of 1000 significant coefficients, which is almost exactly the number you would expect at the 5% significance threshold.

So my question is: why is that, and which is the better (the correct) approach to generating the placebo variables here?

Thanks in advance

Best

Felix
Mike Lacy

Join Date: Apr 2014

Posts: 2416
#2

17 Aug 2018, 07:07

The -permute- command will randomly permute the values of one or more variables, run a statistical command, and save whatever results you choose. It sounds to me as though that would fit what you want, though I am not familiar with the terminology "placebo regression."
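A rough sketch of how -permute- might be applied to this kind of model. The data are simulated for illustration (an assumption, not the poster's data), and the label `did` is a made-up name for the saved interaction coefficient:

```stata
* Illustrative sketch with simulated data (assumption, not the poster's data)
clear
set seed 2018
set obs 1000
generate byte event     = runiform() < 0.5
generate byte treatment = runiform() < 0.3
generate y = 1 + 0.5*event + 0.2*treatment + 0.4*event*treatment + rnormal()

* Shuffle treatment across observations 200 times and re-estimate each time.
* Factor-variable notation rebuilds the interaction from the shuffled values.
permute treatment did = _b[1.event#1.treatment], reps(200) seed(123456789) nodots: ///
    regress y i.event##i.treatment
```

Unlike independent uniform draws, a permutation keeps the number of treated observations exactly as in the original data. Note that this sketch shuffles at the observation level; with panel data where treatment is fixed within individuals, the shuffling would need to respect that structure, which this example does not attempt.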
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

17 Aug 2018, 07:12

An answer that's not an answer: read the output of help set seed, specifically the section headed "Do not set the seed too often".

This suggests that your second approach is an inappropriate technique, regardless of the results it yielded in your particular example. Do note that you have an experiment comparing two processes:

- choosing a number, initializing the seed to that number, and running 1000 tests

- choosing a number, then running 1000 tests, each time reinitializing the seed to the next successive value after the number originally chosen

and you are comparing the results of one run of each process. That is not an adequate sample of data about your two processes to permit you to draw any conclusion about the relative merits of the two processes. Perhaps if you ran the processes 100 times with a different number chosen each time, something could be said ... most likely that the differences occur totally by chance.
Felix Noth

Join Date: Aug 2018

Posts: 2
#4

18 Aug 2018, 02:19

Thank you both. I'll take the hint about not setting the seed too often and stick to my first alternative. As a follow-up: do you think that finding significant coefficients in 6.3% of cases (given the 95% band, it should only be 5%) is problematic? Are there rules of thumb here?
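One rough way to gauge whether 63 rejections in 1000 is unusual is to treat each placebo test as an independent draw with a 5% rejection probability. That independence is an assumption: the placebo regressions all reuse the same outcome data, so this is only an approximation. The scalar names `se0` and `z` are made up for illustration:

```stata
* Under the null, the rejection share has standard error sqrt(p*(1-p)/n)
scalar se0 = sqrt(0.05*0.95/1000)
scalar z   = (0.063 - 0.05)/se0
display "standard error of the rejection share: " %6.4f se0
display "z-score for 63 of 1000: " %5.2f z

* Exact binomial test of 63 rejections in 1000 trials against p = 0.05
bitesti 1000 63 0.05
```

The standard error is about 0.0069, so 6.3% lies roughly 1.9 standard errors above 5%, which under this approximation is within the range that chance alone can produce.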
William Lisowski

Join Date: Dec 2014

Posts: 10150
#5

18 Aug 2018, 05:14

I have no experience with the placebo regression technique you describe on which to base a contribution. A search of the Statalist archives shows that it pops up in coordination with difference-in-difference analyses, which is not a technique I deal with.

Now that we've gotten past your original question, which emphasized random number generation, perhaps you should post a new topic with a title along the lines of "Interpreting the results of placebo test for difference-in-difference models", which might draw more interest, especially among those who use DiD.