
  • Placebo/random number generation and set seed

    Dear all:

    What I would like to do is run placebo regressions for a difference-in-differences model (y = b0 + b1*event + b2*treatment + b3*(treatment*event) + e, for which I get a highly significant b3 coefficient). I want to reassign the treatment variable randomly across individuals, to show that the real distribution of treatment status is driving my results, and not some background process that happens to coincide with the treatment.

    To do this I use the mean of the treatment variable

    sum treatment, meanonly

    and then generate 1000 placebo treatment variables as random draws based on this mean:

    set seed 123456789
    forvalues i = 1/1000 {
        qui gen treat`i' = runiform() <= `r(mean)'
    }

    Then I run 1000 regressions, each using one of the randomly generated treatment variables, and count how many of the estimated difference-in-differences coefficients b3 are significant at the 5% level. When I do it as above, I get 63 out of 1000, which is a bit above the 5% level. That might call into question whether it is really the real treatment variable that is generating my results, or some other factor not captured in the regression.
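    Concretely, the counting step looks roughly like this (a sketch: the interaction is built by hand, and the variable names y and event are taken from the model above):

    local sig 0
    forvalues i = 1/1000 {
        qui gen inter`i' = event * treat`i'
        qui regress y event treat`i' inter`i'
        qui test inter`i'
        if r(p) < 0.05 local ++sig
        qui drop inter`i'
    }
    display "significant at the 5% level: `sig' of 1000"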

    Now come the questions!

    When I do exactly the same thing, but change the seed on every simulation run,

    local run 123456788
    forvalues i = 1/1000 {
        local ++run
        set seed `run'
        qui gen treat`i' = runiform() <= `r(mean)'
    }

    and then run the 1000 placebo regressions, I get only 51 out of 1000 significant coefficients, which is almost exactly the number you would expect at the 5% significance threshold.

    So my question is: why is that, and what is the better (the correct) approach to generating the placebo variables here?

    Thanks in advance

    Best

    Felix

  • #2
    The -permute- command will randomly permute the values of one or more variables, run a statistical command, and save whatever results you choose. It sounds to me like that would fit what you want, though I am not familiar with the terminology "placebo regression".
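    As a sketch of what that might look like here, assuming y, treatment, and event as in your model (and ignoring any panel structure; in a panel, treatment would normally have to be permuted at the individual level, which takes extra setup):

    permute treatment _b[c.treatment#c.event], reps(1000) seed(123456789): ///
        regress y c.treatment##c.event

    The p-value -permute- reports for that coefficient is then a permutation p-value, playing the same role as the share of significant placebo estimates you counted.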



    • #3
      An answer that's not an answer: read the output of help set seed, specifically the section headed "Do not set the seed too often".

      This suggests that your second approach is inappropriate technique, regardless of the results it yielded in your particular example. Do note that you have an experiment comparing two processes:
      • choosing a number, initializing the seed to that number, and running 1000 tests
      • choosing a number, running 1000 tests, each time reinitializing the seed to the next value in sequence
      and you are comparing the results of one run of each process. That is not an adequate sample of data about your two processes to permit you to draw any conclusion about their relative merits. Perhaps if you ran each process 100 times, with a different starting number each time, something could be said ... most likely that the differences occur totally by chance.



      • #4
        Thank you both. I'll take the hint about not setting the seed too often and stick with my first approach. As a follow-up: do you think that finding significant coefficients in 6.3% of the cases (at the 5% significance level, it should only be 5%) is problematic? Are there rules of thumb here?
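        For example, whether 63 significant results in 1000 draws is surprising under a true 5% rate could be gauged with an exact binomial test, e.g. with -bitesti-:

        bitesti 1000 63 0.05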



        • #5
          I have no experience with the placebo regression technique you describe on which to base a contribution. A search of the Statalist archives shows that it pops up in connection with difference-in-differences analyses, which is not a technique I deal with.

          Now that we've gotten past your original question, which emphasized random number generation, perhaps you should post a new topic with a title along the lines of "Interpreting the results of placebo tests for difference-in-differences models", which might draw more interest, especially among those who use DiD.
