
  • Propensity score matching difference-in-difference via Stuart et al. 2014 methodology

    Hi all,

    I am trying to evaluate the effect of a university teaching reform on its students' wage outcomes using PSM-DiD, following the methodology proposed by Stuart et al. (2014), "Using propensity scores in difference-in-differences models to estimate the effects of a policy change" (https://pubmed.ncbi.nlm.nih.gov/25530705/).

    Considering four groups:
    1 - Treated in pre-treatment period
    2 - Control in pre-treatment period
    3 - Treated in post-treatment period
    4 - Control in post-treatment period

    ...they suggest that we "fit a multinomial logistic regression predicting Group as a function of a set of observed covariates X. Each individual will have four resulting propensity scores, e_k(X_i): the probability of being in Group k, for k = 1–4. (Note that these four will sum to one for each individual). The weights are then created in such a way that each of the four groups is weighted to be similar to Group 1, the treatment group in the pre period. This is accomplished using the following weight for individual i:

    w_i = e_1(X_i) / e_g(X_i)

    where g refers to the group that individual i was actually in."

    I try to do this by writing

    Code:
    forval x = 1/4 {
        mlogit group "list of covariates", baseoutcome(`x')
        predict psa`x'
    }
    My problem is that all four predictions are identical (psa1 = psa2 = psa3 = psa4) and therefore do not sum to one, whereas I want psa`x' to be the probability that the individual is in group x. Apparently I can solve this by estimating four separate ("single") logistic regressions, one for each group, but I am not sure whether that is econometrically correct, since Stuart et al. suggest estimating the probabilities via multinomial logistic regression.

    I cannot show you my real data, as it is confidential, but this is what I tried, using the example hospital data from the help file for 'didregress':

    Code:
    use https://www.stata-press.com/data/r17/hospdd, clear
    
    gen group= 1 if inrange(month, 1, 3) & inrange(hospital, 1, 18)
    replace group=2 if inrange(month, 1, 3) & inrange(hospital, 19, 46)
    replace group=3 if inrange(month, 4, 7) & inrange(hospital, 1, 18)
    replace group=4 if inrange(month, 4, 7) & inrange(hospital, 19, 46)
    label define group 1 "pre-treated" 2 "pre-control" 3 "post-treated" 4 "post-control"
    label values group group 
    
    * multinomial logit
    
    forval x = 1/4{
        mlogit group i.frequency, baseoutcome(`x')
        predict psa`x'
    }
    
    gen psa = psa1+psa2+psa3+psa4
    tab psa 
    
    
    * separate binary logits, one indicator per group
    gen byte pretreated  = group == 1
    gen byte precontrol  = group == 2
    gen byte posttreated = group == 3
    gen byte postcontrol = group == 4
    
    logit pretreated i.frequency
    predict psb1
    logit precontrol i.frequency
    predict psb2
    logit posttreated i.frequency
    predict psb3
    logit postcontrol i.frequency
    predict psb4
    
    gen psb = psb1+psb2+psb3+psb4
    tab psb
    So my question is: how can I correctly estimate the probability of being in each of the four groups defined above using multinomial logistic regression?

    I am aware that there is a user-written package, 'diff', that implements PSM-DiD, but it only allows kernel matching, which I am not interested in.

  • #2
    According to the description in the paper, the weight variable is computed as below.

    Code:
    mlogit group i.frequency
    predict p*
    gen weight = .
    forvalues i = 1/4 {
        replace weight = p1/p`i' if group == `i'
    }
    Then the DiD estimation can be implemented using a weighted linear regression.
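
    For reference, a minimal end-to-end sketch on the hospdd example data from post #1. It is an illustration under stated assumptions, not a definitive implementation: the variable names treated, post, ps1-ps4 and pssum are made up here, the treated hospitals (1-18) and post months (4-7) follow the grouping in post #1, and clustering by hospital is an assumption rather than something prescribed by the paper. It also shows why the loop in post #1 returned identical variables: after mlogit, predict with a single new variable returns the probability of the first outcome only, whatever baseoutcome() is set to, so each outcome has to be requested explicitly (or four names supplied, as with predict p* above).

    Code:
    use https://www.stata-press.com/data/r17/hospdd, clear

    * Groups as in post #1: 1 pre-treated, 2 pre-control, 3 post-treated, 4 post-control
    gen byte treated = inrange(hospital, 1, 18)   // assumption taken from post #1
    gen byte post    = inrange(month, 4, 7)       // assumption taken from post #1
    gen byte group   = 1 + (1 - treated) + 2*post

    * One multinomial logit; the base outcome does not change predicted probabilities
    mlogit group i.frequency

    * Request the probability of each outcome explicitly
    forvalues k = 1/4 {
        predict double ps`k', pr outcome(`k')
    }

    * Sanity check: the four probabilities sum to one for every observation
    egen double pssum = rowtotal(ps1 ps2 ps3 ps4)
    assert abs(pssum - 1) < 1e-6

    * Weight each group to resemble group 1: w_i = e_1(X_i) / e_g(X_i)
    gen double weight = .
    forvalues g = 1/4 {
        replace weight = ps1/ps`g' if group == `g'
    }
    assert weight == 1 if group == 1

    * Weighted DiD regression; the 1.treated#1.post coefficient is the DiD estimate
    regress satis i.treated##i.post [pweight = weight], vce(cluster hospital)

    The two assert lines are only checks: the first confirms that the predicted probabilities are proper propensity scores, the second that observations in group 1 keep a weight of exactly one, as the formula implies.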



    • #3
      Thanks for your concise answer, which solved my problem.



      • #4
        I am also interested in this method. Can I use the weight variable generated this way with the didregress command? I would like to understand whether doing so would cause any violation or over-fitting, given the DiD approach that didregress takes. Thanks.
