  • nonlinear difference-in-differences with non-integer interaction term

    Hi everyone !

    I'm currently working on the effect of the German minimum wage of 2015 (8.5 EUR per hour) using a nonlinear difference-in-differences model on panel data of around 15,000 observations. More precisely, I'm interested in the effect of the minimum wage on the employment retention probability. My treatment group consists of the workers whose wage has to increase in order to comply with the new minimum wage (<8.5 EUR), while my comparison group is composed of people who earn a wage slightly above the minimum wage (between 8.5 and 12.5 EUR). My strategy consists of observing the probability that a worker is still employed in period "t+1" conditional on having been employed in period "t", following the approach of Stewart (2004) https://onlinelibrary.wiley.com/doi/...3.2003.00200.x . The dependent variable "employed" is a binary variable taking the value 1 if the individual is employed and 0 if he is unemployed.

    The variable that defines the treated group is not a binary 0/1 variable but a continuous, non-integer variable. It is the gap between the minimum wage of 8.50 EUR and the treated worker's wage. This means that the variable treated_gap takes the value 0 for people who have a wage above the minimum (comparison group) and a positive non-integer value for workers whose wage has to increase to comply with the minimum wage law (treated group).

    The problem I ran into is that I cannot compute the marginal effect with the command margins, since the variable treated_gap takes non-integer values. My question is the following: how can I compute the marginal effect of my treatment with this continuous non-integer variable?

    My version of Stata is Stata 14

    Code:
    clear all
    use sample_1315.dta
    
    *indicator for each year
    gen year2013 =(welle==2013)
    gen year2014 =(welle==2014)
    gen year2015 =(welle==2015)
    
    
    *** Variable "gap"
    gen gap=  8.5 - hourly_wage_contract  if hourly_wage_contract < 8.5
    replace gap=0 if (gap ==.)
    label variable gap "gap between min. wage and wage if under minimum, 0 otherwise"
    
    *generate diff-in-diff variable
    gen time = (welle>=2015) & !missing(welle)
    label variable time " 1= after reform 0= before reform"
    
    gen treated = (hourly_wage_contract <8.5)   & !missing(hourly_wage_contract)
    gen treated_gap= treated*gap       /*  generate treated variable which is the product of the gap and the variable telling if worker's wage is below 8.5 (=1) or above (=0)    */
    label variable treated_gap " gap of people eligible to minimum wage"
    gen interaction= treated_gap*time  /*  generate the interaction term which is the diff-in-diff estimator   */
    
    /* only with year controls*/
    logit employed  c.treated_gap i.time c.interaction   i.year2013  if hourly_wage_contract < 12.5 , vce(cluster persnr)  nolog
    
    
    
    
    Logistic regression                             Number of obs     =      4,403
                                                    Wald chi2(4)      =     186.05
                                                    Prob > chi2       =     0.0000
    Log pseudolikelihood = -577.42794               Pseudo R2         =     0.1351
    
                                 (Std. Err. adjusted for 2,196 clusters in persnr)
    ------------------------------------------------------------------------------
                 |               Robust
        employed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     treated_gap |  -.4643561   .0401438   -11.57   0.000    -.5430365   -.3856757
          1.time |   .6051741   .3310308     1.83   0.068    -.0436343    1.253983
     interaction |  -.0334422   .0880294    -0.38   0.704    -.2059766    .1390921
      1.year2013 |  -.4097615   .1578618    -2.60   0.009     -.719165    -.100358
           _cons |   4.094164   .1802332    22.72   0.000     3.740913    4.447414
    ------------------------------------------------------------------------------

    I'm aware that the p-value is huge here, but what I'm interested in with this example is the right command to compute the marginal effect for the treated group after the reform.

    If the variable that defines the treated group had been binary, I would have used this command:

    Code:
    margins, dydx(interaction)  at(treated_gap==1 time==1)
    
    Average marginal effects                        Number of obs     =      4,403
    Model VCE    : Robust
    
    Expression   : Pr(employed), predict()
    dy/dx w.r.t. : interaction
    at           : treated_gap     =           1
                   time            =           1
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     interaction |  -.0005702   .0013936    -0.41   0.682    -.0033016    .0021612
    But this gives me the effect in period 1 (post-reform) only for the treated with a wage gap of 1. The problem is that the wage gap can take a large number of non-integer values.

    Kind regards

    Xavier
    Last edited by Xavier Chassot; 14 Apr 2019, 08:29.

  • #2
    Well, this is ill-conceived. You are trying to calculate the marginal effect of a variable that by its very nature does not have a marginal effect. The interaction term is, itself, a marginal effect! Had you properly used factor variable notation instead of hand-coding your interaction variable, Stata would have recognized this and refused to even try to calculate such a unicorn. But your code obfuscated the fact that the variable you call interaction is, in fact, an interaction term: Stata does not understand the natural language meanings of variable names. So Stata went ahead and calculated a meaningless statistic for you.

    Here's how I would have done this:

    Code:
    logit employed c.treated_gap##i.time  i.year2013 if hourly_wage_contract < 12.5 , vce(cluster persnr) nolog
    margins, dydx(time) at(treated_gap = (list of values of treated_gap))
    marginsplot
    You replace list of values of treated_gap by a list of numbers that reasonably spans the range of usual or interesting values of the treated_gap variable. The -margins- command will then give you the marginal effect of time, which is the expected difference between before and after 2015 in the probability of being employed, at those various values of treated_gap. The graph will visualize this for you. You might also want to run
    Code:
    margins time, at(treated_gap = (same list of values of treated_gap))
    marginsplot
    to see the trends in employment rates at different values of treated_gap before and after 2015.
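
    As a minimal sketch of what the first set of commands could look like with a concrete list, suppose (purely for illustration) that treated_gap runs from 0 to about 3 EUR. That range is an assumption, not something taken from the data, so check it with -summarize- first and adjust the grid accordingly:
    Code:
    * inspect the observed range of the gap among the treated workers
    summarize treated_gap if treated == 1, detail
    
    * refit with factor-variable notation so -margins- knows about the interaction
    logit employed c.treated_gap##i.time i.year2013 if hourly_wage_contract < 12.5, vce(cluster persnr) nolog
    
    * 0(0.5)3 is an ASSUMED grid (0, 0.5, 1, ..., 3); replace it with values spanning your data
    margins, dydx(time) at(treated_gap = (0(0.5)3))
    marginsplot
    Because time is a factor variable, dydx(time) here reports the discrete change in the predicted probability of employment between time = 0 and time = 1 at each gap value in the grid, averaged over the other covariates.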

    By the way, your definition of the time variable counts observations with missing values for welle as being, in effect, before 2015. Is that correct?
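
    One quick way to check this, as a sketch using only the variables already created in the code above:
    Code:
    * number of observations with a missing survey wave
    count if missing(welle)
    
    * cross-tabulate wave against the post-reform indicator, showing missing values too
    tabulate welle time, missing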





    • #3
      Thank you very much, Mr. Schechter, for your quick and satisfying reply as well as the helpful suggestion. Concerning the definition of the time variable, there aren't any missing values for the variable "welle" in my entire dataset.



      • #4
        Concerning the definition of the time variable, there aren't any missing values for the variable "welle" in my entire dataset.
        I imagined that would be your response. But the fact that you coded -gen time = (welle>=2015) & !missing(welle)- put doubt in my mind.
