  • Panel Dif-in-dif with right-skewed data

    Good morning, everybody,

    I am running a difference-in-differences analysis to assess the impact of a policy on patenting. The policy targeted only specific, identifiable economic sectors.
    For this reason, I used the targeted sectors as the treatment group and all other sectors as the control group.

    In this way, I obtained a panel with:

    (1) 30K sectors (2.5K of them are treated)

    (2) the number of patents per year per sector as the dependent variable (I have 500K patents in total for the period 2006-2019)

    (3) post_policy that is a dummy =1 if the year is post-policy

    (4) treated_sector that is a dummy =1 if a sector is treated

    Using the commands xtset... and then xtreg... I find that the results are not statistically significant.

    For this reason, I would like to divide sectors into quartiles according to their performance (measured as patents) and then re-run the commands mentioned above.
    How should I do this, given that my data are highly right-skewed and contain many zeros (i.e. no patents for a given sector in a given year)?
    Alternatively, does it make sense to remove outliers from the control group?


    Thanks in advance
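
    One way to form such quartiles without conditioning on the post-policy outcome is to rank sectors on their pre-policy patenting only. A minimal sketch, assuming the variable names patent and ipccode (taken from the output posted later in this thread) together with the post_policy and treated_sector dummies described above. With many zero-patent sectors the cut points will tie, so xtile may return fewer than four distinct groups:

    ```stata
    * Assumed names: patent (outcome), ipccode (sector id), post_policy (0/1)
    * Mean pre-policy patents per sector; post-policy years are ignored
    bysort ipccode: egen pre_mean = mean(cond(post_policy == 0, patent, .))

    * Rank sectors (one observation per sector) into quartiles
    egen byte one_per_sector = tag(ipccode)
    xtile pre_q = pre_mean if one_per_sector, nq(4)

    * Copy each sector's quartile to all of its years
    bysort ipccode (pre_q): replace pre_q = pre_q[1]

    * Check how treated and control sectors are spread across quartiles
    tabulate pre_q treated_sector if one_per_sector
    ```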

  • #2
    Lorenzo:
    welcome to this forum.
    Did you go -fe- or -re- with -xtreg-?
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hello and thank you.
      I used the xtset and then the xtdidregress commands. Also, my results do not coincide with those obtained with xtreg (I guess this is normal, but I don't understand the difference between the two).


      • #4
        Lorenzo:
        could you please share code and outcome tables of the two approaches? Thanks.
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          There are several questions here and I avoid most of them on the grounds that just about all economists on Statalist, and some others, know more about differences-in-differences than I do.

          But at considerable risk of seeming dogmatic I suggest that there are two good reasons for removing outliers:

          1. A value is just mistaken -- impossible or utterly implausible -- as can be shown or argued independently and as can't be fixed.

          2. Careful reflection you would be happy to explain in public implies that certain data points are not relevant to your project.

          and one very, very bad reason

          3. The outliers are too awkward to handle for the analysis you desire.

          An outcome that is a count that can be zero and is right-skewed indicates to me some kind of Poisson regression.
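
          A fixed-effects Poisson version of the difference-in-differences might be sketched as follows. This is only an illustration, assuming the variable names that appear in the output posted later in this thread (patent, ipccode, year, yeartreatment). Sectors with zero patents in every year drop out of the fixed-effects estimator:

          ```stata
          * Assumed names: patent (count outcome), ipccode (sector), year,
          * yeartreatment (= 1 for treated sectors in post-policy years)
          xtset ipccode year

          * Sector fixed effects absorb the treated-sector dummy;
          * year dummies absorb the post-policy dummy
          xtpoisson patent i.yeartreatment i.year, fe vce(robust)

          * Redisplay as incidence-rate ratios: IRR - 1 on yeartreatment
          * is the estimated proportional change in patents
          xtpoisson, irr
          ```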


          • #6
            Carlo Lazzaro, thank you for your answer.

            The two commands and their output are below (they refer to a subsample of my dataset).
            As you can see, the p-value for year*treatment (the interaction of the two dummies) changes depending on the command.


            >>> With the "official" panel dif-in-dif method:

            . xtset ipccode
            Panel variable: ipccode (balanced)

            . xtdidregress (patent) (year*treatment), group (ipccode) time (year)

            Number of groups and treatment time

            Time variable: year
            Control:       yeartreatment = 0
            Treatment:     yeartreatment = 1
            -----------------------------------
                         |   Control  Treatment
            -------------+---------------------
            Group        |
                 ipccode |      1587         38
            -------------+---------------------
            Time         |
                 Minimum |      2006       2015
                 Maximum |      2006       2015
            -----------------------------------

            Difference-in-differences regression                      Number of obs = 21,125
            Data type: Longitudinal

                                          (Std. err. adjusted for 1,625 clusters in ipccode)
            --------------------------------------------------------------------------------
                           |                 Robust
                    patent | Coefficient  std. err.        t   P>|t|    [95% conf. interval]
            ---------------+----------------------------------------------------------------
            ATET           |
            year*treatment |
                  (1 vs 0) |   -.1110498   .0372651    -2.98   0.003   -.1841425   -.0379572
            --------------------------------------------------------------------------------
            Note: ATET estimate adjusted for panel effects and time effects.


            >>> With the classic panel regression method:

            . xtreg patent dummyyear dummytreatment year*treatment

            Random-effects GLS regression                   Number of obs     =       21,125
            Group variable: ipccode                         Number of groups  =        1,625

            R-squared:                                      Obs per group:
                 Within  = 0.0059                                         min =           13
                 Between = 0.0006                                         avg =         13.0
                 Overall = 0.0046                                         max =           13

                                                            Wald chi2(3)      =       116.69
            corr(u_i, X) = 0 (assumed)                      Prob > chi2       =       0.0000

            --------------------------------------------------------------------------------
                    patent | Coefficient  Std. err.        z   P>|z|    [95% conf. interval]
            ---------------+----------------------------------------------------------------
                 dummyyear |     .097161   .0090331    10.76   0.000    .0794563    .1148656
            dummytreatment |   -.0193974   .0580371    -0.33   0.738    -.133148    .0943532
             yeartreatment |   -.1110498   .0590709    -1.88   0.060   -.2268267     .004727
                     _cons |    .1188126    .008875    13.39   0.000    .1014178    .1362073
            ---------------+----------------------------------------------------------------
                   sigma_u |   .29181783
                   sigma_e |   .59883424
                       rho |   .19190019   (fraction of variance due to u_i)
            --------------------------------------------------------------------------------



            Thank you in advance.


            • #7
              Lorenzo:
              the two estimators are quite different, so it is no wonder that you got different results.
              I would recommend that you stick with the estimator most frequently reported in the literature of your research field.
              Kind regards,
              Carlo
              (Stata 19.0)
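
              To make the difference concrete: in the output in #6 the point estimate is identical in both tables (-.1110498) and only the standard error changes (.0372651 versus .0590709). xtdidregress reports cluster-robust standard errors and adjusts for panel and time effects, whereas xtreg without options fits a random-effects model with conventional standard errors. A sketch of an xtreg call that should come close to the xtdidregress result, assuming the variable names shown in #6:

              ```stata
              * Two-way fixed effects: sector effects via -fe-, time effects via i.year,
              * with standard errors clustered by sector
              xtset ipccode year
              xtreg patent i.yeartreatment i.year, fe vce(cluster ipccode)
              ```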


              • #8
                Good afternoon,
                I am running a panel DiD regression.

                I have found a statistically significant estimate of the ATET and have failed to reject the null hypothesis of parallel trends in the pre-treatment period. Should I now run estat granger to test for possible anticipatory effects of the treatment? Also, if the Granger test does not allow me to reject the null hypothesis of no anticipatory effects, what further checks should I carry out?


                Thank you in advance
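
                For reference, the checks mentioned above are postestimation commands run after xtdidregress. A minimal sketch, assuming hypothetical variable names y (outcome), treat_post (treatment indicator), id (panel identifier) and year (time):

                ```stata
                * Hypothetical variable names: y, treat_post, id, year
                xtset id year
                xtdidregress (y) (treat_post), group(id) time(year)

                estat trendplots   // visual check of pre-treatment trends
                estat ptrends      // H0: linear trends are parallel before treatment
                estat granger      // H0: no anticipation effects before treatment
                ```

                Failing to reject the null in estat granger is consistent with no anticipation but does not prove it, so the trend plot is a useful companion to the two tests.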
