  • log transformation for outcome variable for difference in differences analysis

    Hello, I am a beginner in Stata and I am interested in the impact of restaurant closures on COVID-19 cases. That is why I log-transformed my outcome variable, daily confirmed cases, and used: reg log(daily_cases) treat post treat*post as a naïve difference-in-differences model. After the log transformation I had missing values, so I used log(daily_cases + 1) instead. With this I got robust and identical coefficients for treat_post in the naïve DiD model and in the panel data model with time and individual fixed effects. Is this procedure good enough or not? I ask because my original outcome variable (daily cases) is skewed to the left and divided into two parts, and so is even log_daily_cases. Attachment: log_cases.gph
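
    For readers following along, the two models described above could be sketched as below. This is only an illustration on simulated data, not the poster's actual code or data: the variable names region, day, log_cases1 and the simulated effect sizes are assumptions, and treat_post is built by hand as treat*post.

    ```stata
    * Illustrative sketch on simulated data; all names and numbers are hypothetical
    clear
    set seed 12345
    * 40 hypothetical regions, the first 20 of which close restaurants
    set obs 40
    generate region = _n
    generate treat = region <= 20
    * 30 days per region, with the closure in force after day 15
    expand 30
    bysort region: generate day = _n
    generate post = day > 15
    generate daily_cases = rpoisson(exp(1 + 0.5*treat - 0.3*treat*post))

    * The log(count + 1) outcome and the interaction term
    generate log_cases1 = ln(daily_cases + 1)
    generate treat_post = treat*post

    * Naive DiD: the coefficient on treat_post is the DiD estimate
    regress log_cases1 treat post treat_post, vce(cluster region)

    * Region and day fixed effects; treat and post are absorbed by them
    xtset region day
    xtreg log_cases1 treat_post i.day, fe vce(cluster region)
    ```

    In a balanced panel such as this simulated one, with a single common closure date, the treat_post coefficient should be the same in both regressions, which may explain why the two models agree.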

  • #2
    Do not log your data!


    • #3
      Daily cases will not be skewed to the left. The terminology for skewness may be perverse in some minds but it is standard.

      Daily cases surely have a lower limit of zero and a long tail of positive values.

      The presumption is that you are looking at something like a histogram with horizontal magnitude axis. Then which is the longer tail? If it's the right tail you have right skewness; if it is the left tail you have left skewness. If you can't tell, don't use either term, even informally. (One tail can be absent, in which case it has zero length.)

      This terminology will seem backward if you prefer to think about skewness in terms of where most of the probability lies, towards the left or towards the right of a probability distribution, but that's the terminology.

      In many ways it's much better to think in terms of positive and negative skewness. There are many measures of skewness, but just about any that you might encounter now yields a positive result with right skewness and a negative result with left skewness. There is small print to qualify that, but I stop there.
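
      A quick way to check the sign in Stata is sketched below. It assumes simulated Poisson counts in a hypothetical variable named cases, standing in for real daily case data; summarize with the detail option reports the moment-based skewness and leaves it in r(skewness).

      ```stata
      * Illustrative sketch; cases is a simulated stand-in for daily case counts
      clear
      set seed 2024
      set obs 1000
      generate cases = rpoisson(3)

      * The detail option reports skewness and stores it in r(skewness)
      summarize cases, detail
      display "skewness = " r(skewness)

      * Histogram with a horizontal magnitude axis: look for the longer tail
      histogram cases, discrete frequency
      ```

      Counts bounded below by zero with a long tail of larger values should give a positive value here, that is, right skewness.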

      this procedure is good enough or not
      Who will be the judge? If this is an assignment, ask locally. Using logarithms and then finding that zeros can't be included in the analysis is, I guess, wrong by anyone's standard. Otherwise log(count + 1) divides the world, some researchers finding it an acceptable fudge and others hating it on all sorts of grounds.
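
      Both points can be checked directly. The sketch below is an illustration on simulated data with hypothetical variable names, not a recommendation: it counts how many observations ln() turns into missing values, then refits the naive DiD with different constants added before logging. Sensitivity to that constant is one commonly raised objection to the fudge, offered here as an example rather than as a complete list of the grounds.

      ```stata
      * Illustrative sketch on simulated data; all names and numbers are hypothetical
      clear
      set seed 12345
      set obs 40
      generate region = _n
      generate treat = region <= 20
      expand 30
      bysort region: generate day = _n
      generate post = day > 15
      generate daily_cases = rpoisson(exp(1 + 0.5*treat - 0.3*treat*post))

      * ln(0) is missing in Stata, so zero-case days would drop out of a regression
      generate log_cases = ln(daily_cases)
      count if missing(log_cases)

      * Refit the naive DiD with different constants added before taking logs
      foreach c of numlist 0.1 1 10 {
          generate double y = ln(daily_cases + `c')
          quietly regress y i.treat##i.post, vce(cluster region)
          display "c = `c':  DiD coefficient = " _b[1.treat#1.post]
          drop y
      }
      ```

      If the displayed coefficients differ noticeably across the three constants, the estimate is partly a product of the choice of constant.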


      • #4
        These recent papers may be useful to consult:

        https://www.jonathandroth.com/assets...HOD0_Draft.pdf

        https://uwmadison.box.com/s/l9cyqepj...5wd8lah6uiyvs2
