  • Transforming to logs so that zero values of the dependent variable can be dealt with

    Transforming a variable to logs makes some values missing wherever the variable is zero for a county in a specific year. I'm running a synthetic difference-in-differences (SDID) regression, and SDID doesn't run if the dependent variable has missing observations. I'm giving a data sample and code below. Here, I'm transforming the work variable into ln_work, which is my dependent variable.

    Can anyone tell me how I can transform the variable so that the missing observations don't show up and I can successfully run SDID?

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float county int(year work) float(ln_work met)
    12003 2017 0 . 0
    12013 2008 0 . 0
    12013 2009 0 . 0
    12013 2010 0 . 0
    12013 2011 0 . 0
    12013 2012 0 . 0
    12013 2013 0 . 0
    12029 2019 0 . 0
    12037 2018 0 . 0
    12041 2001 0 . 0
    12041 2003 0 . 0
    12041 2007 0 . 1
    12041 2009 0 . 1
    12041 2010 0 . 1
    10001 2000  53  3.970292 0
    10001 2001  55 4.0073333 0
    10001 2002  59 4.0775375 0
    10001 2003  51 3.9318256 0
    10001 2004  53  3.970292 0
    10001 2005  40 3.6888795 0
    10001 2006  68 4.2195077 1
    10001 2007  58  4.060443 1
    10001 2008  45 3.8066626 1
    12003 2014   7   1.94591 0
    12003 2015   7   1.94591 0
    12003 2016  12  2.484907 0
    end


    Code:
    ssc install sdid 
    
    sdid ln_work county year met , vce(bootstrap) seed(100)

  • #2
    I apologize for not posting the problem properly. To run SDID one needs balanced panel data, which my posted sample does not represent. Apart from the lack of balance, the sample was all right.

    However, I'm taking the approach below to get around the log value = . issue, and SDID now runs successfully. Still, I'm not entirely sure this is the correct way to get around the problem, even after going through these very insightful threads.

    https://www.statalist.org/forums/for...g-missing-data

    https://www.statalist.org/forums/for...rmed-variables

    Code:
    replace ln_work=0 if ln_work==.
    Last edited by Tariq Abdullah; 04 Jan 2023, 16:18.
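
The two obstacles described in #1 and #2 (an unbalanced panel and missing values in the logged outcome) can be checked before calling sdid. The following is only a minimal sketch: it uses a handful of rows copied from the data example in #1, and n_years is a hypothetical helper variable, not something from the thread.

```stata
* a few rows taken from the data example in #1
clear
input float county int(year work)
10001 2000 53
10001 2001 55
10001 2002 59
12003 2014  7
12003 2015  7
12003 2016 12
12003 2017  0
end

* declare the panel; the xtset output states whether it is balanced
xtset county year

* n_years (hypothetical name): number of rows per county;
* a balanced panel has the same count for every county
bysort county: generate n_years = _N
tabulate n_years

* ln(0) is missing in Stata, so this counts the rows with a missing outcome
generate ln_work = ln(work)
count if missing(ln_work)
```

Equal counts per county are necessary for balance but not sufficient, since the counties must also share the same years; the xtset output is the more direct check.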



    • #3
      This is a common and sometimes contentious problem. I think the main argument should be substantive: what makes sense for your context as a data generation process. I can't comment there. I am not an economist and can't comment on DID analyses specifically.

      In some contexts the best solution is some kind of two-part model.
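
One way a two-part model could be set up is sketched here, assuming the variables work, ln_work, met and county from the data example in #1 are in memory. The variable any_work is a hypothetical name, and the single-covariate specification is purely illustrative; it is not the SDID design and not necessarily what would suit the full dataset.

```stata
* assumes work, ln_work, met and county from the data example in #1 are in memory

* part 1: is there any work at all in this county-year?
* any_work is a hypothetical indicator introduced for this sketch
generate byte any_work = work > 0 if !missing(work)
logit any_work i.met, vce(cluster county)

* part 2: logged outcome for the positive county-years only,
* so ln(0) never enters the estimation sample
regress ln_work i.met if work > 0, vce(cluster county)
```

The point of the split is that the zeros are modelled explicitly in the first part instead of being forced onto the log scale.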

      In some contexts a good solution is a Poisson regression or, more generally, a generalised linear model with logarithmic link. The key postulate is that the mean function is positive, which does not rule out some values being zero or even negative. That is not "get out of jail free", as you need to think that the main idea of exp(Xb) as a functional form is fair.
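
As a rough illustration of the Poisson route, the sketch below fits a fixed-effects Poisson model to the raw count, assuming work, met, county and year from the data example in #1 are in memory. The choice of xtpoisson with year indicators is an assumption made for this sketch; it is a different estimator from SDID and is not offered as a replacement for it.

```stata
* assumes work, met, county and year from the data example in #1 are in memory
xtset county year

* the outcome is the raw count, so zero county-years need no transformation;
* the mean is modelled as exp(Xb), which is the functional-form assumption
* that has to be defended
* note: with the fe option, counties whose outcome is zero in every year
* are dropped from estimation
xtpoisson work i.met i.year, fe vce(robust)
```

With only the few rows shown in #1 the estimates would mean little; the sketch is about the syntax and about which variable goes on the left-hand side.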

      Mapping x = 0 to log x = 0 is equivalent to treating x as 1. We can't tell how typical your one-county example is of your wider dataset, but here is what you're doing with that data, and here is a common solution, log(y + 1), which sometimes works well, although why c == 1 in log(y + c) still needs a lot of discussion.

      Code:
      twoway function log(x), ra(7 68) || scatteri 0 0, msize(large) ytitle(cond(y == 0, 0, log(y))) xtitle(y) legend(off) xla(0 7 68) yla(0 `=ln(7)' `=ln(68)', ang(h)) name(G1)
      twoway function log(x + 1), ra(7 68) || scatteri 0 0, msize(large) ytitle(log(1 + y)) xtitle(y) legend(off) xla(0 7 68) yla(0 `=ln(8)' `=ln(69)', ang(h)) name(G2)
      [Figure transf_G1.png: cond(y == 0, 0, log(y)) plotted against y]

      [Figure transf_G2.png: log(1 + y) plotted against y]

      That fudge may seem conservative, but the net result is a set of outliers. Whether that is right for your analysis brings us back to my start.
      Last edited by Nick Cox; 04 Jan 2023, 18:42.
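
The gap between the zeros and the smallest positive values discussed in #3 can also be seen numerically. This small sketch uses the values 0, 7, 12 and 68 that appear in the thread's data; the variable names are hypothetical, and c = 0.5 is an arbitrary alternative included only to show that the choice of c matters.

```stata
clear
input int work
0
7
12
68
end

* what replacing missing ln_work with 0 amounts to: zeros are treated as 1
generate ln_fudge = cond(work == 0, 0, ln(work))

* the common log(y + 1) variant
generate ln_plus1 = ln(work + 1)

* log(y + c) with a hypothetical c = 0.5, for comparison only
generate ln_plus_half = ln(work + 0.5)

list, noobs
```

In all three versions the zero rows sit well below the value for work == 7, and where exactly they land depends entirely on the constant chosen.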



      • #4
        Thanks so much, Mr. Cox, for putting your valuable thought into this. I've been blessed by the past few threads where you also gave much-needed direction to others facing the same problem. Thanks so much for your patience and kind consideration!
