  • Considering time differences in panel data

    Dear community,

    I want to regress var1 on var2 (and later var1 on var3).
    For my dependent variable (var1) I have observations in 1980 and 2007
    (the same holds for var3).

    For var2 I have observations in various years (1973-2018, but with many gaps).

    For the regression, I want to make sure that even if var2 is observed only
    in 1982 (not in 1980), that value is still paired with var1 from 1980.
    Does that make sense?

    So if, and only if, there is no observation for var2 in 1980 or in 2007,
    I want to retrieve the var2 observation from the year nearest to 1980 or to 2007,
    and also take that year difference into account. Is that possible?

    Otherwise I might miss out on a lot of explanatory observations in years
    that do not coincide with 1980 and 2007, the years of my dependent variable.
    id  year  var1  var2  var3
    1   1980  yes   no    yes
    1   1988  no    yes   no
    1   2007  yes   no    yes
    1   2009  no    yes   no
    2   1973  no    yes   no
    2   1980  yes   yes   yes
    2   1999  no    yes   no
    2   2007  yes   no    yes
    2   2008  no    yes   no

    I have started with:
    bysort country: egen minyear = min(year)
    bys country: gen distance_minyear = abs(1980-minyear)

    I did the same for the maximum year, but I do not know how to proceed from there. Thanks in advance! I appreciate your help!

  • #2
    I'm not sure I got you right, but Stata's regression commands use listwise deletion. Therefore, any observation with a missing value in any of the model's variables will not be included in the regression.
    That said, making up data is always risky, as you may end up with a sample that barely resembles the original one.
    Kind regards,
    (Stata 16.0 SE)


    • #3
      I'm not sure it's a good idea to nudge the variables up or down the time scale. While I don't fully know the variables' potential causal pathway, I'm afraid you may end up nudging an "effect" upward to predict the past "cause." Perhaps consider multiple imputation for the missing values?


      • #4
        My professor told me that it is okay to do so in my analysis; however, I could not figure out how to implement it.
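
One possible way to implement the nearest-year matching asked about in the first post is sketched below. It is only a sketch under assumptions: the data are in long format with the variables `id`, `year`, `var1` and `var2` from the example table (the first post's own code uses `country` instead of `id`), `var1` and `var2` are numeric with missing values (`.`) where the table says "no", and each `id`-`year` pair appears once. The names `var2_near`, `gap` and `dist` are hypothetical and introduced only for illustration.

```stata
* Sketch only: assumes long data with id, year, var1, var2 (numeric, "." = not observed)
* var2_near, gap and dist are hypothetical names, not from the original posts
gen var2_near = .
gen gap = .

foreach t in 1980 2007 {
    * Distance to the target year, defined only where var2 is observed
    gen dist = abs(year - `t') if !missing(var2)

    * Within id, sort so the closest observed var2 comes first
    * (missing distances sort last; ties go to the earlier year)
    bysort id (dist year): replace var2_near = var2[1] if year == `t'
    bysort id (dist year): replace gap = dist[1] if year == `t'

    drop dist
}

* Use the borrowed value and control for how many years away it was taken
regress var1 var2_near gap if inlist(year, 1980, 2007)
```

With the example table, this would pair id 1's 1980 row with var2 from 1988 (`gap` = 8) and its 2007 row with var2 from 2009 (`gap` = 2), while id 2's 1980 row keeps its own var2 (`gap` = 0). Including `gap` as a regressor is just one way to "take the year difference into account"; restricting the sample instead, for example with `if gap <= 5`, is another, and the cutoff of 5 is arbitrary. Because the match ignores whether the borrowed year lies before or after the target year, the time-ordering concern raised in #3 still applies.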