  • How to fix "repeated time values within panel"

    Now I have data like:
    hhid geo
    1 1
    1 1
    2 2
    2 2
    3 3
    3 3

    When I run "xtset hhid geo", I get the error "repeated time values within panel".
    I know this is because hhid is repeated, but I don't want to use "duplicates drop hhid geo, force" because I would lose half of the data. hhid means "household id", so even when the hhid values are the same, the rows represent different persons.
    In this case, how can I fix this error without deleting any data?
    I'm looking forward to your help, thank you!

  • #2
    So, I assume that, despite the name, geo is some kind of time variable or repetition/replication counter. You state that the duplicates appear because the observations refer to different individuals within the same household. So you need to have another variable that distinguishes the different individuals within the households. Let's call that variable pid. Then
    Code:
    * pid is a placeholder name for a variable identifying persons within a household
    egen long person = group(hhid pid)
    xtset person geo
    This will group your observations at the person level, rather than the household level and there will be no repeated values of geo within person if your data are as you described them in #1.
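
    As a minimal sketch of the person-level setup above, here is the toy data from #1 with a stand-in pid. The way pid is built here is purely an assumption for illustration: it numbers the rows within each hhid-geo cell, which only makes sense if each duplicate row is a different person. With real data you would use the actual person identifier instead.

```stata
clear
input hhid geo
1 1
1 1
2 2
2 2
3 3
3 3
end

* Assumption: rows sharing the same hhid and geo are different persons,
* so a running count within each hhid-geo cell can stand in for pid.
bysort hhid geo: gen pid = _n

egen long person = group(hhid pid)

* isid exits with an error if person and geo do not uniquely identify the rows
isid person geo
xtset person geo
```

    If -isid- fails on your real data, some person still has more than one observation at the same value of geo, and -xtset person geo- would fail for the same reason.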

    Now, if you need the grouping of your observations to be at the household level, then you have a different problem and it is simply not possible to -xtset- the data with household as the panel and include a time variable. But, remember that -xtset- does not require you to specify a time variable. You can just -xtset hhid- and Stata will not care about anything happening within the hhid groups. What do you lose by not specifying the time variable? You lose the ability to use lag and lead and other time-series operators, and certain analyses that rely on autoregressive correlation structure. But often you have no need for those things anyway.
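
    A small sketch of the panel-only approach, using simulated data. The variables y and x and all the numbers below are made up for illustration; only hhid comes from #1.

```stata
clear
set seed 12345
set obs 6

* Households 1, 2, 3 with two members each, as in #1
gen hhid = ceil(_n/2)

* Hypothetical regressor and outcome, simulated for illustration only
gen x = rnormal()
gen y = 1 + 0.5*x + hhid + rnormal()

* Panel variable only, no time variable: repeated rows within hhid are allowed
xtset hhid

* Household fixed-effects regression
xtreg y x, fe
```

    Because no time variable is set, time-series operators such as L.x are not available after this -xtset-.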

    Finally, I will just note that it sounds like what you really have is three-level data, with repeated observations (geo) nested within persons nested within households. In that case, your analyses would probably be better if they reflected that full hierarchical structure. That implies using the -me- commands rather than the -xt- commands, and if you go that route, there is no need to -xtset- the data at all.
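
    To illustrate the three-level idea, here is a sketch with simulated data and -mixed-, the linear member of the -me- family. Everything in it (the sample sizes, pid, x, y, and the random-effect values) is assumed for the example and is not taken from #1.

```stata
clear
set seed 2024

* 50 hypothetical households, each with its own random effect
set obs 50
gen hhid = _n
gen u_hh = rnormal()

* Two persons per household, each with a person-level random effect
expand 2
bysort hhid: gen pid = _n
gen u_p = rnormal()

* Three repeated observations (geo) per person
expand 3
bysort hhid pid: gen geo = _n

* Hypothetical regressor and outcome
gen x = rnormal()
gen y = 1 + 0.5*x + u_hh + u_p + rnormal()

* Random intercepts for households and for persons nested within households;
* no -xtset- is needed
mixed y x || hhid: || pid:
```

    The order of the equations after || expresses the nesting, so pid only needs to be unique within each household here.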

    • #3
      In reply to Clyde Schechter (#2):
      Thank you for your reply! It makes sense. I think just using "xtset hhid" is the best approach, because I need to run "xtreg Y X, fe" to estimate household (hhid) fixed effects. If I set the panel at the person level, that command may not give me the household fixed effects.
