  • Repeated time values within panel

    Hi All,

    I have a multi-dimensional panel dataset with four identifying variables: retailer, home country, host country, and year (2001-2015). I am trying to declare the dataset as panel data so that I can run xtgls, but I keep getting the "repeated time values within panel" error message when I use the xtset command. I looked through the forum and came across a few possible solutions, but none of them worked. From what I understand, I need to reduce my panel from four dimensions to two, so I tried egen pan_id = group(retailer hostcountry homecountry) followed by xtset pan_id year, but I still get the same error message. Any suggestions would be greatly appreciated.

    Thanks!

  • #2
    Well, the problem is not with your code, which looks completely correct for the situation, but with either your data or your understanding of the data.

    I have never known Stata to be wrong when it says that there are repeated time values within panel. So we should take it as a given that they are there. At the top level there are two possibilities:

    1. You think that your data is supposed to be four-dimensional (retailer, home country, host country, and year), but it really has five or more dimensions. In that case you need to either get a new data set that conforms to your expectations, or revise your expectations and go with this one.

    2. Your data set is supposed to be four-dimensional, in accord with your expectations, but it contains errors. To find them:

    Code:
    duplicates tag retailer hostcountry homecountry year, gen(flag)
    sort retailer hostcountry homecountry year
    browse if flag
    You will see where the repeated time values within panel are coming from. Then you will have to decide how to fix this. Some observations, perhaps all, may be pure duplicates on all variables, and these can then just be dropped with no loss of information. But you may find that some of the surplus observations disagree on the values of other variables. In that case you will have to figure out how to reconcile those conflicts, either by picking one observation as correct or by replacing the group with a single observation that aggregates the disagreeing values into means, medians, max, min, first, last, or some other scheme, whatever is most appropriate to your situation.
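
    As a rough sketch of the two repair routes just described (dropping pure duplicates, then aggregating whatever still conflicts), something like the following could work. The toy data and the sales variable are made up for illustration, and taking the mean is only one of the possible reconciliation schemes; whether it is appropriate depends on your data.

```stata
* Toy data, invented for illustration only (sales is a hypothetical variable)
clear
input str1 retailer str2 hostcountry str2 homecountry int year double sales
"A" "US" "UK" 2001 10
"A" "US" "UK" 2001 10
"A" "US" "UK" 2002 12
"A" "US" "UK" 2002 14
"B" "FR" "DE" 2001 7
end

* Step 1: drop observations that are identical on every variable
duplicates drop

* Step 2: see whether surplus observations remain on the panel key
duplicates report retailer hostcountry homecountry year

* Step 3: aggregate the conflicting observations into one (mean, as an assumption)
collapse (mean) sales, by(retailer hostcountry homecountry year)

* The key should now identify observations uniquely; isid errors out if not
isid retailer hostcountry homecountry year
egen pan_id = group(retailer hostcountry homecountry)
xtset pan_id year
```

    Running isid before xtset is a cheap way to confirm that the four variables really do identify each observation; if it fails, xtset will fail too.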



    • #3
      Thank you! I actually just figured out what the problem was. Store format is also one of the dimensions, so I had to change my code to account for it, and then it worked.
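
      For anyone who lands here with the same error, a minimal sketch of that kind of change is below. The exact code is not shown in this thread, so this is only an assumption about what it could look like: the variable name storeformat and the toy data are hypothetical, and the idea is simply to include the extra dimension in the group() call.

```stata
* Toy data, invented for illustration; storeformat is a hypothetical variable name
clear
input str1 retailer str2 hostcountry str2 homecountry str10 storeformat int year
"A" "US" "UK" "hyper" 2001
"A" "US" "UK" "conv"  2001
"A" "US" "UK" "hyper" 2002
"A" "US" "UK" "conv"  2002
end

* Include the fifth dimension in the panel identifier
egen pan_id = group(retailer hostcountry homecountry storeformat)

* Each panel now has at most one observation per year
isid pan_id year
xtset pan_id year
```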
