
  • Panel variables repeated time values

    Dear all,

    I am using panel data and I would like to know how to solve this issue:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year byte(id exp1_code_inds imp1_code_inds) str1 headq_link
    2011 1 40 30 "b"
    2011 1 50 20 "a"
    2011 1 50 30 "a"
    2011 1 40 20 "b"
    2011 1 40 40 "b"
    2011 1 50 40 "a"
    2012 1 50 20 "a"
    2012 1 40 30 "b"
    2012 1 50 40 "a"
    2012 1 50 30 "a"
    2012 1 40 40 "b"
    2012 1 40 20 "b"
    2013 1 40 20 "b"
    2013 1 50 30 "a"
    2013 1 40 40 "b"
    2013 1 50 20 "a"
    2013 1 50 40 "a"
    2013 1 40 30 "b"
    2011 2 30 20 "a"
    2011 2 30 40 "a"
    2011 2 40 30 "b"
    2011 2 40 20 "b"
    2011 2 40 40 "b"
    2011 2 30 30 "a"
    2012 2 30 30 "a"
    2012 2 40 40 "b"
    2012 2 40 20 "b"
    2012 2 30 40 "a"
    2012 2 40 30 "b"
    2012 2 30 20 "a"
    2013 2 40 30 "b"
    2013 2 40 20 "b"
    2013 2 30 40 "a"
    2013 2 30 30 "a"
    2013 2 30 20 "a"
    2013 2 40 40 "b"
    end
    format %ty year
    I need to fit a probit model, for which I have to -xtset- my panel variable.

    I use this code:
    Code:
     xtset id year, yearly
    but Stata gives me
    Code:
    .  xtset id year, yearly
    repeated time values within panel
    r(451);
    
    end of do-file
    
    r(451);
    I think I know the cause: id-year combinations are repeated because of the exp1_code_inds and imp1_code_inds variables. But I do not know how to solve this problem. Could you please help?

    Thank you

  • #2
    Well, the first question is whether you need to solve this problem at all. If you are not going to use time-series operators such as lags and leads, or model autoregressive structure, then you can just -xtset id- with no mention of the time variable and everything will be fine.

    If, however, you will need to use lags and leads or autoregressive structure, bear in mind that those things do not exist when there are multiple observations for the same panel at the same time. It is not just that Stata refuses to let you do them, it is that they are mathematically undefinable in that circumstance. So the only "fix" for this is to redefine your panel to be more specific so that there is only one observation per year per panel. The way that suggests itself here is:
    Code:
    egen long panel = group(id exp1_code_inds imp1_code_inds)
    xtset panel year
    That works with your example. The only issue now is whether this is a sensible panel structure in the context of your problem. If it is not, then you need to rethink your research goals, because you cannot do time-series operations/autoregressive modeling in data that does not have a true panel structure.
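
    A minimal sketch of how one might check this before calling -xtset-, assuming a shortened subset of the example data (two sector pairs per firm, two years, without headq_link). -isid- exits with an error when the listed variables do not uniquely identify the observations, so it is wrapped in -capture- for the id-year check:

```stata
* Shortened subset of the example data (an assumption made for brevity)
clear
input int year byte(id exp1_code_inds imp1_code_inds)
2011 1 40 30
2011 1 50 20
2012 1 40 30
2012 1 50 20
2011 2 30 20
2011 2 40 30
2012 2 30 20
2012 2 40 30
end

* id and year alone do not identify observations, so this returns a nonzero code
capture isid id year
display "isid id year return code: " _rc

* The finer grouping has one observation per panel per year
egen long panel = group(id exp1_code_inds imp1_code_inds)
isid panel year
xtset panel year
```

    If -isid panel year- runs silently, -xtset panel year- will not complain about repeated time values.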



    • #3
      Dear Clyde, Thank you for your answer.

      I actually tried using only -xtset id- without mentioning the time variable. Indeed, I will not use leads and lags. The unit of observation in my analysis is a firm in France that exports to sector 1 that imports from sector 2 at time t. The exporting sector, in which the firm operates, could import goods and services from several different sectors.

      If this is the case, do I still need to ignore the time variable?



      • #4
        Yes, you still need to omit the time variable from the -xtset- command. You can only include it there if the combination of id and time uniquely identifies observations.

        But what do you mean by "ignore the time variable?" Although you must omit it from the -xtset- command, you can still use it in all other normal ways. You can include c.time or i.time in your regression commands if that is appropriate.
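
        A rough sketch of that pattern, using simulated data because the posted example has no outcome variable. The names y, x, u and pair are hypothetical, and the data-generating process is an assumption made only so the commands run. The time variable is left out of -xtset- but still enters the model as i.year:

```stata
* Simulated data: 50 firms, 3 years, 4 sector pairs per firm-year (all hypothetical)
clear
set seed 2023
set obs 50
generate id = _n
* Firm-level random effect, constant within id
generate u = rnormal()
expand 3
bysort id: generate year = 2010 + _n
expand 4
bysort id year: generate pair = _n
generate x = rnormal()
generate byte y = (0.5*x + u + rnormal()) > 0

* Declare the panel variable only; id-year is repeated, so no time variable here
xtset id

* year is still available as an ordinary covariate
xtprobit y x i.year
```

        Whether i.year or c.year is appropriate depends on the research question; the sketch only shows that omitting year from -xtset- does not remove it from the model.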



        • #5
          Yes, that is exactly what I meant: omitting it from -xtset-.

          But what is the difference between:
          egen long panel = group(id exp1_code_inds imp1_code_inds)
          xtset panel

          and

          xtset id

          When I use "xtset id", Stata gives me an error (panels are not nested within clusters).

          So I guess the right way to go is

          egen long panel = group(id exp1_code_inds imp1_code_inds)
          xtset panel

          I should add that I am clustering the standard errors at the level of the exp1_code_inds-imp1_code_inds pair, using -egen xm = group(exp1_code_inds imp1_code_inds)-.
          Thank you, Clyde
          Last edited by Jade Li; 19 Apr 2023, 02:35.



          • #6
            But what is the difference between:
            egen long panel = group(id exp1_code_inds imp1_code_inds)
            xtset panel

            and

            xtset id
            The panel variable constructed by -egen- splits up the ids into multiple panels: each combination of exp1_code_inds and imp1_code_inds and a company id is treated as a separate panel. By contrast -xtset id- will treat all of the observations of the same company id, regardless of the industry codes, as a single panel.

            If you were not clustering your standard errors, you would not have encountered an error message with -xtset id-. But since you are clustering on pairs of exp1_code_inds and imp1_code_inds, you cannot use -xtset id- here. That's because when you fit any -xt- regression model, the clusters must be sets of entire panels--you cannot have a single panel split over different clusters. If you -xtset id-, each id is a panel, and because each id is associated with many different combinations of the industry codes, each panel is split over many clusters--and that is not allowable. By contrast, with the approach using -egen long panel = group(id exp1_code_inds imp1_code_inds)-, no panel is split across different clusters because each panel defined in this way contains only observations with a single combination of the industry codes.
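
            The nesting condition can be checked directly. The following is a sketch on a shortened subset of the example data (an assumption for brevity); panel_split and id_split are hypothetical helper variables that flag any group whose observations fall in more than one cluster:

```stata
* Shortened subset of the example data (an assumption made for brevity)
clear
input int year byte(id exp1_code_inds imp1_code_inds)
2011 1 40 30
2011 1 50 20
2012 1 40 30
2012 1 50 20
2011 2 30 20
2011 2 40 30
2012 2 30 20
2012 2 40 30
end

egen long panel = group(id exp1_code_inds imp1_code_inds)
egen long xm = group(exp1_code_inds imp1_code_inds)

* A group is nested within clusters if xm is constant inside it;
* after sorting on xm within group, compare the first and last values
bysort panel (xm): generate byte panel_split = xm[1] != xm[_N]
bysort id (xm): generate byte id_split = xm[1] != xm[_N]

tabulate panel_split
tabulate id_split
```

            In this subset every id spans two sector pairs, so id_split flags every observation while panel_split flags none, which mirrors why clustering on xm is compatible with -xtset panel- but not with -xtset id-.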



            • #7
              Thank you, Clyde! Very informative.

