  • Does interacting time dummy and individual dummy lead to overfitting?

    Dear Statalists,

    I have a question about whether interacting time dummies (time fixed effects) with individual dummies (individual fixed effects) leads to overfitting. I remember seeing this method somewhere, though I cannot recall exactly where. However, introducing such interactions makes the number of independent variables (parameters to be estimated) explode, and it may even exceed the number of observations. This leaves too few degrees of freedom and results in overfitting. So may I ask whether interacting time dummies with individual dummies is a feasible method?

    I have tried estimating a model with such interactions using -xtreg y x1 x2 i.time i.time#i.i, fe- (and, alternatively, creating the interactions manually) and get a within R-square equal to 1.000 and all p-values equal to 0.000. The perfect R-square may be clear evidence of overfitting, let alone the huge number of independent variables.

    When I instead estimate the model by LSDV with -reg y x1 x2 i.time i.i i.time#i.i-, the regression table shows only R-square = 1.000 and the coefficient of each variable; all other statistics (std. err., t, p, and conf. interval) are missing (blank).

    I think that introducing the interaction between the time dummy and the individual dummy is wrong practice, but I am not completely sure whether I am right.

    Thank you very much
    Last edited by Alex Mai; 27 Aug 2018, 08:24.

  • #2
    Yes, this would lead to overfitting if you have only one observation for each combination of time dummy and individual dummy. Then, for every observation, there is one variable (the interaction dummy) that equals 1 in that observation and 0 in all others. As a result, the coefficient on that interaction dummy, in your example i.time#i.i, is not an estimate at all; it is determined exactly by a single equation.
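
    A minimal sketch of this on simulated data, in case it helps. The variable names (id, time, x1, y), the panel dimensions, and the seed are assumptions made up for illustration; they are not from the original poster's data.

    ```stata
    * Hypothetical simulated panel: 10 units x 8 periods, one observation per cell
    clear
    set seed 12345
    set obs 10
    generate id = _n
    expand 8
    bysort id: generate time = _n
    generate x1 = rnormal()
    generate y = 1 + 0.5*x1 + rnormal()

    * One indicator per id-time cell, i.e. as many indicators as observations
    regress y x1 i.id#i.time

    * Compare the sample size with what is left over after fitting
    display "observations: " e(N)
    display "residual df:  " e(df_r)
    display "R-squared:    " e(r2)
    ```

    With one observation per cell, the cell indicators alone reproduce y exactly, so one would expect no residual degrees of freedom, an R-squared of 1, x1 dropped as collinear, and missing standard errors, which matches what was reported for the LSDV regression.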


    • #3
      Thanks a lot. Actually, another concern is that a time dummy is supposed to have a common impact on all individuals (if I am not mistaken), so interacting the time dummy with the individual dummy perhaps does not make much sense.
      Last edited by Alex Mai; 27 Aug 2018, 12:00.


      • #4
        There are certainly situations where you want to let parameters vary by time and in such cases interacting a time variable with other variables makes sense. But as Max pointed out, if you only have one observation per person per year and interact i.time with i.persons, you're fitting at least one parameter for each observation which is crazy. [If your panel variable is persons, then xtreg fe makes it even worse.] This isn't overfitting in the normal sense (where you try stuff out to find something that fits), this is simply meaningless.
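
        A rough sketch of the distinction, on simulated data. Everything here (the names id, time, x1, y, the 10 x 8 panel, the seed) is hypothetical and only meant to illustrate the point, not to reproduce anyone's actual model.

        ```stata
        * Hypothetical simulated panel: 10 persons x 8 periods, one observation per cell
        clear
        set seed 2018
        set obs 10
        generate id = _n
        expand 8
        bysort id: generate time = _n
        generate x1 = rnormal()
        generate y = 1 + 0.5*x1 + rnormal()
        xtset id time

        * Sensible: person fixed effects, period effects, and a slope on x1
        * that is allowed to differ by period
        xtreg y c.x1##i.time, fe

        * Not sensible here: count the person-period cells that i.id#i.time
        * would need and compare that count with the sample size
        egen cell = group(id time)
        quietly summarize cell
        display "person-period cells: " r(max) "   observations: " _N
        ```

        The first model spends a modest number of parameters on time-varying slopes. The cell count shows why the person-by-period interaction is different: it asks for as many cells as there are observations.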


        • #5
          Originally posted by Phil Bromiley
          There are certainly situations where you want to let parameters vary by time and in such cases interacting a time variable with other variables makes sense. But as Max pointed out, if you only have one observation per person per year and interact i.time with i.persons, you're fitting at least one parameter for each observation which is crazy. [If your panel variable is persons, then xtreg fe makes it even worse.] This isn't overfitting in the normal sense (where you try stuff out to find something that fits), this is simply meaningless.
          Dear Phil,

          Thank you very much. The idea of interacting the time dummy with the individual dummy (in my case, country) actually comes from a comment I received. The comment was roughly this: control for country- and time-specific changes over the years (e.g. changes in a country's sectoral composition) by adding interaction effects between country and year to the model.

          My interpretation of this comment is that I should add interaction terms between the country dummy and the year dummy (as I described at the beginning of this thread). But this is apparently problematic and meaningless. So do you think that my interpretation of the comment is correct?

          Many thanks again!
          Last edited by Alex Mai; 28 Aug 2018, 13:11.


          • #6
            I'll jump in here, if nobody minds. I am no more telepathic than you or Phil Bromiley or the next person, but my experience suggests that you have probably interpreted the commenter's intent correctly. It would not be the first time that a reviewer has suggested something that is impossible, or absurd, or counter-productive. Adding an interaction between time and country is only possible if you have multiple observations of each country in each year (in which case you would not have been able to -xtset country year-). And even if possible, probably is only sensible if you have many such observations. I think the commenter either did not properly understand the structure of your data, or just didn't think through his or her comments.

            I would simply respond to the commenter clearly and politely explaining why your data would not permit following the suggestion.
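
            For what it is worth, here is one way to check the data structure before replying to the commenter. This is only a sketch on simulated data; the variable names country and year and the panel dimensions are assumptions for illustration.

            ```stata
            * Hypothetical simulated panel: 10 countries x 8 years, one observation per cell
            clear
            set obs 10
            generate country = _n
            expand 8
            bysort country: generate year = 2000 + _n

            * isid exits with an error unless country-year uniquely identifies observations
            isid country year
            duplicates report country year
            xtset country year

            * Now duplicate every observation, so each country-year appears twice.
            * xtset with a time variable should then refuse the data.
            expand 2
            capture noisily xtset country year
            display "xtset return code: " _rc
            ```

            If -isid- passes and -xtset country year- succeeds, there is exactly one observation per country-year, and a country-by-year interaction would leave nothing to estimate. A nonzero return code in the last step reflects the repeated time values within panel.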


            • #7
              Originally posted by Clyde Schechter
              I'll jump in here, if nobody minds. I am no more telepathic than you or Phil Bromiley or the next person, but my experience suggests that you have probably interpreted the commenter's intent correctly. It would not be the first time that a reviewer has suggested something that is impossible, or absurd, or counter-productive. Adding an interaction between time and country is only possible if you have multiple observations of each country in each year (in which case you would not have been able to -xtset country year-). And even if possible, probably is only sensible if you have many such observations. I think the commenter either did not properly understand the structure of your data, or just didn't think through his or her comments.

              I would simply respond to the commenter clearly and politely explaining why your data would not permit following the suggestion.
              Dear Clyde,

              Thank you very much for your suggestion! It's very helpful for me. This problem has really baffled me.
