  • Number of obs changes when interaction is added using xtreg, fe

    Hello.

    I have an unbalanced panel and use a fixed effects estimator with robust standard errors (vce(robust)). I have a main variable of interest (l.x1) which I interact with a number of moderators (l.x2, x3, and x4). All variables are continuous. I noticed that as long as I use only two interactions, i.e.

    Code:
    xtreg y c.l.x1##c.l.x2 c.l.x1##c.x3, fe vce(robust)
    the estimation runs on the full sample (n=438).

    However, as soon as I add one more interaction, i.e.

    Code:
    xtreg y c.l.x1##c.l.x2 c.l.x1##c.x3 c.l.x1##c.x4, fe vce(robust)
    the estimation runs on n=316. I know this is not due to missing values. All variables are available for the full sample.

    I'm using Stata 15, and when I run the estimation I do NOT receive an error message. Of course in the output I'm being told that l.x1 is omitted because of collinearity which I know happens since I'm interacting it with a number of different moderators.

    Thank you for any help.

    Best,
    Annette

  • #2
    I think that anyone who wants to try to answer this will need to see the actual output Stata gave you for these regressions. Some of the things you say don't make sense. For example, it is impossible that Stata runs the first regression on the full sample. This is because l.anything will always be undefined in the first observation of each panel, so you must lose at least one observation per panel in any regression involving the lag operator.
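
    A minimal sketch of the lag-operator point, using a made-up two-panel dataset (the names id, year, x1, and lag_x1 are illustrative and not taken from the original data):

    ```stata
    * Toy panel: 3 years for panel 1, 2 years for panel 2 (made-up values).
    clear
    input id year x1
    1 2000 5
    1 2001 6
    1 2002 7
    2 2000 1
    2 2001 2
    end

    * Declare the panel structure so the lag operator knows what "previous" means.
    xtset id year

    * L.x1 is missing in the first observation of each panel,
    * because there is no earlier period to take the value from.
    generate lag_x1 = L.x1
    list, sepby(id)

    * Count the observations that any model using L.x1 would have to drop.
    count if missing(lag_x1)
    ```

    In this toy example the count is 2, one per panel, so any regression that uses L.x1 must run on fewer observations than the full dataset.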

    Also, while you may not have gotten any error messages, Stata may have given you other diagnostic messages (warnings that do not cause Stata to halt) embedded in the iteration log of the output. Usually when Stata omits observations for reasons other than -if-, -in- or missing values, it tells you it is doing so and says why.

    So I suggest you carefully review the complete output of these regressions and see if you find an answer there. If not, then please post those complete outputs (N.B. exactly as they are and complete; don't edit or omit anything!) here. To ensure readability, please post the outputs within code delimiters.

    Finally, use the -dataex- command to show an example of your data that exhibits this same strange behavior. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
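
    As a rough illustration, a -dataex- call for this thread might look like the following. The variable names y and x1-x4 come from the question; firm and year are assumed names for the panel and time identifiers and should be replaced with the real ones.

    ```stata
    * Install only if dataex is not already part of the installation.
    * ssc install dataex

    * Export the first 20 observations of the relevant variables as
    * copy-and-paste-ready input code (firm and year are assumed names).
    dataex firm year y x1 x2 x3 x4 in 1/20
    ```

    Choosing observations that actually show the problem (for example, a firm with gaps in its series) is more useful than simply taking the first rows.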





    • #3
      Thank you, Clyde, for taking the time to respond. I was able to resolve the problem after many hours of trial and error and comparing the observations Stata included in each regression. Essentially, by lagging one of my variables, some "incorrect" values were created since my panel is unbalanced. More specifically, for my main variable of interest, I had observations for 2000, 2001, and 2004, while the unit (firm, in my case) is in my dataset from 2000-2004 since firm financials are available annually. When I lagged my main variable of interest, I inadvertently created an observation for 2002 (carrying the 2001 value), but it should not have been included since "other" variables (that are related to my main variable of interest) are not available for 2002. Once I included some of these "other" variables, the regression used the correct number of observations; hence the drop in observations.
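
      One way to compare the observations included in each regression is to flag each estimation sample with e(sample). The sketch below assumes the data are already xtset and that the variables are named as in the original question; the flag names in_model1 and in_model2 are made up for illustration. Run it as a do-file so the local macros persist.

      ```stata
      * Model with two interactions; flag the observations it used.
      xtreg y c.l.x1##c.l.x2 c.l.x1##c.x3, fe vce(robust)
      generate byte in_model1 = e(sample)

      * Model with the third interaction; flag its observations as well.
      xtreg y c.l.x1##c.l.x2 c.l.x1##c.x3 c.l.x1##c.x4, fe vce(robust)
      generate byte in_model2 = e(sample)

      * Cross-tabulate the two estimation samples.
      tabulate in_model1 in_model2

      * Recover the panel and time variable names from the xtset settings.
      quietly xtset
      local panelvar `r(panelvar)'
      local timevar  `r(timevar)'

      * List the observations used by the first model but not by the second.
      list `panelvar' `timevar' y x1 x3 x4 if in_model1 & !in_model2, sepby(`panelvar')
      ```

      Any missing values in the listed rows point to the variable responsible for the smaller sample.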

      Essentially, I'd like to caution others whenever they use lagged variables with an unbalanced panel to be careful and pay close attention to what is happening behind the scenes.
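
      A minimal sketch of what happens behind the scenes, using made-up data for a single firm (the values and the name lag_x1 are illustrative; x4 stands in for one of the "other" variables):

      ```stata
      * One firm present in every year 2000-2004, but x1 and x4 are
      * observed only in 2000, 2001, and 2004 (made-up values).
      clear
      input firm year x1 x4
      1 2000 10 1
      1 2001 11 2
      1 2002 .  .
      1 2003 .  .
      1 2004 14 5
      end
      xtset firm year

      * The row for 2002 exists, so L.x1 in 2002 picks up the 2001 value
      * even though nothing else was observed in 2002.
      generate lag_x1 = L.x1
      list, sepby(firm)

      * Observations usable by a model that needs only L.x1 ...
      count if !missing(lag_x1)

      * ... versus a model that also needs the unlagged x4.
      count if !missing(lag_x1, x4)
      ```

      Here the first count is 2 (years 2001 and 2002) and the second is 1 (year 2001 only): adding a variable that is missing in 2002 removes the inadvertently created observation.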



      • #4
        "Essentially, I'd like to caution others whenever they use lagged variables with an unbalanced panel to be careful and pay close attention to what is happening behind the scenes."
        A very good point, indeed.

        And thank you for closing the thread.

