  • Fixed-effects regression: isolating the effect of change.

    I have data in long format and want to investigate the effect of a change in exposure (e.g. cigarettes/week) upon a change in a dependent variable (say lung capacity).

    One thing that seems sensible to do in order to isolate the effect of a longitudinal change in smoking upon a longitudinal change in lung capacity would be to adjust for lung capacity at baseline.

    If I were to investigate the effect of changes in smoking behaviour using a fixed-effects regression, would this adjustment still be necessary?

    For example, if someone had a baseline lung capacity of 6 litres, their baseline value would be coded in a variable as 6@t0 and 6@t1. In this case the variable would be omitted as it is time-invariant over the course of the study, like sex.

  • #2
    As you yourself noticed, in a fixed-effects regression you cannot incorporate a baseline variable. And even if you could, there is no real reason to do that. A fixed-effects regression is precisely a within regression. It estimates exactly what you are looking for: the association of within-person changes in exposure to within-person changes in outcome. So you can just use the fixed-effects model and forget about using the baseline value as a covariate.
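A minimal sketch of the within transformation described above, in plain Python. The six records and their column layout are hypothetical and not taken from the thread. It shows that a baseline value repeated at every wave demeans to zero, which is why it drops out of the model, and that the slope is computed only from within-person deviations.

```python
from collections import defaultdict

# Hypothetical long-format records (illustrative numbers only):
# (person, wave, cigs_per_week, lung_capacity_l, baseline_capacity_l)
rows = [
    (1, 0, 40, 6.0, 6.0),
    (1, 1, 20, 6.2, 6.0),
    (2, 0, 10, 5.0, 5.0),
    (2, 1, 30, 4.8, 5.0),
    (3, 0, 25, 5.5, 5.5),
    (3, 1, 25, 5.5, 5.5),
]

def demean_within(records, col):
    """Subtract each person's own mean from column `col`."""
    values = defaultdict(list)
    for rec in records:
        values[rec[0]].append(rec[col])
    means = {pid: sum(v) / len(v) for pid, v in values.items()}
    return [rec[col] - means[rec[0]] for rec in records]

x_within = demean_within(rows, 2)         # cigarettes/week
y_within = demean_within(rows, 3)         # lung capacity
baseline_within = demean_within(rows, 4)  # baseline repeated at each wave

# Within (fixed-effects) slope: OLS through the origin on demeaned data.
slope = (sum(x * y for x, y in zip(x_within, y_within))
         / sum(x * x for x in x_within))

print(baseline_within)  # all zeros: no within-person variation left
print(slope)            # about -0.01 litres per cigarette/week in this toy data
```

Person 3 does not change smoking, so that person contributes nothing to the slope; only people whose exposure changes inform the estimate.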

    • #3
      Clyde is, of course, correct so far. However, a within regression framework cannot tell you about the direction of the effect you want to estimate. So you cannot know whether the change in smoking caused the change in lung capacity or the other way round. A baseline would not change that, but a lagged outcome, i.e. a dynamic panel model, could (theoretically). If you have more than two measures per individual you might want to consider xtabond or related estimators.

      Best
      Daniel

      • #4
        Originally posted by daniel klein
        Clyde is, of course, correct so far. However, a within regression framework cannot tell you about the direction of the effect you want to estimate. So you cannot know whether the change in smoking caused the change in lung capacity or the other way round. A baseline would not change that, but a lagged outcome, i.e. a dynamic panel model, could (theoretically). If you have more than two measures per individual you might want to consider xtabond or related estimators.

        Best
        Daniel

        Bugger.

        Am I right to think that this is because the change in both exposure and outcome are each occurring over the same period of time?

        So no use if I were to derive some lagged categorical exposure variables, such as 1 "no change [ref]" 2 "decrease of 1-3 cigs/day" 3 "decrease >3 cigs/day"?

        I've taken a quick look through the Stata manual for -xtabond- and its use of lagged dependent variables. Could I not achieve a similar thing by creating a lagged variable coded something along the lines of 6@t0 and 0@t1? Probably a very stupid question.
        Last edited by Craig Knott; 02 Feb 2017, 01:53.

        • #5
          Originally posted by Craig Knott
          [...]
          Am I right to think that this is because the change in both exposure and outcome are each occurring over the same period of time?
          Basically yes. It is the same problem as with cross-sectional data. The only thing that the within transformation does is remove unobserved confounders that are constant across time points or, more generally, constant within panel units. What you get is essentially still a correlation. Whether a simple mean is an appropriate model for the unobservables within panel units can also be questioned, but I do not want to go there.
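To make the point about time-constant confounders concrete, here is a small sketch in plain Python. Everything in it is an assumption made for illustration: the three people, the person-specific term `u`, and the "true" slope of -0.5 are invented, and the data are noise-free. A pooled regression that ignores `u` gets the sign wrong, while the within estimator recovers the slope. Nothing here addresses reverse causality; the within estimate remains an association.

```python
from collections import defaultdict

TRUE_SLOPE = -0.5  # assumed effect of exposure on outcome (hypothetical)

# Hypothetical panel: person -> (time-constant trait u, exposure at t0 and t1).
# u is correlated with exposure across people, so it confounds pooled OLS.
people = {"A": (0.0, (1.0, 2.0)),
          "B": (2.0, (3.0, 4.0)),
          "C": (4.0, (5.0, 6.0))}

# Long-format records: (person, exposure, outcome), outcome = slope*x + u.
rows = [(pid, x, TRUE_SLOPE * x + u)
        for pid, (u, xs) in people.items() for x in xs]

def ols_slope(xs, ys):
    """Slope of a simple regression of ys on xs (with intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def demean_within(records, col):
    """Subtract each person's own mean from column `col`."""
    values = defaultdict(list)
    for rec in records:
        values[rec[0]].append(rec[col])
    means = {pid: sum(v) / len(v) for pid, v in values.items()}
    return [rec[col] - means[rec[0]] for rec in records]

pooled = ols_slope([r[1] for r in rows], [r[2] for r in rows])
within = ols_slope(demean_within(rows, 1), demean_within(rows, 2))

print(pooled)  # positive: contaminated by the time-constant trait u
print(within)  # close to -0.5: u has been differenced out
```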

          Originally posted by Craig Knott
          So no use if I were to derive some lagged categorical exposure variables, such as 1 "no change [ref]" 2 "decrease of 1-3 cigs/day" 3 "decrease >3 cigs/day"?
          The change is basically captured in the model, so there is no need to explicitly create a variable for it. I also do not see any reason to introduce a categorical measure when you have more precise information. Anyway, the idea of using lagged predictors goes in the direction of Granger causality (although you would include lagged outcomes there, too), I think. While intuitively appealing, by using lagged predictors you are essentially shifting the endogeneity problem backwards, but you are not eliminating it. I think you could identify a causal effect with this strategy if you were willing to assume that the unobservables are not dynamic in nature.

          In general, I think it is worth thinking about the assumptions that we are making when we specify our models. It is usually deemed useful to compare the results we get from models that make different assumptions. However, at some point we will have to accept that there is no gold standard to get causal effects from observational data.

          Best
          Daniel
          Last edited by daniel klein; 02 Feb 2017, 02:05.

          • #6
            Final question for the moment:

            Seeing as how the fixed-effects regression would report the difference in the change in lung function among smokers who changed their smoking behaviour over time, relative to the change in lung function among those that did not change their smoking behaviour (a bit of a mouthful), would I be pushing it to see the resulting model as a difference-in-differences analysis?

            • #7
              I do not think you are pushing it. The within regression framework with additional indicator variables for period resembles a diff-in-diff approach.

              Best
              Daniel
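The resemblance can be checked numerically in the simplest case. The sketch below assumes a balanced two-wave panel and a binary exposure (`quit`), both hypothetical simplifications of the continuous cigarettes/week measure discussed in the thread. It compares the classic difference-in-differences of mean changes with the slope from a within regression that also removes period means (two-way demeaning, valid here because the panel is balanced).

```python
from collections import defaultdict

# Hypothetical balanced two-wave panel: (person, wave, quit, lung_capacity_l).
# quit = 1 only at wave 1 for people who stopped smoking between waves.
rows = [
    (1, 0, 0, 5.0), (1, 1, 1, 5.4),
    (2, 0, 0, 4.0), (2, 1, 1, 4.2),
    (3, 0, 0, 6.0), (3, 1, 0, 5.9),
    (4, 0, 0, 5.5), (4, 1, 0, 5.2),
]

def mean(values):
    return sum(values) / len(values)

def group_means(records, key, col):
    """Mean of column `col` within each level of column `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[col])
    return {k: mean(v) for k, v in groups.items()}

def two_way_demean(records, col):
    """x_it - mean_i - mean_t + grand mean (balanced panels only)."""
    by_person = group_means(records, 0, col)
    by_wave = group_means(records, 1, col)
    grand = mean([rec[col] for rec in records])
    return [rec[col] - by_person[rec[0]] - by_wave[rec[1]] + grand
            for rec in records]

# Within regression with period effects removed.
x_dd = two_way_demean(rows, 2)
y_dd = two_way_demean(rows, 3)
fe_slope = sum(x * y for x, y in zip(x_dd, y_dd)) / sum(x * x for x in x_dd)

# Classic difference-in-differences from each person's change between waves.
wave0 = {r[0]: r[3] for r in rows if r[1] == 0}
wave1 = {r[0]: r[3] for r in rows if r[1] == 1}
quitters = {r[0] for r in rows if r[2] == 1}
change = {pid: wave1[pid] - wave0[pid] for pid in wave0}
did = (mean([change[p] for p in change if p in quitters])
       - mean([change[p] for p in change if p not in quitters]))

print(fe_slope, did)  # the two estimates agree up to floating-point error
```

With a continuous exposure or more than two waves, the two-way within estimate is no longer a simple contrast of group means, which is consistent with describing the relationship as a resemblance rather than an identity.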
