
  • Problem with interpreting incremental effects in an OLS regression

    I hope you can help me interpret the coefficients of my OLS model:

    Suppose we have a dependent variable Y and an independent variable X.

    X consists of three parts, X = X1 + X2 + X3. I run the following Stata command:

    Code:
     reg Y X X2 X3
    This should give me the coefficient for X1 and the incremental effects for X2 and X3.
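
    The reasoning: if Y = b1*X1 + b2*X2 + b3*X3 + e and X = X1 + X2 + X3, then substituting X1 = X - X2 - X3 gives Y = b1*X + (b2 - b1)*X2 + (b3 - b1)*X3 + e. A minimal simulated check follows; the seed, sample size, and the coefficient values 1, 1.5, and 3 are arbitrary assumptions, not real data.

    ```stata
    * Simulated data; all numbers below are illustrative assumptions
    clear
    set seed 12345
    set obs 1000
    generate X1 = rnormal()
    generate X2 = rnormal()
    generate X3 = rnormal()
    generate X  = X1 + X2 + X3
    generate Y  = 1*X1 + 1.5*X2 + 3*X3 + rnormal()

    * The coefficient on X estimates the X1 effect;
    * the coefficients on X2 and X3 are increments over it
    regress Y X X2 X3

    * Recover the total (not incremental) effects of X2 and X3
    lincom X + X2
    lincom X + X3
    ```

    Under these assumptions the two -lincom- results should land near the assumed 1.5 and 3.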

    Now suppose I introduce an additional variable X4 that is also part of X but is already contained in each of X's split variables X1, X2, and X3 (so X = (X1 - a*X4) + (X2 - b*X4) + (X3 - c*X4) + X4, where a + b + c = 1).
    So I run:

    Code:
     reg Y X X2 X3 X4
    And now I'm somewhat lost: how can I interpret the coefficient of X4? As far as I know, this controls for X4, meaning that X, X2, and X3 are now "free" of the influence of X4 (i.e. X4 is held constant). But how can I interpret the magnitude of the coefficient? How do I get the actual (not incremental) effect of X4 on Y?

    I would be very grateful for any literature suggestions on this kind of problem!
    Last edited by Harald Leber; 18 Dec 2015, 08:40.

  • #2
    To the extent I understand what you're looking for, I don't think it exists.

    In any regression model, the coefficient of any variable always represents an effect that is contingent on the other variables in the model. If you were to just regress Y on X4 you would get what, in epidemiology, we call the crude effect of X4 on Y. But because X4 is known to also be associated with X1, X2, and X3, and those in turn are associated with Y, this crude effect would be, for most purposes, inappropriate. Instead, one would want, for most purposes, to look at the effect of X4 as estimated in a model that included X1 through X3 (as well as any other variables known to be associated with both X and X4). I realize that I am being somewhat vague here about "most purposes." Without knowing what your variables are and what your research question is, it is really impossible to be more specific.
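
    The difference between the crude and the adjusted effect can be sketched with simulated data. Everything here is an assumption made for illustration: the seed, the sample size, the 0.5 link between X4 and X1, and the true coefficients 2 and 1.

    ```stata
    * Simulated illustration; variable names follow the thread, numbers are assumptions
    clear
    set seed 2015
    set obs 1000
    generate X4 = rnormal()
    * X1 is associated with X4
    generate X1 = 0.5*X4 + rnormal()
    generate Y  = 2*X1 + 1*X4 + rnormal()

    * Crude effect: also picks up X4's association with X1
    regress Y X4

    * Effect of X4 contingent on X1
    regress Y X4 X1
    ```

    In this setup the crude coefficient absorbs part of X1's effect (in expectation 1 + 2*0.5 = 2), whereas the adjusted coefficient targets the assumed value of 1.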

    In reality, there are often other variables, which we do not have information about, which are associated with X and X4, and their absence from the model results in omitted variable bias (also known as confounding). If X4 results from a randomized assignment, then this problem goes away, but otherwise it remains an issue, and is at the root of various approaches to attempting to identify causal effects from non-randomized data.

    If you want to say something more about your variables and the research question, some more specific advice might be available.



    • #3
      Thanks for your help Clyde!

      Let me give you some details on the research design:

      Firms report two kinds of performance measures: earnings (E) and pro-forma earnings (PE).
      Pro-forma earnings usually exclude some cost items, so usually PE > E.
      My dependent variable is the difference PE - E = EDIFF, and I want to use regressions to measure to what extent the items that make up E (CFO, ACC, ...) are excluded from PE (and thus end up in EDIFF).

      So my first regression is something like:
      Code:
      reg EDIFF E
      Then follows (where CFO + ACC = E)
      Code:
      reg EDIFF CFO ACC
      Then follows (where CFO + ACC1 + ACC2 = E and ACC1 + ACC2 = ACC)
      Code:
      reg EDIFF CFO ACC1 ACC2
      So far, all variables are clean splits (e.g. ACC1 + ACC2 = ACC)

      Then I add a further variable for "special items" (= SPE) that is fully included in CFO, ACC1, and ACC2. However, I don't know to what extent these variables contain SPE, so I run:
      Code:
      reg EDIFF CFO ACC1 ACC2 SPE
      I would interpret the coefficients on CFO, ACC1 and ACC2 as the average proportion of these variables that firms exclude from E to measure PE holding SPE (which is included to some extent in CFO ACC1 and ACC2) constant. The coefficient on SPE on the other hand reflects some incremental effect I have difficulties to interpret...

      Does this make sense to you?



      • #4
        I would interpret the coefficients on CFO, ACC1 and ACC2 as the average proportion of these variables that firms exclude from E to measure PE holding SPE (which is included to some extent in CFO ACC1 and ACC2) constant. The coefficient on SPE on the other hand reflects some incremental effect I have difficulties to interpret...
        I don't quite agree with that. In your first three regressions, that would be true. But in the fourth, those interpretations of the coefficients of CFO, ACC1 and ACC2 no longer apply, because they share variance with SPE, so some of their variance will be "sucked up" by the inclusion of SPE. The coefficient of SPE has no simple interpretation in this model either. And the coefficients will not sum to 1 (unless they do so by coincidence).

        I think if you want to know the contribution of SPE to EDIFF you can just run -reg EDIFF SPE-. You might also want to regress each of CFO, ACC1, and ACC2 on SPE to see how much of each of those is typically accounted for by SPE. But that's all I can see doing with these variables.
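
        Those regressions could be run along these lines, assuming the dataset with the thread's variables (EDIFF, SPE, CFO, ACC1, ACC2) is already in memory:

        ```stata
        * Contribution of SPE to EDIFF on its own
        regress EDIFF SPE

        * How much of each component is typically accounted for by SPE
        foreach v of varlist CFO ACC1 ACC2 {
            regress `v' SPE
        }
        ```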



        • #5
          Thank you Clyde! I have an additional question about the same setting. As mentioned above, I want to use OLS regressions to capture the extent to which a firm excludes CFO and/or ACC (see the 2nd equation in my last post) from E to arrive at PE. In other words, I want to figure out what drives the difference between PE and E (EDIFF). However, I cannot actually observe the composition of EDIFF directly.

          If I run this regression
          Code:
          reg EDIFF CFO ACC
          now on a firm-by-firm basis (so by subject, using a time series of observations), can I use the coefficient on ACC as a measure of how much of ACC a firm excludes from PE (i.e., how much of ACC ends up in EDIFF)? Or is there a better way to create a firm-specific measure of how much of ACC and CFO a firm excludes from E to arrive at PE?
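
          Mechanically, the firm-by-firm version might look like the sketch below. It only shows how to collect the coefficients, not whether they are a valid measure. The identifier firm_id and the output file firm_coefs are hypothetical names, and the data are assumed to be a firm-year panel.

          ```stata
          * firm_id and firm_coefs are hypothetical names, not from the thread
          * Runs the regression separately for each firm and saves one row per firm
          statsby b_CFO=_b[CFO] b_ACC=_b[ACC] n_obs=e(N), by(firm_id) saving(firm_coefs, replace): regress EDIFF CFO ACC

          * Inspect the firm-specific coefficients
          use firm_coefs, clear
          summarize b_CFO b_ACC n_obs
          ```

          Firms with very short time series will give imprecise estimates, so n_obs is kept alongside the coefficients for screening.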

