
  • Is it an issue to use relative values (percentages) in panel regression?

    Hello all,

    I recently attended an interesting statistics seminar where the speaker said that percentages should not be used in panel regression (for example, one should use the level of a CPI index rather than the inflation rate in percent). Is this true, and if so, what are its limits?
    In my upcoming project, I was planning to use percentages both as the dependent variable and as some of the explanatory variables: for example, the share of certain taxes in a tax mix, the unemployment rate, the interest rate, the GDP growth rate... So you can see why this would be quite a problem for me.

    Thank you to anyone giving me their opinion.

    Edit: I seem to recall that the point was NOT to use incremental values (as would be the case for inflation). That would presumably mean that proportions referring only to the current year (such as the share of a certain tax's revenue in the tax mix) are fine? But what about the other variables?
    Last edited by Jonathan Quimby; 05 May 2024, 13:48.

  • #2
    Issues arise if percentage outcomes really are constrained to [0, 100], which is the same issue as whether linear probability models are a good idea for outcomes in [0, 1].
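
A minimal sketch of the bounded-outcome issue, using made-up numbers (the data, the `ols_fit` helper and the choice of a logit transform are assumptions for illustration, not something taken from the thread). A straight line fitted to a percentage that is approaching 100 will predict impossible values once extrapolated, whereas a fit on the logit scale stays inside the bounds by construction:

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1 / (1 + math.exp(-z))

# Hypothetical data: a percentage outcome approaching its upper bound.
x = [1, 2, 3, 4, 5]
y_pct = [60.0, 75.0, 86.0, 93.0, 97.0]

# Plain linear fit on the percentage scale, extrapolated to x = 8.
a, b = ols_fit(x, y_pct)
linear_pred = a + b * 8

# Same extrapolation after a logit transform of the outcome.
a_l, b_l = ols_fit(x, [logit(v / 100) for v in y_pct])
logit_pred = 100 * inv_logit(a_l + b_l * 8)

print(f"linear prediction at x=8: {linear_pred:.1f}%")
print(f"logit-scale prediction at x=8: {logit_pred:.1f}%")
```

Note that the logit transform only works for values strictly between 0 and 100, so exact zeros or hundreds in the data would need separate handling.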

    They also arise if there are several outcomes which must add to 100 or 1. The standard term is compositional data.
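
To make the compositional point concrete, here is a small sketch with invented tax-mix shares (the category names and numbers are hypothetical). Because the shares sum to 100, any one share is an exact function of the others, so they cannot all enter a model with an intercept. The additive log-ratio transform shown at the end is one common way of handling such data, offered here as an option rather than a recommendation from the thread:

```python
import math

# Hypothetical tax-mix shares (percent) for three country-years.
# Each row sums to 100 by construction.
tax_mix = [
    {"income": 45.0, "consumption": 35.0, "property": 20.0},
    {"income": 50.0, "consumption": 30.0, "property": 20.0},
    {"income": 40.0, "consumption": 45.0, "property": 15.0},
]

def implied_share(row, omitted):
    """Recover the omitted share from the remaining ones."""
    return 100.0 - sum(v for k, v in row.items() if k != omitted)

def additive_log_ratio(row, reference):
    """Log of each share relative to a reference share (needs all shares > 0)."""
    return {k: math.log(v / row[reference]) for k, v in row.items() if k != reference}

for row in tax_mix:
    # The "property" share carries no information beyond the other two.
    print(row["property"], implied_share(row, "property"))
    print(additive_log_ratio(row, "property"))
```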

    But the focus in #1 would seem to be on % change, where values can be negative, zero or positive with occasional outliers.



    • #3
      Thank you for your answer, Nick.

      Do I interpret what you are saying correctly if I say that:
      1) using the proportion of a single tax in a tax mix is fine, as long as I don't include more of them (since together they would add up to 100%)?
      2) using the interest rate, GDP growth rate and inflation (as %) is fine, since they are not bounded by 100%?
      3) using the unemployment rate is not fine, as it is bounded by 100%?

      On a related note, wouldn't some relationships be lost if we use these percentage changes? The relationship between consecutive time points would be unchanged, but longer-term relationships could no longer be observed if we use the changes (%) instead of the absolute levels. Or am I wrong in assuming this?
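
The arithmetic behind this question can be sketched as follows, with invented index values (the series and both helper functions are assumptions for illustration). Two series that differ only in level produce identical percentage changes, so the level itself is not recoverable from the changes alone; it can be rebuilt by compounding only if a base value is kept:

```python
def pct_change(levels):
    """Period-on-period percentage change of a series of levels."""
    return [100.0 * (curr / prev - 1.0) for prev, curr in zip(levels, levels[1:])]

def rebuild_levels(base, changes):
    """Compound percentage changes forward from a known base level."""
    levels = [base]
    for c in changes:
        levels.append(levels[-1] * (1.0 + c / 100.0))
    return levels

# Hypothetical price index, and a second index at twice the level.
cpi_a = [100.0, 102.0, 105.06, 103.0]
cpi_b = [2.0 * v for v in cpi_a]

changes_a = pct_change(cpi_a)
changes_b = pct_change(cpi_b)

print(changes_a)                          # changes can be positive or negative
print(changes_b)                          # same changes despite the different level
print(rebuild_levels(100.0, changes_a))   # levels return once the base is supplied
```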



      • #4
        1) Yes, I guess, if that is a predictor.

        2) I guess you're an economist and know something about the distributions that are typical for these variables. Runaway inflation might be hard to square with plain or vanilla regression. That is, skewness and outliers might indirectly make fits difficult. It depends on the data.

        3) Unemployment rate is bounded above by 100% but the bound doesn't really bite in most datasets I've heard of. See also just above.

        Ultimately the only check here is to run quite different models and check whether they give similar results. If so, choose the simplest model.
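
One way to sketch that kind of check, under stated assumptions: the unemployment figures below are invented, `ols_fit` is a hypothetical helper, and the logit-scale fit is just one example of a "quite different" model. The idea is to fit both specifications and look at how far apart their fitted values are within the observed range:

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical unemployment rates (percent), far from the 0 and 100 bounds.
x = [0, 1, 2, 3, 4, 5]
unemp = [4.1, 4.9, 6.2, 6.8, 8.1, 8.9]

# Model A: linear in the raw percentage.
a, b = ols_fit(x, unemp)
fitted_linear = [a + b * xi for xi in x]

# Model B: linear on the logit scale, mapped back to percent.
logits = [math.log((u / 100) / (1 - u / 100)) for u in unemp]
a_l, b_l = ols_fit(x, logits)
fitted_logit = [100 / (1 + math.exp(-(a_l + b_l * xi))) for xi in x]

# Largest disagreement between the two models, in percentage points.
max_gap = max(abs(p - q) for p, q in zip(fitted_linear, fitted_logit))
print(f"largest gap between fitted values: {max_gap:.2f} percentage points")
```

If the gap is small relative to the variation in the data, as it tends to be when the bound does not bite, the simpler linear specification is the natural choice.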

