  • What is the purpose of 'demeaning' or 'centering' panel data?

    Dear all,

    My work is based on the paper by Jordà, Schularick and Taylor (2016), "Sovereigns versus Banks: Credit, Crises, and Consequences", which is available at
    https://academic.oup.com/jeea/articl...4/1/45/2319810

    In the paper, the authors state that they compute the treatment terms and the controls as levels relative to their means.

    I know that demeaning means subtracting the sample mean from each observation so that the variable has mean zero. The corresponding Stata command is
    Code:
    center
    What I don't understand is its purpose. When should it be done, and what is it used for?

    Thank you very much in advance,
    Silvia

  • #2
    To center a variable means different things in different contexts, and the purpose differs accordingly. When you have panel data, say different banks observed over time, it typically means that the mean of bank A is subtracted from each annual observation of bank A, and the mean of bank B is subtracted from each annual observation of bank B. Under certain circumstances a linear regression with these variables will be a fixed effects model, i.e. a model that shows the effect of a variable while holding constant all observed and unobserved time-constant characteristics of each bank. Sometimes that is a nice and helpful trick, but if you are uncertain, then probably the safest way to estimate a fixed effects regression is to use xtreg with the fe option.
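The bank example above can be sketched numerically. The following is a minimal Python illustration, not Stata code and not the method of the paper: the toy panel, the variable names and the helper functions `ols_slope` and `demean_by_group` are all made up for this purpose. It demeans x and y within each bank and compares the resulting slope with the pooled slope.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit that includes an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's own mean from that group's observations."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical toy panel: two banks, three years each.
# By construction y = (bank-specific level) + 2 * x, with no noise.
bank = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
level = {"A": 50.0, "B": 0.0}
y = [level[b] + 2.0 * xi for b, xi in zip(bank, x)]

# Pooled OLS ignores the bank-specific levels.
pooled = ols_slope(x, y)

# OLS on the within-bank demeaned variables (the "within" transformation).
within = ols_slope(demean_by_group(x, bank), demean_by_group(y, bank))

print("pooled slope:", pooled)
print("within slope:", within)
```

In this constructed example the pooled slope is negative because bank A has a high level but low x, while the within slope recovers the value 2 used to build the data. Only the point estimate carries over this way: a plain regression on demeaned variables does not adjust the degrees of freedom for the estimated bank means, which is one reason to prefer xtreg with the fe option.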
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Maarten addresses one important use of centering: essentially estimating fixed effects. Let me note a few others. Often, researchers will subtract an industry value (often an annual average) from the firm value to try to compensate for inter-industry differences. When the scale of a variable is not naturally interpretable, some will normalize it (center it and rescale it to standard deviation 1) to facilitate interpretation. In some areas, researchers center values to reduce collinearity, especially when including interactions. I am less confident of the benefit from this kind of centering (see Belsley, Kuh and Welsch, Regression Diagnostics).

      The article cited appears to use different means for centering in different periods. I can't speculate on whether this makes sense in this context.
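As a rough illustration of the collinearity point in this post, the Python sketch below compares the correlation between a variable and its interaction term before and after mean-centering. The data and the helper names `pearson` and `center` are invented for the example and are not taken from the cited article.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def center(values):
    """Subtract the overall sample mean from every observation."""
    m = sum(values) / len(values)
    return [v - m for v in values]


# Hypothetical data: two regressors whose values lie far from zero.
x = [10.0, 11.0, 12.0, 13.0, 14.0]
z = [5.0, 7.0, 6.0, 9.0, 8.0]

# Correlation between x and the raw interaction term x*z.
raw_corr = pearson(x, [a * b for a, b in zip(x, z)])

# Same correlation after mean-centering both variables first.
xc, zc = center(x), center(z)
centered_corr = pearson(xc, [a * b for a, b in zip(xc, zc)])

print(round(raw_corr, 3), round(centered_corr, 3))
```

With these numbers the correlation drops from about 0.92 to about -0.19 in absolute terms. The centered and uncentered interaction models are reparameterizations of each other, so the fitted values and the interaction coefficient are unchanged, which fits the caution expressed above about how much this kind of centering really buys.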



      • #4
        Thank you Maarten and Phil for your answers, which were very helpful to me!

        I am trying to estimate the following fixed-effects panel regression (from the above-mentioned paper, page 63):

        [Image: regression.png, the regression equation from page 63 of the paper]

        In their text, the authors say:

        [Image: text.png, excerpt from the authors' text]


        What I don't understand is why the centering of Y and X has an influence on the intercept terms.



        • #5
          Imagine that your variable is equal to its mean, so that x_i-x_bar is 0. The coefficient of the demeaned variable is then multiplied by 0, and the only term left is the intercept. Hence the individual intercept (I think in your case alpha_i) gives the (expected) outcome for unit i when its characteristics are equal to the mean.
          Sorry for not linking it more closely to the provided estimation, but it's not the prettiest representation the authors could have written down IMHO.
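The argument in this post can be written out for a simplified one-regressor case. The notation below is an assumption made for illustration (unit i, period t, a single centered regressor, and an error term with conditional mean zero); it is not the specification from the paper, which is only available here as an image.

```latex
% Assumed notation, not taken from the paper:
% \bar{x} is the mean used for centering, E[\varepsilon_{it} | x_{it}] = 0.
y_{it} = \alpha_i + \beta\,(x_{it} - \bar{x}) + \varepsilon_{it}
\quad\Longrightarrow\quad
\mathbb{E}\left[\, y_{it} \mid x_{it} = \bar{x} \,\right] = \alpha_i .

% The same model written with an uncentered regressor and intercept a_i:
y_{it} = a_i + \beta\,x_{it} + \varepsilon_{it},
\qquad \alpha_i = a_i + \beta\,\bar{x} .
```

Under these assumptions centering leaves the slope unchanged and only shifts the intercept by beta times x_bar, so that alpha_i is read as the expected outcome at the mean rather than at x = 0.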



          • #6
            The intercept is the predicted value for your dependent/explained/left-hand-side/y-variable when all independent/explanatory/right-hand-side/x-variables are 0. So making sure the value 0 for the x-variables is some meaningful value within the range of the data is often helpful.

            However, I would only use the mean as a last resort, as there are usually more meaningful values that are stable across datasets. Say you have occupational status as the variable that you want to center; the "middle" will typically be some skilled manual labor job, so why not center at the occupational status of a watchmaker? It is much easier to communicate to your audience "This is the expected y for a male, 40-year-old watchmaker" than "This is the expected y for a person of average gender, age, and occupation". This example also shows that the mean isn't useful for all types of variables.
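To make the role of the centering value concrete, here is a small Python sketch with made-up data. The age values, the exact relationship y = 10 + 0.5 * age and the helper `ols_fit` are assumptions for illustration only. It fits the same one-regressor model three times: with raw age, with age centered at its sample mean, and with age centered at the reference value 40.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data generated without noise from y = 10 + 0.5 * age.
age = [25.0, 30.0, 40.0, 55.0, 60.0]
y = [10.0 + 0.5 * a for a in age]

# Raw age: the intercept is the prediction at age 0, far outside the data.
raw_b0, raw_b1 = ols_fit(age, y)

# Age centered at the sample mean: the intercept is the prediction at mean age.
mean_age = sum(age) / len(age)
mean_b0, mean_b1 = ols_fit([a - mean_age for a in age], y)

# Age centered at a chosen reference value: the intercept is the prediction at 40.
ref_b0, ref_b1 = ols_fit([a - 40.0 for a in age], y)

print("raw:        ", raw_b0, raw_b1)
print("mean-center:", mean_b0, mean_b1)
print("center at 40:", ref_b0, ref_b1)
```

The slope is the same in all three fits; only the intercept moves, and it always equals the predicted y at whatever value has been mapped to zero. A fixed reference such as 40 keeps that interpretation stable across datasets, whereas the sample mean changes from one sample to the next.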
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------



            • #7
              To Felix:

              Thank you very much for your response! Even though you didn't link it to the provided estimation, it already helped a lot. I think I can now follow what the authors describe. Thank you very much!



              • #8
                To Maarten: Thank you very much for your response as well! I may reconsider whether centering at the mean really makes sense here.
