
  • Interpreting OLS regression output with PCA as the dependent variable

    Hi There

    My question is similar to this post which had no suitable answer: http://stats.stackexchange.com/quest...n-a-regression

    I have created a principal component of municipal service delivery indicators and am wanting to use this as the dependent variable in my OLS regression. I then have numerous independent variables, mostly continuous or categorical.

    Could someone please explain how I go about interpreting the beta coefficients on these independent variables in relation to a dependent variable that is a principal component?
    That is, my dependent variable is the service delivery principal component and one independent variable is the percentage of youth in a municipality. How do I then interpret the coefficient on the %youth variable in relation to the dependent variable?

    Thank you!




  • #2
    Presumably you are using the first principal component from a PCA.

    I've got to say that if the first PC doesn't have a clear substantive interpretation it is not evident why this is a good idea at all.

    The plot I programmed as eofplot (SSC) is standard in some fields (but not all) as sometimes helping a little to see what PCs "mean".

    Similarly, plots of the PCs against the original variables often help too. See e.g. crossplot (SSC) which makes it easy (for example) to plot PC1 individually versus the original variables.

    Once you have a feeling for what the PC means -- or perhaps better what it does -- there is no special reasoning that doesn't apply in regression generally.
    Last edited by Nick Cox; 10 Nov 2016, 03:01.



    • #3
      Hi Nick

      Thank you for your response.

      To clarify, I have an understanding of what the PC itself means. My question was around the next step, using the first component of a PCA as the dependent variable in an OLS regression (and then subsequently a fixed effects regression). How do I interpret the betas produced by OLS for the independent variables (continuous or categorical variables) in relation to the dependent principle component? Normally, this is something along the lines of "if x increases by one unit, Y increases by beta units" where the interpretation changes according to whether the variables are logged or leveled. So just wanting to understand how that interpretation changes when the dependent variable is neither a log or level variable, but a principle component?

      b). How does one figure out how much variation in the dependent variable each control variable explains? Is there a way to tease this out from OLS regression output?

      Thank you!



      • #4
        Sorry, but I don't understand what you're asking that's different. A principal [NB] component is just a new variable scaled in a certain way. The regression doesn't know where it comes from and won't treat it in any special way.



        • #5
          Stephanie:
          as far as your second question is concerned, please see -estat esize-.
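As a rough illustration of what an effect size per predictor measures, here is a minimal Python sketch that computes partial eta-squared by hand: the share of the variation left unexplained by the other predictors that a given predictor accounts for. It is not a reimplementation of -estat esize-, and whether that command reports exactly this quantity for each term should be checked against its documentation. The data are simulated, and the names `pc1`, `x1` and `x2` are hypothetical.

```python
import random


def ols_sse(columns, y):
    """Residual sum of squares from OLS of y on the given columns plus a constant."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


# Simulated data (an assumption for illustration): x1 matters much more than x2.
random.seed(2)
x1 = [random.gauss(0, 1) for _ in range(300)]
x2 = [random.gauss(0, 1) for _ in range(300)]
pc1 = [0.5 * a + 0.05 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

predictors = {"x1": x1, "x2": x2}
sse_full = ols_sse(list(predictors.values()), pc1)

eta_sq = {}
for name in predictors:
    others = [col for other, col in predictors.items() if other != name]
    sse_without = ols_sse(others, pc1)
    # Partial eta-squared: SS due to this term / (SS due to this term + residual SS).
    eta_sq[name] = (sse_without - sse_full) / sse_without

for name, value in eta_sq.items():
    print(name, round(value, 3))
```

Nothing in this calculation depends on the response being a principal component; the same arithmetic applies to any continuous dependent variable.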

          PS: crossed in cyberspace with Nick's reply, which addresses the substantive part of Stephanie's query.
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Nick Cox - my question focuses on the independent variables, and how I interpret their coefficients (produced by the OLS regression) in relation to the dependent principle component. I understand that I have to ascribe meaning to the principle component, that's fine. But I don't know how the beta coefficient on the independent variables are interpreted - so if I have 0.53X1, does that mean that 'one unit increase in X1 results in 0.53 unit increase in the dependent, principal component variable' or does that meaning change when you have a component as the dependent variable and not a single level or logged variable?

            In this case, my variables used in the component are all continuous. So are you saying that the relationship between the dependent and independent variables is interpreted as per any normal OLS regression with continuous variables, except I have to ascribe meaning to the principal component?



            • #7
              Yes; that summarises my message well. If your PCA is based on a correlation matrix then in effect the PCs are all scaled with mean 0 and are also dimensionless. So each coefficient is the change in the PC score per unit of that predictor.
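The point can be checked on simulated data. The following is a minimal Python sketch, not the poster's actual analysis: the indicators, the `youth_pct` variable and all numbers are made up, and PC1 is extracted from the correlation matrix with a simple power iteration instead of a statistics package. It shows that the PC scores have mean 0 and that the fitted slope reads as "change in PC1 score per one-point increase in the predictor", exactly as in any other regression.

```python
import math
import random
import statistics


def standardize(values):
    """Rescale to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def first_pc_scores(columns, iterations=500):
    """Scores on PC1 of the correlation matrix, found by power iteration."""
    z = [standardize(col) for col in columns]
    k, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]
    # The sign of a PC is arbitrary; starting from equal positive loadings
    # makes PC1 point in the same direction as the indicators used here.
    loadings = [1.0] * k
    for _ in range(iterations):
        product = [sum(corr[i][j] * loadings[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(p * p for p in product))
        loadings = [p / norm for p in product]
    return [sum(loadings[j] * z[j][i] for j in range(k)) for i in range(n)]


def simple_ols(x, y):
    """Slope and intercept from a one-predictor least-squares fit."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Simulated municipalities (assumption): three service indicators that all
# fall as the youth percentage rises, plus noise.
random.seed(1)
youth_pct = [random.uniform(10, 40) for _ in range(200)]
indicators = [[60 - 0.5 * y + random.gauss(0, 2) for y in youth_pct]
              for _ in range(3)]

service_pc1 = first_pc_scores(indicators)
slope, intercept = simple_ols(youth_pct, service_pc1)

print("mean of PC1 scores:", statistics.fmean(service_pc1))
print("change in PC1 score per one-point rise in youth_pct:", slope)
```

Because the scores are dimensionless, the size of the slope only means something relative to the spread of the PC scores, so it helps to report their standard deviation alongside the coefficient.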

              P.S. Standard spelling is, as said, principal component. You don't want your presentations to irritate the 1% of possible attendees who might be my clones.



              • #8
                Great. Thank you very much.
