  • Regression Centering Help

    Hello again;

    I am looking at the effects of regulated childcare on the school readiness of low-income children in Canada. My measure of school readiness is based on the PPVT-R, the Number Test and "Who Am I", which are tests given to four- and five-year-olds.

    Therefore, these tests are going to be my dependent variables for each regression, and my independent variable (among some control variables) shall be participation in a regulated childcare facility (daycare/family care center).

    The problem is that these test scores have ranges (e.g., the PPVT-R ranges from 50 to 160). I want to center the tests on their means.

    What is the code to do this? That is, how do I center a variable for OLS regression?

    Do I even need to center? Or can I just run the regression with the variable as is?


    To help, here is an example: the variable name for the PPVT-R standard scores is CPPCS01. (P.S. I am assuming I cannot use logit or probit because the dependent variable does not range from 0 to 1. Is this correct?)


    Thank you for your help!!!!
    Last edited by Paige LaPierre; 27 Sep 2021, 10:56.

  • #2
    The essence of centering (centring) on the mean is just

    Code:
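    * compute the mean only; it is left behind in r(mean)
    summarize foobar, meanonly
    * subtract the mean to create the centered variable
    generate foobar_c = foobar - r(mean)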
    with optionally some attention to variable labels. However,

    Code:
    ssc desc center
    tells you about a convenience command you may find ... convenient.



    • #3
      Although it is not completely clear, I think you are asking whether to center your outcome (response, dependent) variable. In general, there is no reason to do so.

      You are right that you cannot use logit/probit as is: the outcome must be binary for those models, and Stata's rule is that 0 is compared with all nonzero, nonmissing responses.



      • #4
        As Rich said, centering your y variable will have no substantive effect. It will only change the intercept in the regression. None of the coefficients you care about will be affected.
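
        A minimal sketch of that point, assuming Stata's bundled auto dataset as a stand-in (price plays the role of the test score; mpg and weight are placeholder regressors, not variables from the original question). The two coefficient vectors should agree on the slopes and differ only in _cons:

        Code:
        sysuse auto, clear
        * regression with the outcome as is
        regress price mpg weight
        matrix b_raw = e(b)
        * center the outcome on its sample mean and refit
        summarize price, meanonly
        generate price_c = price - r(mean)
        regress price_c mpg weight
        matrix b_ctr = e(b)
        * compare: slopes match, only the intercept shifts
        matrix list b_raw
        matrix list b_ctr
        The intercept shifts by exactly the sample mean of the outcome, which is why the substantive results are unchanged.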



        • #5
          This is great! Thank you so much



          • #6
            Is your point that the scales of the tests differ? So on one test the maximum score is 80 while on another it is 160? If so, then depending on how you specify the DV, you might get results that you will have to adjust to make comparable. You might say an equivalent effect is 8 points on the first test and 16 points on the second (a 10% effect; a log DV would fix that, if 0 is not a legitimate outcome).

            Rather than centering on the mean, are you asking about adjusting for the scale of the test? If so, just divide the test result by the maximum score and you will have a standard outcome across all the tests (this is a type of standardization rather than centering). Unlike centering, this approach will affect the coefficients of the model. You might consider percentiles as outcomes as well.
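
            A rough sketch of both rescalings, again assuming the bundled auto dataset as a stand-in: mpg plays the role of a test score, and the maximum possible score of 50 is a made-up value for illustration. The percentile formula shown is only one of several common conventions.

            Code:
            sysuse auto, clear
            * hypothetical maximum possible score for the stand-in variable
            local maxscore 50
            * share of the maximum score (between 0 and 1)
            generate score_prop = mpg / `maxscore'
            * percentile rank within the sample; tied values share their average rank
            egen score_rank = rank(mpg)
            count if !missing(mpg)
            generate score_pct = 100 * (score_rank - 0.5) / r(N)
            summarize score_prop score_pct
            With the real data, the stand-in variable would be replaced by each test score and `maxscore' by that test's documented maximum.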

