  • Interpreting coefficient when x and y are scaled

    Hi,

    I am a bit confused about how to interpret the coefficient on a variable X when X is divided by a scaling variable Z and the dependent variable Y is divided by the same scaling variable Z.

    So we have a regression where Y/Z is the dependent variable and X/Z is the only independent variable. I get a significant coefficient estimate of 0.008. How can this coefficient be interpreted to describe the effect of my independent variable on my dependent variable?

    Thanks, Ali

  • #2
    Code:
    . set obs 100
    number of observations (_N) was 0, now 100
    
    . gen y=_n
    
    . gen x=2*y
    
    . reg x y
    
          Source |       SS           df       MS      Number of obs   =       100
    -------------+----------------------------------   F(1, 98)        =         .
           Model |      333300         1      333300   Prob > F        =         .
        Residual |           0        98           0   R-squared       =    1.0000
    -------------+----------------------------------   Adj R-squared   =    1.0000
           Total |      333300        99  3366.66667   Root MSE        =         0
    
    ------------------------------------------------------------------------------
               x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               y |          2          .        .       .            .           .
           _cons |  -2.84e-14          .        .       .            .           .
    ------------------------------------------------------------------------------
    
    . gen yscaled=y/10
    
    . gen xscaled=x/10
    
    . reg xscaled yscaled
    
          Source |       SS           df       MS      Number of obs   =       100
    -------------+----------------------------------   F(1, 98)        =         .
           Model |        3333         1        3333   Prob > F        =         .
        Residual |           0        98           0   R-squared       =    1.0000
    -------------+----------------------------------   Adj R-squared   =    1.0000
           Total |        3333        99  33.6666667   Root MSE        =         0
    
    ------------------------------------------------------------------------------
         xscaled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         yscaled |          2          .        .       .            .           .
           _cons |  -1.78e-15          .        .       .            .           .
    ------------------------------------------------------------------------------


    • #3
      Hi Jorrit,

      I should have added that both the dependent and independent variables are by definition between 0 and 1. How would the coefficient then be interpreted?


      • #4
        Not much changes. Nonlinear transformations would affect the interpretation of the coefficient, but a linear rescaling like this one does not.

        Code:
        . set obs 100
        number of observations (_N) was 0, now 100
        
        . gen y=runiform()
        
        . gen x=runiform()
        
        . reg x y
        
              Source |       SS           df       MS      Number of obs   =       100
        -------------+----------------------------------   F(1, 98)        =      0.11
               Model |  .009173456         1  .009173456   Prob > F        =    0.7377
            Residual |  7.96929992        98  .081319387   R-squared       =    0.0011
        -------------+----------------------------------   Adj R-squared   =   -0.0090
               Total |  7.97847338        99   .08059064   Root MSE        =    .28517
        
        ------------------------------------------------------------------------------
                   x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                   y |   .0311836   .0928446     0.34   0.738    -.1530636    .2154308
               _cons |   .4946992   .0542711     9.12   0.000         .387    .6023983
        ------------------------------------------------------------------------------
        
        . gen yscaled=y/10
        
        . gen xscaled=x/10
        
        . reg xscaled yscaled
        
              Source |       SS           df       MS      Number of obs   =       100
        -------------+----------------------------------   F(1, 98)        =      0.11
               Model |  .000091735         1  .000091735   Prob > F        =    0.7377
            Residual |  .079692999        98  .000813194   R-squared       =    0.0011
        -------------+----------------------------------   Adj R-squared   =   -0.0090
               Total |  .079784733        99  .000805906   Root MSE        =    .02852
        
        ------------------------------------------------------------------------------
             xscaled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             yscaled |   .0311836   .0928446     0.34   0.738    -.1530636    .2154308
               _cons |   .0494699   .0054271     9.12   0.000        .0387    .0602398
        ------------------------------------------------------------------------------
        
        .
        Additionally, to see the effect of a linear transformation of only one of the variables:
        Code:
        . reg x yscaled
        
              Source |       SS           df       MS      Number of obs   =       100
        -------------+----------------------------------   F(1, 98)        =      0.11
               Model |  .009173457         1  .009173457   Prob > F        =    0.7377
            Residual |  7.96929992        98  .081319387   R-squared       =    0.0011
        -------------+----------------------------------   Adj R-squared   =   -0.0090
               Total |  7.97847338        99   .08059064   Root MSE        =    .28517
        
        ------------------------------------------------------------------------------
                   x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             yscaled |    .311836   .9284464     0.34   0.738    -1.530636    2.154308
               _cons |   .4946992   .0542711     9.12   0.000         .387    .6023983
        ------------------------------------------------------------------------------
        
        .
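
        The pattern in these outputs follows from the OLS formulas. As a sketch, take the regression of x on y used above and rescale with nonzero constants a and b (symbols introduced here for illustration, so x' = a x and y' = b y). This assumes the scale factors are constants, as in the examples in this thread:

        ```latex
        \[
        \begin{aligned}
        \hat\beta &= \frac{\operatorname{Cov}(x,y)}{\operatorname{Var}(y)}, \qquad
        \hat\alpha = \bar{x} - \hat\beta\,\bar{y} \\
        \hat\beta' &= \frac{\operatorname{Cov}(a x,\, b y)}{\operatorname{Var}(b y)}
                    = \frac{ab\,\operatorname{Cov}(x,y)}{b^{2}\operatorname{Var}(y)}
                    = \frac{a}{b}\,\hat\beta \\
        \hat\alpha' &= a\,\bar{x} - \hat\beta'\, b\,\bar{y} = a\,\hat\alpha
        \end{aligned}
        \]
        ```

        With a = b = 1/10 the slope is unchanged and the intercept is divided by 10 (.4946992 becomes .0494699). With a = 1 and b = 1/10 the slope is multiplied by 10 (.0311836 becomes .311836) and the intercept is unchanged.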


        • #5
          Thanks for the useful help!

          In the "reg xscaled yscaled" regression, how would you fill in the following:

          If yscaled increases by ... then xscaled increases by ... ?


          • #6
            If yscaled increases by 1 then xscaled increases by 0.0311


            • #7
              That's what I was thinking. But keeping in mind that both my variables can only take values between 0 and 1, the above interpretation would not be possible, right?

              In this situation, is it even possible to interpret the effect of the coefficient?


              • #8
                The statement "If yscaled increases by 1 then xscaled increases by 0.0311" refers to the slope of the regression line, which has the same slope at any value.

                Because of the scaling, yscaled and xscaled take values between 0 and 0.1 here. The slope remains the same, although you are correct that yscaled never increases by 1 in this sample.
                I would still report the slope relative to an increase of 1 in the explanatory variable. Most analyses include a descriptive table where readers can see the minima and maxima of the variables. You can also note in the text of your analysis that the explanatory variable never increases by that much.
                You could also say "If yscaled increases by 0.1 then xscaled increases by 0.00311", but it's the exact same slope regardless.
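
                One way to report the effect for a realistic increment is sketched below. It regenerates data like that in #4, but the seed is arbitrary, so the numbers will not match the output shown earlier in the thread. `summarize` shows the range the regressor covers, and `lincom` rescales the coefficient to a 0.1 increase together with its confidence interval.

                ```stata
                clear
                set seed 12345
                set obs 100
                gen y = runiform()
                gen x = runiform()
                gen yscaled = y/10
                gen xscaled = x/10

                * Descriptive check: what range does the regressor actually cover?
                summarize yscaled

                regress xscaled yscaled

                * Slope per 1-unit increase in yscaled (the reported coefficient)
                display _b[yscaled]

                * Implied change in xscaled for a 0.1 increase in yscaled, with a CI
                lincom 0.1*yscaled
                ```

                The `lincom` estimate is simply 0.1 times the coefficient, so the t statistic and p-value are the same as for the slope itself.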


                • #9
                  You might also consider converting the results back for the reader, so that you also state them in terms of the original x and y. Separately, there is an estimation question about y being limited in range: if either of the boundaries is binding, you may need to use tobit or a similar model.
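
                  A minimal sketch of that check, on simulated data rather than the original poster's. The names `xz`, `yz` and `ystar` are hypothetical stand-ins for X/Z, Y/Z and an unobserved latent outcome, and the data-generating values are made up for illustration. It counts how many observations sit exactly on a boundary, then compares OLS with a two-limit tobit.

                  ```stata
                  clear
                  set seed 2024
                  set obs 500

                  * Hypothetical data: latent outcome censored to the [0, 1] interval
                  gen xz = runiform()
                  gen ystar = -0.2 + 1.2*xz + rnormal(0, 0.3)
                  gen yz = min(max(ystar, 0), 1)

                  * How many observations are at a boundary?
                  count if yz == 0 | yz == 1

                  * OLS ignores the censoring
                  regress yz xz

                  * Two-limit tobit with lower limit 0 and upper limit 1
                  tobit yz xz, ll(0) ul(1)
                  ```

                  If the count is zero or very small, the boundaries are not binding in the sample and the OLS and tobit estimates should be close.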
