Interpreting coefficient when x and y are scaled

Ali Malik

Join Date: Jul 2018

Posts: 23
#1

08 Aug 2018, 07:32

Hi,

I am a bit confused about how to interpret the coefficient on a variable X when X is divided by a scaling variable Z and the dependent variable Y is also divided by the same Z.

So we have a regression where Y/Z is the dependent variable and X/Z is the only independent variable. I get a statistically significant coefficient estimate of 0.008. Any idea how this coefficient can be interpreted to describe the effect of my independent variable on my dependent variable?

Thanks, Ali

Jorrit Gosens

Join Date: Jan 2015
Posts: 1019

08 Aug 2018, 07:46

Code:

. set obs 100
number of observations (_N) was 0, now 100

. gen y=_n

. gen x=2*y

. reg x y

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =         .
       Model |      333300         1      333300   Prob > F        =         .
    Residual |           0        98           0   R-squared       =    1.0000
-------------+----------------------------------   Adj R-squared   =    1.0000
       Total |      333300        99  3366.66667   Root MSE        =         0

------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           y |          2          .        .       .            .           .
       _cons |  -2.84e-14          .        .       .            .           .
------------------------------------------------------------------------------

. gen yscaled=y/10

. gen xscaled=x/10

. reg xscaled yscaled

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =         .
       Model |        3333         1        3333   Prob > F        =         .
    Residual |           0        98           0   R-squared       =    1.0000
-------------+----------------------------------   Adj R-squared   =    1.0000
       Total |        3333        99  33.6666667   Root MSE        =         0

------------------------------------------------------------------------------
     xscaled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     yscaled |          2          .        .       .            .           .
       _cons |  -1.78e-15          .        .       .            .           .
------------------------------------------------------------------------------


Ali Malik

Join Date: Jul 2018

Posts: 23
#3

08 Aug 2018, 08:15

Hi Jorrit,

I should have added that both the dependent and independent variables are by definition between 0 and 1. How would the coefficient then be interpreted?

Jorrit Gosens

Join Date: Jan 2015
Posts: 1019

08 Aug 2018, 08:21

Not much differently. Nonlinear transformations would change the interpretation of the coefficient, but a linear rescaling like this does not.

Code:

. set obs 100
number of observations (_N) was 0, now 100

. gen y=runiform()

. gen x=runiform()

. reg x y

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =      0.11
       Model |  .009173456         1  .009173456   Prob > F        =    0.7377
    Residual |  7.96929992        98  .081319387   R-squared       =    0.0011
-------------+----------------------------------   Adj R-squared   =   -0.0090
       Total |  7.97847338        99   .08059064   Root MSE        =    .28517

------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           y |   .0311836   .0928446     0.34   0.738    -.1530636    .2154308
       _cons |   .4946992   .0542711     9.12   0.000         .387    .6023983
------------------------------------------------------------------------------

. gen yscaled=y/10

. gen xscaled=x/10

. reg xscaled yscaled

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =      0.11
       Model |  .000091735         1  .000091735   Prob > F        =    0.7377
    Residual |  .079692999        98  .000813194   R-squared       =    0.0011
-------------+----------------------------------   Adj R-squared   =   -0.0090
       Total |  .079784733        99  .000805906   Root MSE        =    .02852

------------------------------------------------------------------------------
     xscaled |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     yscaled |   .0311836   .0928446     0.34   0.738    -.1530636    .2154308
       _cons |   .0494699   .0054271     9.12   0.000        .0387    .0602398
------------------------------------------------------------------------------

.

Additionally, to see the effect of a linear transformation of only one of the variables:

Code:

. reg x yscaled

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =      0.11
       Model |  .009173457         1  .009173457   Prob > F        =    0.7377
    Residual |  7.96929992        98  .081319387   R-squared       =    0.0011
-------------+----------------------------------   Adj R-squared   =   -0.0090
       Total |  7.97847338        99   .08059064   Root MSE        =    .28517

------------------------------------------------------------------------------
           x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     yscaled |    .311836   .9284464     0.34   0.738    -1.530636    2.154308
       _cons |   .4946992   .0542711     9.12   0.000         .387    .6023983
------------------------------------------------------------------------------

.
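The pattern in the three outputs above can be checked by hand. This is only a sketch using the textbook OLS formulas for a single regressor, with x as the dependent variable and y as the regressor (as in the commands above); the symbols a, b and c are my own notation, with c = 10 standing for the scaling constant.

```latex
% OLS of x on y:  x = a + b y + e
\[
  b = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(y)},
  \qquad
  a = \bar{x} - b\,\bar{y}
\]

% Both variables divided by the same constant c: slope unchanged, intercept divided by c
\[
  b' = \frac{\operatorname{Cov}(x/c,\; y/c)}{\operatorname{Var}(y/c)}
     = \frac{\operatorname{Cov}(x, y)/c^{2}}{\operatorname{Var}(y)/c^{2}}
     = b,
  \qquad
  a' = \frac{\bar{x}}{c} - b\,\frac{\bar{y}}{c} = \frac{a}{c}
\]

% Only the regressor divided by c: slope multiplied by c, intercept unchanged
\[
  b'' = \frac{\operatorname{Cov}(x,\; y/c)}{\operatorname{Var}(y/c)}
      = \frac{\operatorname{Cov}(x, y)/c}{\operatorname{Var}(y)/c^{2}}
      = c\,b,
  \qquad
  a'' = \bar{x} - c\,b\,\frac{\bar{y}}{c} = a
\]
```

With c = 10 this matches the output: the slope stays at .0311836 when both variables are divided by 10 while the constant drops from .4946992 to .0494699, and the slope becomes .311836 when only the regressor is divided by 10.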


Ali Malik

Join Date: Jul 2018

Posts: 23
#5

08 Aug 2018, 08:35

Thanks for the useful help!

In the "reg xscaled yscaled" regression, how would you fill in the following:

If yscaled increases by ... then xscaled increases by ... ?
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#6

08 Aug 2018, 08:39

If yscaled increases by 1 then xscaled increases by 0.0311
Ali Malik

Join Date: Jul 2018

Posts: 23
#7

08 Aug 2018, 08:54

That's what I was thinking. But given that both my variables can only take values between 0 and 1, the above interpretation would not be possible, right?

In this situation, is it even possible to interpret the effect of the coefficient?
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#8

08 Aug 2018, 09:09

The statement "If yscaled increases by 1 then xscaled increases by 0.0311" refers to the slope of the regression line, which is the same at every value of yscaled.

Because of the scaling, yscaled and xscaled here take values between 0 and 0.1. The slope remains the same, although you are correct that yscaled, in this sample, never increases by 1.
I would still always report the slope relative to an increase of 1 in the explanatory variable. Most analyses include a descriptive table where readers can see the minima and maxima of the variables. You can also point out in the text of your analysis that the explanatory variable never increases by that much.
You could also say "If yscaled increases by 0.1 then xscaled increases by 0.00311", but it's the exact same slope regardless.
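If it helps, here is a minimal sketch of how the same slope could be reported per 1-unit and per 0.1-unit increase. It uses freshly simulated data in the spirit of the example above, so the numbers will differ from the output shown earlier; the seed is an arbitrary assumption, used only to make the run reproducible.

```stata
* Minimal sketch: one slope, stated for two different increments.
clear
set obs 100
set seed 2018                 // arbitrary seed, an assumption for reproducibility
gen yscaled = runiform()/10   // explanatory variable, between 0 and 0.1
gen xscaled = runiform()/10   // dependent variable, between 0 and 0.1

regress xscaled yscaled

* Observed range of the explanatory variable, for the descriptive table
summarize yscaled

* Change in xscaled for a 1-unit increase in yscaled (the reported coefficient)
display _b[yscaled]

* Change in xscaled for a 0.1-unit increase, with standard error and CI
lincom 0.1*yscaled
```

The lincom line only multiplies the coefficient and its standard error by 0.1, so the t statistic and p-value are the same as in the regression table.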
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#9

15 Aug 2018, 15:05

You might also consider doing the rescaling yourself, so that the reader also sees the results stated in terms of the original x and y. However, there is a separate estimation question about y being limited in range. If either of the boundaries is binding, you may need to use tobit or a similar estimator.
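As a rough illustration of the tobit suggestion, here is a sketch on simulated data. The variable names ratio_y and ratio_x, the data-generating numbers and the seed are all made up for the example and are not from this thread; whether a censoring model is appropriate for real data depends on why the outcome is bounded.

```stata
* Sketch: check whether the 0/1 bounds bind, then fit a two-limit tobit.
clear
set obs 100
set seed 12345                     // arbitrary seed (assumption)
gen ratio_x = runiform()           // hypothetical regressor in [0, 1]
* Hypothetical outcome, censored below at 0 and above at 1
gen ratio_y = min(max(0.2 + 0.9*ratio_x + rnormal(0, 0.3), 0), 1)

* How many observations sit exactly on a boundary?
count if ratio_y == 0 | ratio_y == 1

* Two-limit tobit with lower limit 0 and upper limit 1
tobit ratio_y ratio_x, ll(0) ul(1)
```

If the count is zero, the bounds are not binding in the sample and plain regress may be adequate.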
