Interpreting one unit change from xtreg

Adam Mitchell

Join Date: Oct 2021

Posts: 56
#1

16 Jan 2023, 14:26

I recently ran a longitudinal mixed model analysis using the xtreg command. My initial interpretation is that a one-unit change in x is associated with a certain change in y. However, as my x variable ranges between 0 and 1, how does one interpret a one-unit change?

Thanks
Andrew Musau

Join Date: Oct 2014

Posts: 10190
#2

16 Jan 2023, 23:17

If the variable is a proportion, you can divide the coefficient by 100 (or multiply the variable by 100 and rerun the regression) to express the change in percentage points.
Adam Mitchell

Join Date: Oct 2021

Posts: 56
#3

17 Jan 2023, 01:00

It isn't a proportion; it is an index composed of different variables that represent deficits in health status, with 0 representing the best state of health and 1 the worst.
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1385
#4

17 Jan 2023, 01:09

It seems like you have your interpretation, Adam.

In general, indices don't have straightforward and neat interpretations since they are a composite of several variables.
Adam Mitchell

Join Date: Oct 2021

Posts: 56
#5

17 Jan 2023, 01:12

So if I report the effect of a one-unit change in this composite variable, which ranges from 0 to 1, what does one unit actually represent here? Moving from 0 to 1, or moving to some point in between, for example 0.1?
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1385
#6

17 Jan 2023, 01:20

A unit change is a change of magnitude 1 (in the units of the variable), so yes, it is a move from 0 to 1.
Adam Mitchell

Join Date: Oct 2021

Posts: 56
#7

17 Jan 2023, 01:22

Ok that's my concern because in this data it is impossible as nobody has a value of 1 in this variable.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17706
#8

17 Jan 2023, 01:22

Adam:
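as per Andrew's helpful suggestion, you can rescale the index at your convenience (all in all, it's only a matter of scale). In the example below, z is x divided by 10 and t is x divided by 100; the coefficient and its standard error scale up by the same factor, while the t statistic, p-value, and fit are unchanged: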

Code:

. set obs 100
Number of observations (_N) was 0, now 100.

. g y=runiform()

. range x 0 1 100

. range z 0 .1 100

. regress y x

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =      0.67
       Model |  .063926465         1  .063926465   Prob > F        =    0.4155
    Residual |  9.36973749        98  .095609566   R-squared       =    0.0068
-------------+----------------------------------   Adj R-squared   =   -0.0034
       Total |  9.43366396        99  .095289535   Root MSE        =    .30921

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |  -.0867137    .106047    -0.82   0.416    -.2971605     .123733
       _cons |   .5406958   .0613807     8.81   0.000     .4188879    .6625037
------------------------------------------------------------------------------

. regress y z

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =      0.67
       Model |  .063926461         1  .063926461   Prob > F        =    0.4155
    Residual |   9.3697375        98  .095609566   R-squared       =    0.0068
-------------+----------------------------------   Adj R-squared   =   -0.0034
       Total |  9.43366396        99  .095289535   Root MSE        =    .30921

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           z |  -.8671373    1.06047    -0.82   0.416    -2.971605     1.23733
       _cons |   .5406958   .0613807     8.81   0.000     .4188879    .6625037
------------------------------------------------------------------------------

. g t= x/100

. regress y t

      Source |       SS           df       MS      Number of obs   =       100
-------------+----------------------------------   F(1, 98)        =      0.67
       Model |  .063926453         1  .063926453   Prob > F        =    0.4155
    Residual |   9.3697375        98  .095609566   R-squared       =    0.0068
-------------+----------------------------------   Adj R-squared   =   -0.0034
       Total |  9.43366396        99  .095289535   Root MSE        =    .30921

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           t |  -8.671372    10.6047    -0.82   0.416    -29.71605     12.3733
       _cons |   .5406958   .0613807     8.81   0.000     .4188879    .6625037
------------------------------------------------------------------------------

.

Kind regards,
Carlo
(Stata 19.0)


Andrew Musau

Join Date: Oct 2014

Posts: 10190
#9

17 Jan 2023, 01:23

ADDED IN EDIT: If it is an index and values are continuous within (0,1), then my comment in #2 holds, and Carlo Lazzaro nicely illustrates this in #8. If it is an indicator, see below.

Interpretation of coefficients on indicators is one of the first things you learn in an introductory statistics course. The coefficient represents the expected difference in the outcome between the positive category and the zero category, holding fixed other regressors in the model. Below, the margins output gives you the expected values of the outcome corresponding to each category, and you may verify that the coefficient on foreign is simply the difference.

Code:

sysuse auto, clear
regress price mpg weight i.foreign
margins foreign

Results:

Code:

. regress price mpg weight i.foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(3, 70)        =     23.29
       Model |   317252881         3   105750960   Prob > F        =    0.0000
    Residual |   317812515        70  4540178.78   R-squared       =    0.4996
-------------+----------------------------------   Adj R-squared   =    0.4781
       Total |   635065396        73  8699525.97   Root MSE        =    2130.8

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |    21.8536   74.22114     0.29   0.769    -126.1758     169.883
      weight |   3.464706    .630749     5.49   0.000     2.206717    4.722695
             |
     foreign |
    Foreign  |    3673.06   683.9783     5.37   0.000     2308.909    5037.212
       _cons |  -5853.696   3376.987    -1.73   0.087    -12588.88    881.4934
------------------------------------------------------------------------------

. margins foreign

Predictive margins                              Number of obs     =         74
Model VCE    : OLS

Expression   : Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     foreign |
   Domestic  |   5073.266    320.473    15.83   0.000     4434.103    5712.429
    Foreign  |   8746.326   540.7053    16.18   0.000     7667.924    9824.729
------------------------------------------------------------------------------
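
A quick way to check this, sketched with the same auto dataset: margins with the r. contrast operator reports the Foreign-minus-Domestic difference in predictive margins directly, and it should agree with the coefficient on foreign (3673.06 in the output above, which is 8746.326 - 5073.266).

Code:

sysuse auto, clear
regress price mpg weight i.foreign

* difference in predictive margins, Foreign vs. Domestic
margins r.foreign

* the coefficient on the Foreign category, for comparison
display _b[1.foreign]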

Last edited by Andrew Musau; 17 Jan 2023, 01:29.


Hemanshu Kumar

Join Date: Mar 2015

Posts: 1385
#10

17 Jan 2023, 01:34

Originally posted by Adam Mitchell

Ok that's my concern because in this data it is impossible as nobody has a value of 1 in this variable.

Once the model has estimated the coefficients, it is simply drawing a line that passes through the data. Mechanically, the line can give values of the dependent variable even for values of the independent variable(s) that do not exist in the data or are even completely "impossible". For example, your independent variable x may be binary with the only possible values being 0 and 1, but you can plug in the coefficients and get a predicted value for x = 0.5 or even x = 2 or indeed x = 2,000. Whether those values and predicted values make sense is for you to decide based on context.
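
To make the point concrete, here is a minimal sketch using the auto dataset (my choice of example, not Adam's data). The binary variable foreign is deliberately entered as a continuous regressor, so margins will happily produce linear predictions at values that exist (0 and 1) and at values that do not (0.5 and 2).

Code:

sysuse auto, clear

* foreign only takes the values 0 and 1, but is treated as continuous here
regress price foreign

* linear predictions at observed and at "impossible" values
margins, at(foreign=(0 0.5 1 2))

* the prediction at foreign = 2, computed by hand from the coefficients
display _b[_cons] + 2*_b[foreign]

The software raises no objection to foreign = 2; judging whether such a prediction means anything is left to the analyst.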

Also remember that the regression may produce unreliable predicted values for x values that are "far" from the actual data. So if your data for the index x is almost always between 0 and 0.2, for instance, then predicted values for x = 1 should be taken with a healthy pinch of salt. Correspondingly, the coefficient interpretation would then be more sensibly made (similar to the suggestions above) for a 0.1 unit change (by dividing the coefficient by 10) than for a unit change.
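
As a rough sketch of the 0.1-unit interpretation, using simulated data rather than Adam's (the variables y and x, the seed, and the true coefficients are all assumptions made for illustration): summarize shows the observed range of x, and lincom reports the expected change in y for a 0.1-unit change in x, that is, the coefficient divided by 10, together with its confidence interval.

Code:

clear
set seed 12345
set obs 200

* hypothetical index concentrated between 0 and 0.2
generate x = 0.2*runiform()
generate y = 1 + 2*x + rnormal(0, 0.5)

* check the observed range before interpreting a full unit change
summarize x

regress y x

* expected change in y for a 0.1-unit change in x
lincom 0.1*x

* the same point estimate computed by hand
display _b[x]/10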

Last edited by Hemanshu Kumar; 17 Jan 2023, 01:38.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17706
#11

17 Jan 2023, 01:47

Adam:
if your index is something similar to the EQ visual analogue scale (EQ VAS) score (https://euroqol.org/support/terminology), in my experience it has a descriptive meaning rather than an inferential use as a predictor.

Kind regards,
Carlo
(Stata 19.0)
Adam Mitchell

Join Date: Oct 2021

Posts: 56
#12

17 Jan 2023, 02:01

It is actually an index of frailty, and we have individuals with scores ranging from 0.01 to 0.74. We have this measured at 3 stages and are looking at the change in this index over time in relation to the expression of certain biomarkers.
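
For a setup like this, one possible way to apply the rescaling suggestions from earlier in the thread is sketched below. Everything in it is an assumption for illustration: the data are simulated, the names id, wave, frailty, and biomarker are hypothetical, the frailty index is taken as the predictor, and a random-effects xtreg is used only because the original model specification was not posted.

Code:

clear
set seed 2023
set obs 100
generate id = _n
* hypothetical person-level random effect
generate u = rnormal()
expand 3
bysort id: generate wave = _n

* hypothetical frailty index kept within (0, 0.75)
generate frailty = 0.75*runiform()
* hypothetical biomarker outcome
generate biomarker = 5 + 3*frailty + u + rnormal()

xtset id wave

* original scale: coefficient is the change per 1.0 of the index
xtreg biomarker frailty, re
scalar b_unit = _b[frailty]

* rescaled: one unit now equals a 0.1-point change in the index
generate frailty10 = frailty*10
xtreg biomarker frailty10, re

* the rescaled coefficient should equal the original divided by 10
display scalar(b_unit)/10
display _b[frailty10]

With observed scores running from 0.01 to 0.74, a 0.1-point change stays well inside the data, whereas a full unit change does not.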
