Rescaling of regressors in Stata

Mohamud Hussein

Join Date: Jan 2015
Posts: 99

25 Nov 2015, 05:47

Hi Members,

The data summarized below include variables measured on very different scales (compare the means and the within variation). I would like to rescale all the listed variables to the scale of the first one (my dependent variable), so that its response to changes in each regressor is comparable on a common scale.

Question: Is there a Stata command (or set of commands) for this sort of operation? I found no reference in the manual!

Thanks,
Mohamud

Code:

 xtsum pia_costs Totincost Piaunitcost subsidy process_speed agr_score enforcement if pia_costs>0

Variable         |      Mean   Std. Dev.       Min        Max |    Observations
-----------------+--------------------------------------------+----------------
pia_co~s overall |  50801.61   61968.65      210.4   354040.9 |     N =     491
         between |             53958.79     1437.5   303377.3 |     n =      98
         within  |             21950.62  -81134.61   194839.8 | T-bar =  5.0102
                 |                                            |
Totinc~t overall |  143319.7   156315.5    1373.07   950410.6 |     N =     491
         between |             140093.7    4739.29   747171.7 |     n =      98
         within  |             41196.83  -107291.1   438458.7 | T-bar =  5.0102
                 |                                            |
Piauni~t overall |  .1363377   .5509344   .0000548   8.044555 |     N =     490
         between |             .6128256   .0007528    4.75892 |     n =      98
         within  |             .3175921  -2.841012   4.802553 | T-bar =       5
                 |                                            |
subsidy  overall | -33124.19   39201.34  -273639.7          0 |     N =     491
         between |             36999.17  -223194.5          0 |     n =      98
         within  |             21912.18  -200846.6   52923.73 | T-bar =  5.0102
                 |                                            |
proces~d overall |  1896.392   1977.406   8.549847   9890.264 |     N =     484
         between |             1717.602   25.36323   6799.553 |     n =      95
         within  |             777.8839  -1739.431   5866.714 | T-bar = 5.09474
                 |                                            |
agr_sc~e overall |  67.89054   15.01818      31.25       92.2 |     N =     501
         between |             10.55253      31.25      88.15 |     n =     113
         within  |             13.02282   33.34887   92.59887 | T-bar = 4.43363
                 |                                            |
enforc~t overall |  10.11554   17.52622          0        164 |     N =     502
         between |             10.23351          0       67.5 |     n =     113
         within  |             13.58242  -55.38446   106.6155 | T-bar = 4.44248

Ariel Karlinsky

Join Date: Jun 2015

Posts: 491
#2

25 Nov 2015, 08:01

As far as I know, there is no such command, nor would one make sense, because Stata cannot know what a variable represents. Is it dollars? Euros? Shekels? Thousands of dollars? Years? Days? Degrees? Length in km? Miles? Area? Volume? And so on.
This is usually what variable labels or notes are for.
Roman Mostazir

Join Date: Apr 2014

Posts: 874
#3

25 Nov 2015, 12:37

It is not difficult to derive standardized variable and run the model. For example,

Code:

summarize totincome
generate totinicomestd = (totincome - r(mean)) / r(sd)

You now have 'totinicomestd' as a standardized variable with mean zero and standard deviation 1. You can do the same for all variables. However, this approach is not suitable in your case because you have repeated measures. The calculation above uses the overall mean and SD, whereas your data have a different mean and SD at each time point. It therefore assumes a common mean and variance structure across all time points, which is not true. On the other hand, if you standardize within each time point, interpretation gets confusing. Why not just fit the multilevel model and interpret the coefficients as they are? Afterwards you can use the postestimation 'margins' command for customized predictions: you can predict at standardized values of each predictor (i.e. at mean-1sd, at mean, at mean+1sd) and compare the effects.
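To make the three options above concrete, here is a minimal do-file sketch. It is illustrative only and rests on assumptions not stated in the thread: the panel identifiers `firm_id` and `year` are hypothetical names, the choice of regressors is arbitrary, and a random-intercept `mixed` model stands in for whatever multilevel specification is actually appropriate.

```stata
* Sketch only. Assumptions (not from the thread): firm_id and year are
* hypothetical panel identifiers; the model below is purely illustrative.

* 1) Overall z-scores for several regressors (pools all time points)
foreach v of varlist Totincost subsidy process_speed agr_score enforcement {
    quietly summarize `v' if pia_costs > 0
    generate double `v'_std = (`v' - r(mean)) / r(sd) if pia_costs > 0
}

* 2) z-score within each time point, for comparison with (1)
bysort year: egen double Totincost_m = mean(Totincost) if pia_costs > 0
bysort year: egen double Totincost_s = sd(Totincost) if pia_costs > 0
generate double Totincost_stdyr = (Totincost - Totincost_m) / Totincost_s

* 3) Keep the original units, fit a random-intercept model, then predict
*    at mean - 1 SD, mean, and mean + 1 SD of one regressor
mixed pia_costs Totincost subsidy process_speed agr_score enforcement ///
    if pia_costs > 0 || firm_id:
quietly summarize Totincost if e(sample)
local lo  = r(mean) - r(sd)
local mid = r(mean)
local hi  = r(mean) + r(sd)
margins, at(Totincost = (`lo' `mid' `hi'))
```

Repeating step 3 for each regressor in turn gives predictions that can be compared across predictors without rescaling the data. Note that the `///` line continuation works in a do-file but not when typed interactively.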

Roman
Mohamud Hussein

Join Date: Jan 2015

Posts: 99
#4

26 Nov 2015, 05:47

Thanks, Roman. That is very helpful.