  • Standardizing categorical variables

    Dear Statalist,

    I am running a linear probability model with categorical independent variables; some have two categories and some have three. I want to compare the relative strength of these variables' effects on the dependent variable. Is it sensible in this context to add the beta option to the regress command and thereby standardize the coefficients of the categorical independent variables? I am unsure whether standardizing categorical variables makes sense. Could you recommend any resources on this topic?

    Kind regards,
    Elif.

  • #2
    Elif:
    it may well be that I'm missing something in your post, but I do not see the need to standardize categorical variables (also in light of https://stats.stackexchange.com/ques...ndardization):
    Code:
    . use "https://www.stata-press.com/data/r16/auto.dta"
    (1978 Automobile Data)
    
    . sum mpg
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             mpg |         74     21.2973    5.785503         12         41
    
    . g cat_mpg=0 if mpg<=r(mean)
    (31 missing values generated)
    
    . replace cat_mpg=1 if cat_mpg==.
    (31 real changes made)
    
    . regress foreign i.rep78 i.cat_mpg
    
          Source |       SS           df       MS      Number of obs   =        69
    -------------+----------------------------------   F(5, 63)        =     10.50
           Model |  6.63879662         5  1.32775932   Prob > F        =    0.0000
        Residual |  7.96989903        63  .126506334   R-squared       =    0.4544
    -------------+----------------------------------   Adj R-squared   =    0.4111
           Total |  14.6086957        68   .21483376   Root MSE        =    .35568
    
    ------------------------------------------------------------------------------
         foreign |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           rep78 |
              2  |   .0303152   .2814261     0.11   0.915    -.5320699    .5927004
              3  |   .1646725   .2609228     0.63   0.530    -.3567401    .6860851
              4  |   .4865266   .2651562     1.83   0.071    -.0433458    1.016399
              5  |   .7851107   .2737032     2.87   0.006     .2381585    1.332063
                 |
       1.cat_mpg |   .2425219   .0926684     2.62   0.011     .0573389    .4277049
           _cons |  -.1212609   .2557343    -0.47   0.637    -.6323051    .3897832
    ------------------------------------------------------------------------------
    
    .
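    To make the point concrete, here is a minimal sketch (not part of the session above) of what the beta option would report for the dummy in this toy example. It assumes the same auto.dta and rebuilds cat_mpg in one line, which is equivalent here because mpg has no missing values; sd_x and sd_y are illustrative names. A standardized coefficient is b*sd(x)/sd(y), and for a 0/1 indicator sd(x) is driven by the share of observations in each category:

    ```stata
    use "https://www.stata-press.com/data/r16/auto.dta", clear
    summarize mpg
    * 1 = above-average mpg, 0 otherwise (mpg has no missing values)
    generate byte cat_mpg = mpg > r(mean)

    * beta option: report standardized coefficients instead of confidence intervals
    regress foreign i.rep78 i.cat_mpg, beta

    * Reproduce the beta for 1.cat_mpg by hand as b * sd(x) / sd(y),
    * using only the observations in the estimation sample
    quietly summarize cat_mpg if e(sample)
    scalar sd_x = r(sd)
    quietly summarize foreign if e(sample)
    scalar sd_y = r(sd)
    display "beta for 1.cat_mpg = " _b[1.cat_mpg] * scalar(sd_x) / scalar(sd_y)
    ```

    Because the standard deviation of an indicator changes with the category shares, two dummies with the same raw coefficient can end up with different betas, which is one reason the standardized version is hard to interpret for categorical predictors.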
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Especially if the variables are nominal, standardizing them does not make any sense. If your goal is to assess the relative importance of a variable, I would rather look at changes in R². For more information, see https://journals.sagepub.com/doi/abs...6867X211025837
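      A rough sketch of that idea, using the toy data from #2. This is only an illustration under assumed choices, not the procedure from the linked article, and r2_full and insample are made-up names. Fit the full model, then refit without each variable on the same estimation sample and see how much R² drops:

      ```stata
      use "https://www.stata-press.com/data/r16/auto.dta", clear
      summarize mpg
      * same split as in #2: 1 = above-average mpg (mpg has no missing values)
      generate byte cat_mpg = mpg > r(mean)

      * full model; keep its R-squared and flag the estimation sample
      regress foreign i.rep78 i.cat_mpg
      scalar r2_full = e(r2)
      generate byte insample = e(sample)

      * refit without each variable on the same observations
      quietly regress foreign i.cat_mpg if insample
      display "R2 lost by dropping rep78:   " scalar(r2_full) - e(r2)
      quietly regress foreign i.rep78 if insample
      display "R2 lost by dropping cat_mpg: " scalar(r2_full) - e(r2)
      ```

      Restricting the reduced models to the full model's sample matters here because rep78 has missing values; otherwise the R² values would come from different sets of observations.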
      Best wishes

      Stata 18.0 MP | ORCID | Google Scholar


      • #4
        Felix Bittmann, many thanks. Some of the variables with 3 categories are nominal. My goal is to find the variable (or category of a variable) with the highest impact on the dependent variable.


        • #5
          Carlo Lazzaro, many thanks. In your example, if you were interested in finding the variable with the greatest impact on foreign, would you request the standardized beta coefficients for this regression?


          • #6
            Elif:
            again, I do not see the reason why you want to standardize (see toy-example in #2).
            As far as your #5 is concerned, I would -test- the coefficients.
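            A minimal sketch of what that could look like after the regression in #2, assuming the same data and the same cat_mpg split used there (which hypotheses are worth testing depends on your research question):

            ```stata
            use "https://www.stata-press.com/data/r16/auto.dta", clear
            summarize mpg
            * same split as in #2: 1 = above-average mpg (mpg has no missing values)
            generate byte cat_mpg = mpg > r(mean)
            quietly regress foreign i.rep78 i.cat_mpg

            * joint test that all rep78 coefficients are zero
            testparm i.rep78

            * test that the single cat_mpg coefficient is zero
            test 1.cat_mpg

            * test whether two particular coefficients are equal
            test 5.rep78 = 1.cat_mpg
            ```

            With multi-category variables, the joint test from -testparm- is usually the relevant one for the variable as a whole, whereas -test- on a single level only compares that level with the base category.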
            Last edited by Carlo Lazzaro; 30 Nov 2021, 02:34.
            Kind regards,
            Carlo
            (Stata 19.0)
