  • Question about calculating beta post analysis

    The formula to calculate the standardized coefficient (beta) is: b*(s_x/s_y)

    After running the following code

    Code:
    sysuse nlsw88, clear
    sum union wage
    reg union wage hours, beta
    We see that for x = wage:

    b = .0017124
    sd_x = 5.755523
    sd_y = 0.4304825
    Code:
    . di (.0150858)*(5.755523/.4304825)
    .20169616
    However, this is clearly wrong: -reg, beta- reports a standardized coefficient of .1460648. What am I missing? Could the issue be a difference in how the SD is calculated for a binary variable? For instance, these don't match:

    Code:
    . di sqrt(.2454739*(1-.2454739)/1878)
    .00993098
    
    . sum union
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
           union |      1,878    .2454739    .4304825          0          1
    Last edited by Hutchins Yeo; 13 May 2020, 10:17.
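
    A note on the two numbers above: sqrt(p*(1-p)/n) is the standard error of a sample proportion, not the standard deviation of the variable, so it is not expected to match the Std. Dev. column. As a minimal sketch, using only the mean and count printed by -sum union-, the SD of a 0/1 variable can be recovered as follows (the n/(n-1) factor is the usual sample-variance correction):

    ```stata
    * SD of a binary variable from its mean p and count n
    * (p = .2454739 and n = 1878 are copied from the -sum union- output above)
    display sqrt(.2454739*(1-.2454739)*1878/1877)

    * standard error of the mean: SD divided by sqrt(n)
    display .4304825/sqrt(1878)
    ```

    The first line should come out at about .4305, in line with the Std. Dev. column; the second is about .0099, which is the scale of the number the sqrt(p*(1-p)/n) formula produces.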

  • #2
    Let's look at the actual output, rather than your report of the output.
    Code:
    . sysuse nlsw88, clear
    (NLSW, 1988 extract)
    
    . sum age wage
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             age |      2,246    39.15316    3.060002         34         46
            wage |      2,246    7.766949    5.755523   1.004952   40.74659
    
    . reg age wage hours, beta
    
          Source |       SS           df       MS      Number of obs   =     2,242
    -------------+----------------------------------   F(2, 2239)      =      2.00
           Model |  37.5163127         2  18.7581563   Prob > F        =    0.1350
        Residual |  20954.3142     2,239  9.35878258   R-squared       =    0.0018
    -------------+----------------------------------   Adj R-squared   =    0.0009
           Total |  20991.8305     2,241  9.36717113   Root MSE        =    3.0592
    
    ------------------------------------------------------------------------------
             age |      Coef.   Std. Err.      t    P>|t|                     Beta
    -------------+----------------------------------------------------------------
            wage |   -.017124    .011369    -1.51   0.132                -.0322135
           hours |  -.0066185   .0062286    -1.06   0.288                -.0227258
           _cons |     39.532   .2433198   162.47   0.000                        .
    ------------------------------------------------------------------------------
    So we see that your "x" and "y" correspond to wage (sd = 5.755523) and age (sd = 3.060002), reversed from what you transcribed in post #1.

    I think perhaps you intended for wage, not age, to be the independent variable, and didn't notice that you'd reversed them in your regression command.
    Last edited by William Lisowski; 13 May 2020, 10:26.


    • #3
      Originally posted by William Lisowski
      I think perhaps you intended for wage, not age, to be the independent variable, and didn't notice that you'd reversed them in your regression command.
      Actually, I used union instead of age. I originally had age accidentally.
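
      One thing worth checking, offered as a sketch rather than a confirmed diagnosis: -regress, beta- standardizes with standard deviations from the estimation sample, while a plain -summarize- uses every nonmissing observation of each variable. The output in #2 shows that the two samples can differ (2,246 observations in -sum-, 2,242 in the regression). Assuming the model from post #1, the manual calculation can be restricted to e(sample), with the coefficient read from _b[] instead of being retyped. The scalar names b, sd_x and sd_y are illustrative only.

      ```stata
      sysuse nlsw88, clear
      regress union wage hours, beta

      * coefficient on wage, taken from the stored estimation results
      scalar b = _b[wage]

      * SDs computed only over the observations used in the regression
      quietly summarize wage if e(sample)
      scalar sd_x = r(sd)
      quietly summarize union if e(sample)
      scalar sd_y = r(sd)

      * manual standardized coefficient, to compare with the Beta column
      display scalar(b)*scalar(sd_x)/scalar(sd_y)
      ```

      If the displayed value agrees with the Beta column for wage, the discrepancy in post #1 came from the inputs to the formula (sample and transcription) and not from the formula itself.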
