  • question about reg a b c i.year, robust

    I have a question about the command reg a b c i.year, robust.
    What is the difference between the plain command reg a b c and reg a b c i.year, robust?
    Forgive me if this is a stupid question, lol. I've just started using Stata and am running into many difficulties.

  • #2
    There are two differences. First, -reg a b c i.year, robust- contains an additional set of year indicator variables that is not included in -reg a b c-, so it adjusts for year effects on a that -reg a b c- ignores. Second, there is the -robust- option. It causes Stata to calculate standard errors using the Huber-White sandwich estimator, which is robust to heteroskedasticity (that is, to violations of the homoscedasticity assumption). The coefficient estimates themselves are unaffected by -robust-; only the standard errors, and hence the t statistics, p-values, and confidence intervals, change.
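
    As a minimal sketch of the two differences, the commands below fit both versions on a stand-in dataset. The Grunfeld panel and its variables (invest, mvalue, kstock, year) are an assumption used only for illustration, not the original poster's data, and -webuse- needs internet access. -vce(robust)- is the current spelling of the -robust- option.

    ```stata
    * Assumption: Grunfeld data stand in for the poster's a, b, c, and year
    webuse grunfeld, clear

    * Plain OLS: no year indicators, conventional standard errors
    regress invest mvalue kstock

    * Adds one indicator per year (lowest year omitted as the base level)
    * and reports Huber-White (sandwich) standard errors
    regress invest mvalue kstock i.year, vce(robust)

    * Joint test that all year indicators are zero
    testparm i.year
    ```

    Comparing the two outputs shows the extra year rows and the "Robust" label above the standard-error column in the second fit.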



    • #3
      Thank you for the answer, Mr. Clyde, it really helps me, but I'd like to ask one more question.

      This is my result:

      Linear regression                                      Number of obs =     165
                                                             F(  8,   156) =   10.47
                                                             Prob > F      =  0.0000
                                                             R-squared     =  0.3667
                                                             Root MSE      =  1.7748

      ------------------------------------------------------------------------------
                   |               Robust
              rktp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
            lnkukm |   .8444457   .1041672     8.11   0.000     .6386855    1.050206
            lnpdrb |  -1.119184   .1500822    -7.46   0.000    -1.415639   -.8227281
            sbriil |  -.1073237   .2467111    -0.44   0.664     -.594649    .3800017
               tpt |  -.1463694   .0815211    -1.80   0.075     -.307397    .0146581
                   |
           Periode |
             2012  |   .2688198   .4524748     0.59   0.553    -.6249481    1.162588
             2013  |    .772571   .5743621     1.35   0.181    -.3619593    1.907101
             2014  |   .8103533   .4502405     1.80   0.074    -.0790011    1.699708
             2015  |   1.298018   .4921571     2.64   0.009     .3258662     2.27017
                   |
             _cons |   21.12145   2.428598     8.70   0.000     16.32427    25.91862
      ------------------------------------------------------------------------------

      My data cover five years, 2011 to 2015, so why does the result only show 2012 to 2015?
      Pardon me if this is a stupid question, haha, I am really new to this statistics world.



      • #4
        In this case 2011 is the reference level, so its coefficient is not shown. You can change the reference level. Please take a look at the manual (see -help fvvarlist-).

        Here is a toy example:

        Code:
        . sysuse auto
        (1978 Automobile Data)

        . tab rep78

             Repair |
        Record 1978 |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  1 |          2        2.90        2.90
                  2 |          8       11.59       14.49
                  3 |         30       43.48       57.97
                  4 |         18       26.09       84.06
                  5 |         11       15.94      100.00
        ------------+-----------------------------------
              Total |         69      100.00

        . regress mpg i.rep78

              Source |       SS           df       MS      Number of obs   =        69
        -------------+----------------------------------   F(4, 64)        =      4.91
               Model |  549.415777         4  137.353944   Prob > F        =    0.0016
            Residual |  1790.78712        64  27.9810488   R-squared       =    0.2348
        -------------+----------------------------------   Adj R-squared   =    0.1869
               Total |   2340.2029        68  34.4147485   Root MSE        =    5.2897

        ------------------------------------------------------------------------------
                 mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
               rep78 |
                  2  |     -1.875   4.181884    -0.45   0.655    -10.22927    6.479274
                  3  |  -1.566667   3.863059    -0.41   0.686    -9.284014    6.150681
                  4  |   .6666667   3.942718     0.17   0.866    -7.209818    8.543152
                  5  |   6.363636   4.066234     1.56   0.123    -1.759599    14.48687
                     |
               _cons |         21   3.740391     5.61   0.000     13.52771    28.47229
        ------------------------------------------------------------------------------

        . regress mpg ib3.rep78

              Source |       SS           df       MS      Number of obs   =        69
        -------------+----------------------------------   F(4, 64)        =      4.91
               Model |  549.415777         4  137.353944   Prob > F        =    0.0016
            Residual |  1790.78712        64  27.9810488   R-squared       =    0.2348
        -------------+----------------------------------   Adj R-squared   =    0.1869
               Total |   2340.2029        68  34.4147485   Root MSE        =    5.2897

        ------------------------------------------------------------------------------
                 mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
               rep78 |
                  1  |   1.566667   3.863059     0.41   0.686    -6.150681    9.284014
                  2  |  -.3083333   2.104836    -0.15   0.884    -4.513226    3.896559
                  4  |   2.233333   1.577087     1.42   0.162    -.9172607    5.383927
                  5  |   7.930303    1.86452     4.25   0.000     4.205497    11.65511
                     |
               _cons |   19.43333   .9657648    20.12   0.000       17.504    21.36267
        ------------------------------------------------------------------------------
        
        . regress mpg i.rep78, base
        
              Source |       SS           df       MS      Number of obs   =        69
        -------------+----------------------------------   F(4, 64)        =      4.91
               Model |  549.415777         4  137.353944   Prob > F        =    0.0016
            Residual |  1790.78712        64  27.9810488   R-squared       =    0.2348
        -------------+----------------------------------   Adj R-squared   =    0.1869
               Total |   2340.2029        68  34.4147485   Root MSE        =    5.2897
        
        ------------------------------------------------------------------------------
                 mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
               rep78 |
                  1  |          0  (base)
                  2  |     -1.875   4.181884    -0.45   0.655    -10.22927    6.479274
                  3  |  -1.566667   3.863059    -0.41   0.686    -9.284014    6.150681
                  4  |   .6666667   3.942718     0.17   0.866    -7.209818    8.543152
                  5  |   6.363636   4.066234     1.56   0.123    -1.759599    14.48687
                     |
               _cons |         21   3.740391     5.61   0.000     13.52771    28.47229
        ------------------------------------------------------------------------------
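
        Beyond the -ib#.- prefix shown above, there are a few other ways to choose the reference level. The sketch below stays with the auto data; the last, commented-out line is only an assumption about how this would carry over to the model in #3, using the variable names from that output.

        ```stata
        sysuse auto, clear

        * Use the highest level (5) as the reference
        regress mpg ib(last).rep78

        * Use the most frequent level as the reference (3 in these data)
        regress mpg ib(freq).rep78

        * Set the base level once so that plain i.rep78 uses it from now on
        fvset base 3 rep78
        regress mpg i.rep78, baselevels

        * Restore the default base level
        fvset clear rep78

        * Hypothetical analogue for the model in #3 (needs the poster's data):
        * regress rktp lnkukm lnpdrb sbriil tpt ib2015.Periode, vce(robust)
        ```

        Changing the reference level changes what each indicator coefficient is compared against, but not the overall fit: the F statistic, R-squared, and Root MSE stay the same, as in the outputs above.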
        Last edited by Marcos Almeida; 06 Dec 2018, 21:05.
        Best regards,

        Marcos



        • #5
          Marcos is 100% correct. Just to amplify what he said, in case you are not familiar with reference categories: in regression, a categorical variable is represented by a separate 0/1 variable for each level (each value the variable takes on in the data) except one. So, for example, sex (male/female) would be represented by either a variable with female = 1 & male = 0 or a variable with male = 1 & female = 0, but not both. A variable with three values, say orange, apple, or banana, could be represented by any two of orange = 1 & apple/banana = 0, apple = 1 & orange/banana = 0, or banana = 1 & orange/apple = 0. The coefficient on each indicator is then the expected difference in the outcome between that level and the omitted level, holding the other predictors constant. As Marcos points out, you can control which level gets omitted (the omitted level is called the reference category). If you don't tell Stata which one to use, Stata will pick one, by default the lowest-numbered one. That is why 2011 does not appear in your output: each Periode coefficient is a difference from 2011.
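
          The 0/1 representation described above can be made concrete by building the indicators by hand. This is a sketch on the auto data from #4; the stub name rep_ is an arbitrary choice for illustration.

          ```stata
          sysuse auto, clear

          * Create one 0/1 variable per level of rep78: rep_1, rep_2, ..., rep_5
          tabulate rep78, generate(rep_)

          * Enter all indicators except rep_1, so level 1 is the reference category
          regress mpg rep_2 rep_3 rep_4 rep_5

          * Factor-variable notation builds the same indicators behind the scenes
          regress mpg i.rep78

          * With all five indicators plus the constant, the set is perfectly
          * collinear, so Stata omits one of them and says so in a note
          regress mpg rep_1 rep_2 rep_3 rep_4 rep_5
          ```

          The first two regressions should report the same coefficients, which is a quick way to see that -i.rep78- is shorthand for the indicator variables rather than a different model.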

          This is a basic aspect of regression with categorical variables. If that is new to you (and we were all beginners once) you should pick up an elementary textbook on regression (or general statistics with a chapter on regression) and read about the use of "dummy" or "indicator" variables to represent categorical variables.

