
  • Regress dummy variable correctly

    Hello,

    I have a categorical variable "cellar" which has 3 values (1 large basement, 2 small basement, 3 crawl space), and I have generated dummy variables from it using:

    Code:
    tab cellar, gen(c)
    To run a regression with logprice (the log of the house price) as the dependent variable, should I use:

    Code:
    reg logprice i.cellar
    or

    Code:
    reg logprice c2 c3
    where c1 is left out and serves as the base group?

    Which one is correct?

    Thanks in advance.
    Last edited by Mat Sko; 11 May 2019, 17:38.

  • #2
    The two give the same results, although it is good practice to use factor-variable notation (i.cellar), so I would use that one.

    Code:
    * Toy data created for illustration
    * Shared using the dataex command (dataex id price cellar). To install: ssc install dataex
    clear
    input byte id int price byte cellar
     1 316 2
     2 252 3
     3 334 3
     4 282 3
     5 304 2
     6 154 1
     7 218 2
     8 422 3
     9 478 3
    10 251 1
    11 193 2
    12 457 1
    13 354 3
    14 189 3
    15 116 3
    16 469 1
    17 372 1
    18 313 2
    19 208 1
    20 119 1
    21 307 3
    22 272 2
    23 457 3
    24 417 3
    25 350 1
    end
    
    gen ln_price = ln(price)
    label define cellar_desc 1 "Large basement" 2 "Small basement" 3 "Crawl Space"
    label values cellar cellar_desc
    tabulate cellar, gen(c)  // creating the indicator variables
    
    . reg ln_price i.cellar
    
          Source |       SS           df       MS      Number of obs   =        25
    -------------+----------------------------------   F(2, 22)        =      0.33
           Model |  .115257474         2  .057628737   Prob > F        =    0.7219
        Residual |   3.8334791        22   .17424905   R-squared       =    0.0292
    -------------+----------------------------------   Adj R-squared   =   -0.0591
           Total |  3.94873657        24   .16453069   Root MSE        =    .41743
    
    ------------------------------------------------------------------------------
        ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          cellar |
              2  |  -.0127374   .2254388    -0.06   0.955    -.4802688     .454794
              3  |   .1309974   .1939638     0.68   0.506    -.2712589    .5332537
                 |
           _cons |   5.591398   .1475843    37.89   0.000     5.285326    5.897469
    ------------------------------------------------------------------------------
    
    
    . reg ln_price c2 c3
    
          Source |       SS           df       MS      Number of obs   =        25
    -------------+----------------------------------   F(2, 22)        =      0.33
           Model |  .115257474         2  .057628737   Prob > F        =    0.7219
        Residual |   3.8334791        22   .17424905   R-squared       =    0.0292
    -------------+----------------------------------   Adj R-squared   =   -0.0591
           Total |  3.94873657        24   .16453069   Root MSE        =    .41743
    
    ------------------------------------------------------------------------------
        ln_price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              c2 |  -.0127374   .2254388    -0.06   0.955    -.4802688     .454794
              c3 |   .1309974   .1939638     0.68   0.506    -.2712589    .5332537
           _cons |   5.591398   .1475843    37.89   0.000     5.285326    5.897469
    ------------------------------------------------------------------------------
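
    One practical reason to prefer i.cellar is that factor-variable notation lets you test the categories jointly or switch the base group without creating new variables. Here is a minimal sketch, assuming the toy data above (re-entered so the block runs on its own). Using crawl space as the alternative base is only for illustration.

    ```stata
    clear
    input byte id int price byte cellar
     1 316 2
     2 252 3
     3 334 3
     4 282 3
     5 304 2
     6 154 1
     7 218 2
     8 422 3
     9 478 3
    10 251 1
    11 193 2
    12 457 1
    13 354 3
    14 189 3
    15 116 3
    16 469 1
    17 372 1
    18 313 2
    19 208 1
    20 119 1
    21 307 3
    22 272 2
    23 457 3
    24 417 3
    25 350 1
    end
    gen ln_price = ln(price)

    * Default base is the lowest level (1 = large basement)
    regress ln_price i.cellar
    scalar b3 = _b[3.cellar]      // crawl space relative to large basement

    * Joint test that the cellar categories do not differ
    testparm i.cellar

    * Same model, but with level 3 (crawl space) as the base group
    regress ln_price ib3.cellar

    * Large basement vs. crawl space is the earlier coefficient with the sign flipped
    assert reldif(_b[1.cellar], -b3) < 1e-6
    ```

    Because cellar is the only regressor here, the testparm result is the same test as the overall F statistic in the regression header. With c2 and c3 you would have to pick a different pair of indicators by hand to change the base group.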



    • #3
      Thanks for the reply David.

      Are they also the same in case of a multiple regression?

      For instance:

      Code:
      reg logprice i.yearquarter loglot unitsf floors i.cellar



      • #4
        Yes (although test it for yourself and see--Stata won't care! :-)
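
        A quick way to run that test, sketched with Stata's shipped auto dataset rather than the housing data (which is not posted in this thread). Here rep78 stands in for cellar, and weight and length stand in for the continuous covariates. These stand-ins are assumptions for illustration only.

        ```stata
        sysuse auto, clear
        tabulate rep78, generate(r)        // creates indicator variables r1-r5

        * Factor-variable version: level 1 of rep78 is the base group
        regress price i.rep78 weight length
        scalar b2_fv = _b[2.rep78]
        scalar bw_fv = _b[weight]

        * Hand-made indicators: r1 left out as the base group
        regress price r2 r3 r4 r5 weight length

        * The coefficients agree up to floating-point precision
        assert reldif(_b[r2], b2_fv) < 1e-6
        assert reldif(_b[weight], bw_fv) < 1e-6
        ```

        If the assert lines run without error, the two specifications agree on those coefficients. The same check can be repeated on your own model by swapping in your variable names.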
