Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • set industry dummies

    Dear Statalist,

    In my panel data, I want to control for time dummies and industry dummies. Industry dummies are set according to 2-digit sic code(1-99). However, my sic codes are 4 digit. How do I set industry dummies according to the first two digits sic code? Do you have any suggestions for the code?
    For example, when sic code is 3826, the first two digits code is 38, I set industry dummies based on the first two digit sic code.

    Also, when including time dummies, it is ok to use code: reg totalstr femaleratio outsiders1 roe logasset RDintensity risk i.year, robust.
    i.year represents year dummies?

    Thanks for any helps.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double year str10 Ticker float(totalcsr totalstr totalcon)    double NumberDirectors float(femaleratio    outsiders1)    str6    gvkey    str4    sic    float(logasset    roe    RDintensity    risk)
    2002 "A"     3  6 -3  8 .125      .875 "126554" "3826"  9.012256    -.2230387   .19450915    .14019261
    2007 "A"     9 11 -2  9 .111  .8888889 "126554" "3826"  8.929832    .1972789   .12638377    .27627748
    2008 "A"     6  8 -2  9 .111  .8888889 "126554" "3826"  8.914223    .2708089   .12192588     .2857335
    2009 "A"     6  8 -2  9 .111  .8888889 "126554" "3826"  8.937481    -.012370312    .1432716     .3815029
    2010 "A"     4  6 -2  8 .125      .875 "126554" "3826"  9.179469    .21137206   .11202636    .22586633
    2011 "A"     5  5  0  9 .111  .8888889 "126554" "3826"  9.111293    .23447637   .09794748    .21331567
    2012 "A"     2  2  0  9 .111  .8888889 "126554" "3826"  9.262553    .22237223    .0975503    .20045558
    2013 "A"     7  7  0  9 .111  .8888889 "126554" "3826"   9.27669    .13688788   .10395163    .25257346
    2004 "AAON"  0  0  0  8    0      .875 "021542" "3585"   4.65612    .10567506  .006186983   .001587045
    2005 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"  4.732736    .14418517 .0090769185 .00051933876
    2006 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"  4.867965    .18705782  .008528471            0
    2007 "AAON"  0  0  0  6    0  .8333333 "021542" "3585"  4.921002    .2426745  .009458435            0
    2008 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"  4.946936    .29619154   .00921262            0
    2009 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"  5.051208    .2349257  .012638514            0
    2010 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"  5.076903    .1875466  .014720796            0
    2011 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"   5.18728    .1141677    .0180302            0
    2012 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"  5.265241    .19870996   .01187672            0
    2013 "AAON"  0  0  0  7    0  .8571429 "021542" "3585"  5.372701    .22879724  .016192315            0
    2002 "AAP"   0  0  0 11 .091  .9090909 "145977" "5531"  7.583362    .13882388           0      .368829
    2004 "AAP"   0  0  0  7 .143  .8571429 "145977" "5531"  7.697104    .26025763           0    .19904976
    2006 "AAP"   0  0  0 10   .1        .9 "145977" "5531"  7.894572    .22439453           0     .1778717
    2007 "AAP"  -2  0 -2  9 .111  .8888889 "145977" "5531"  7.939361    .23277804           0    .18002144
    2009 "AAP"  -2  0 -2  9 .111  .8888889 "145977" "5531"  8.030397    .21083935           0    .06603626
    2010 "AAP"   0  0  0  9 .333  .8888889 "145977" "5531"  8.117973    .3329437           0    .08969336
    2012 "AAP"   0  0  0 10   .2        .9 "145977" "5531" 8.4368105    .3202048           0    .13101113
    2013 "AAP"   0  0  0  9 .111  .8888889 "145977" "5531"  8.624211    .25838062           0    .18916635
    2002 "AAPL"  1  1  0  5    0        .8 "001690" "3663"  8.747988    .015873017   .07784744    .05017466
    2003 "AAPL"  0  0  0  6    0  .8333333 "001690" "3663"  8.826881    .016339095   .07588207            0
    2004 "AAPL"  1  1  0  8    0      .875 "001690" "3663"  8.993427    .05437352    .0590651            0
    2005 "AAPL"  2  2  0  7    0  .8571429 "001690" "3663"  9.354527    .1788106   .03833178            0
    2006 "AAPL"  2  2  0  8    0      .875 "001690" "3663"  9.752955    .19921875   .03686254            0
    2007 "AAPL"  1  2 -1  7    0  .8571429 "001690" "3663" 10.140415    .24057253   .03257519            0
    2008 "AAPL" -1  1 -2  8 .125      .875 "001690" "3663" 10.585877    .2298621   .03414514            0
    2009 "AAPL" -1  1 -2  7 .143  .8571429 "001690" "3663" 10.768506    .26027182   .03106864            0
    2010 "AAPL" -1  5 -6  6 .167  .8333333 "001690" "3663"  11.22768    .2932142  .027320813            0
    2011 "AAPL"  1  4 -3  8 .125      .875 "001690" "3663" 11.664538    .3383411  .022439007            0
    2012 "AAPL" -2  4 -6  8 .125      .875 "001690" "3663" 12.078603    .3530412   .02160273            0
    2013 "AAPL"  2  8 -6  8 .125      .875 "001690" "3663" 12.240474    .2997758   .02618337    .08193237
    2009 "AATI"  0  0  0  6    0  .8333333 "163679" "3674"  4.991412    -.09597123    .3175051            0
    2010 "AATI"  0  0  0  7    0  .8571429 "163679" "3674"  4.933193    -.10421536    .3224078            0
    2009 "ABAT" -1  0 -1 11 .273 .54545456 "002831" "3690"  5.061493    .1619016  .005474969            0
    2010 "ABAT"  0  0  0 11 .273 .54545456 "002831" "3690"  5.427581    .17897487  .002110595            0
    2004 "ABAX" -1  0 -1  6    0  .8333333 "024888" "3845"  4.125488    .4403907   .10148484 .00025848977
    2005 "ABAX" -1  0 -1  6    0  .8333333 "024888" "3845" 4.2628064    .07866444   .09761553            0
    2006 "ABAX" -1  0 -1  6    0  .8333333 "024888" "3845"   4.41978    .10522537   .08888986            0
    2007 "ABAX"  0  0  0  6    0  .8333333 "024888" "3845"  4.631958    .11471097   .07167628            0
    2008 "ABAX"  0  0  0  6    0  .8333333 "024888" "3845"  4.794989    .11947558   .06927828            0
    2009 "ABAX"  0  0  0  6    0  .8333333 "024888" "3845"  4.946708    .09467106   .07920464            0
    2010 "ABAX"  0  0  0  5    0        .8 "024888" "3845"  5.122868    .08852018    .0858081            0
    2011 "ABAX"  0  0  0  6    0  .8333333 "024888" "3845"  5.237824    .08619729   .08333334   .003962605
    2012 "ABAX"  1  1  0  6    0  .8333333 "024888" "3845"  5.203105    .08192884   .07820123   .004306078
    2013 "ABAX"  1  1  0  6    0  .8333333 "024888" "3845"  5.307094    .15584525  .072984815  .0033802036
    2013 "ABBV"  1  4 -3  9 .111  .8888889 "016101" "2836" 10.281856    .9189671    .1699308     .4894856
    2004 "ABC"  -3  0 -3  9 .111  .7777778 "031673" "5122"  9.363405    .10794771           0    .09928872
    2005 "ABC"  -3  0 -3 10   .1        .8 "031673" "5122"  9.339716    .06182779           0    .08360113
    2006 "ABC"  -2  0 -2 10   .1        .8 "031673" "5122"  9.455943    .11294283           0    .08557086
    2007 "ABC"  -2  0 -2  9 .111  .8888889 "031673" "5122"  9.418173    .15135786           0    .09969875
    2008 "ABC"  -1  0 -1  9 .111  .8888889 "031673" "5122"  9.410648    .09245566           0    .09718717
    2009 "ABC"  -1  0 -1  8 .125      .875 "031673" "5122"  9.515819    .185313           0      .086713
    2010 "ABC"  -1  0 -1  9 .222  .8888889 "031673" "5122"    9.5774    .21553284           0    .09304971
    2011 "ABC"  -1  0 -1  9 .222  .8888889 "031673" "5122"   9.61465    .2464803           0    .06493255
    2012 "ABC"   0  0  0  9 .222  .8888889 "031673" "5122"  9.644984    .2926619           0    .09367768
    2013 "ABC"   1  1  0 10   .2        .9 "031673" "5122"  9.847903    .1869632           0     .0738217
    2013 "ABC"   1  1  0 10   .2        .9 "031673" "5122"  9.847903    .1869632           0     .0738217
    2013 "ABC"   0  0  0 10   .2        .9 "031673" "5122"  9.847903    .1869632           0     .0738217
    2010 "ABCD"  0  0  0  8    0       .75 "176939" "7370"  5.948197    -.12861659   .05824782    .42595455
    2011 "ABCD"  0  0  0  9    0  .7777778 "176939" "7370"  5.912638    -.54891753    .0576635    .50437945
    2007 "ABG"  -2  0 -2 11 .182  .9090909 "144640" "5500"  7.609019    .0872181           0    .23501016
    2008 "ABG"  -2  0 -2 11 .182  .8181818 "144640" "5500"  7.411133    -1.517737           0     .3314393
    2009 "ABG"  -2  0 -2 11 .182  .8181818 "144640" "5500"   7.24487    .05500821           0     .3774716
    2010 "ABG"  -1  0 -1 11 .182  .8181818 "144640" "5500"  7.304045    .13270637           0      .359887
    2011 "ABG"  -1  0 -1 10   .2        .9 "144640" "5500"  7.257989    .20789957           0     .3093561
    2012 "ABG"   0  0  0  9 .222  .7777778 "144640" "5500"  7.415416    .2040715           0     .2777176
    2013 "ABG"   2  2  0  9 .222  .7777778 "144640" "5500"  7.543591    .22238076           0     .2876734
    2013 "ABG"   0  2 -2  9 .222  .7777778 "144640" "5500"  7.543591    .22238076           0     .2876734
    2009 "ABII"  0  0  0  8    0      .625 "178855" "2836"  6.973899    -.12184836    .4306225            0
    2004 "ABMD"  1  1  0  8 .125      .875 "013619" "3841" 4.0802627    -.17384423    .5391429            0
    2005 "ABMD"  1  1  0  9 .111  .7777778 "013619" "3841"  4.111873    -.04168817    .3458761            0
    2006 "ABMD"  1  1  0  9 .111  .8888889 "013619" "3841" 4.3635697    -.4237998    .6880009            0
    2007 "ABMD"  1  1  0  8 .125      .875 "013619" "3841"     4.914    -.22835498    .4525854            0
    2008 "ABMD"  0  0  0  8 .125      .875 "013619" "3841"  4.770947    -.4373144    .4171629            0
    2009 "ABMD"  0  0  0  9 .111  .8888889 "013619" "3841"  4.912346    -.27242786   .33942086            0
    2010 "ABMD"  0  0  0  9 .111  .8888889 "013619" "3841"  4.864221    -.17621994   .29673445            0
    2011 "ABMD"  0  0  0  8 .125      .875 "013619" "3841"  4.879676    -.11222707   .25442162            0
    2012 "ABMD"  0  0  0  7 .143  .8571429 "013619" "3841"  5.036375    .011837178    .2034817            0
    2013 "ABMD"  1  1  0  7 .143  .8571429 "013619" "3841"  5.135793    .10952728    .1621955            0
    2004 "ABR"   0  0  0  7    0  .7142857 "160173" "6798"  6.815963    .09542422           0    .18170877
    2005 "ABR"   0  0  0  9 .111  .7777778 "160173" "6798"   7.24142    .1751927           0     .3705381
    2006 "ABR"   0  0  0  9 .111  .7777778 "160173" "6798"  7.698186    .17025372           0     .6246259
    2007 "ABR"   0  0  0 10   .1        .8 "160173" "6798"  7.972981    .21386772           0     .6591153
    2008 "ABR"  -1  0 -1 10   .1        .8 "160173" "6798"  7.855248    -.28906855           0    .55695254
    2009 "ABR"  -1  0 -1 10   .1        .8 "160173" "6798"  7.630837    -2.3382268           0     .8148192
    2010 "ABR"   0  0  0 10   .1        .8 "160173" "6798"  7.456575    .5470048           0     .7250463
    2011 "ABR"   0  0  0 10   .1        .8 "160173" "6798"  7.482521    -.2329352           0     .6160958
    2012 "ABR"   0  0  0  8 .125      .875 "160173" "6798"  7.439489    .09297287           0     .5821112
    2001 "ABT"  -3  1 -4 14 .143  .7857143 "001078" "3845" 10.056055    .17113546    .1785636    .18610123
    2002 "ABT"  -2  1 -3 14 .143  .7857143 "001078" "3845" 10.096547    .26196158   .09440338     .1761802
    2003 "ABT"  -3  1 -4 13 .154  .7692308 "001078" "3845" 10.192993    .2106165   .09317376    .12922646
    2004 "ABT"  -6  1 -7 13 .154  .7692308 "001078" "3845"    10.267    .22587603   .10039417    .16643555
    2005 "ABT"  -5  1 -6 14 .143  .7857143 "001078" "3845" 10.279908    .2339231   .08248034    .15687424
    end

  • #2
    To get the first two digits of the sic code, all you need is:
    Code:
    gen sic2 = substr(sic, 1, 2)
    destring sic2, replace
    I have taken the second step of converting the two digits into an actual numeric variable so that it can be easily used as a factor variable (set of "dummy" variables) in your modeling.

    As for the regression you are proposing, I see problems. I'm not sure what your panel variable is. None of the obvious candidates (Ticker, gvkey) jointly identify unique observations with year. In fact, you have some observations in your example data that are complete duplicates on all of the variables. Fully duplicate observations like that are usually data errors in any data set. And in a panel data set, you must never have two observations that have the same values for the panel and time variable. So I think the data set you are working with needs some rehab before you can use it for anything. And remember that a data set that contains one error is likely to contain others you haven't found yet. So don't just do some quick fix of this immediate problem: carefully review how the data set was created and scrutinize it for other possible errors. Better to find them now than when you are doing a presentation in front of a group and somebody in the audience points out that some of your results can't possibly be right.

    Once you get the data fixed, your regression equation doesn't take into account the nesting of observations within panels (whatever your panel is). That needs to be accounted for. You should be using -xtreg-, not -reg- in order to do that. (And that in turn requires you to -xtset- your data first.) Moreover, you appear to be working with economic or financial data, and in those disciplines one ordinarily uses cluster robust variance estimation unless the data has too few panels to do that.

    Comment


    • #3
      Thanks, Clyde.

      I will fix my duplicates issue.

      Comment

      Working...
      X