Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Industry Fixed Effects (dummy)

    Hi, please help.
    I'm doing panel data now and want to run an OLS regression with year and industry dummies.
    I have my industry data, which is the two digits of SIC code, but I don't know how to transform this two digits of SIC code into an industry dummy variable.
    Can someone please tell me the step and the command to create an industry dummy based on those 2 SIC codes?
    Thank you very much.

  • #2
    Windi:
    welcome to this forum.
    Have you already taken a look at -xtset- and -xtreg- in Stata .pdf manual?
    If this reply is not helpful, please provide an example/excerpt of your data via -dataex- (see the FAQ for further details). Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Hi Carlo, thank you very much for replying.
      No, i haven't.
      But as far as I know, the -xtset- and -xtreg- commands are used when i want to set my panel data and do the regression, right?
      What I asked for help is that the way to transfer my industry code into an industry dummy, then i will run the regression.
      This is the -dataex- :


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str20 CapitalIQID float year double(TotalESGScore IO) float(Tobin LogCap) byte SICCodes
      ""                     2017  49.68000000000001     .         .         .  .
      ""                     2018 61.051403508771905     .         .         .  .
      "IQ000000000000000013" 2013                 73     .         .         .  .
      "IQ000000000000000013" 2014             74.375     .         .         .  .
      "IQ10004874"           2012                 51     .         .         .  .
      "IQ10004874"           2013                 51     .         .         .  .
      "IQ10004874"           2014 50.333333333333336     .         .         .  .
      "IQ10004874"           2015                 50     .         .         .  .
      "IQ10004874"           2016  48.57749999999999     .         .         .  .
      "IQ10004874"           2017              48.12     .         .         .  .
      "IQ100058833"          2016              47.48     .         .         .  .
      "IQ1001491"            2016               44.1     .         .         .  .
      "IQ1002291"            2011 49.083333333333336     .         .         .  .
      "IQ1002291"            2012 52.833333333333336     .         .         .  .
      "IQ1002291"            2013 51.166666666666664     .         .         .  .
      "IQ1002291"            2014              52.75     .         .         .  .
      "IQ1002291"            2015 59.166666666666664     .         .         .  .
      "IQ1002291"            2016 59.050833333333344     .         .         .  .
      "IQ1002291"            2017  57.92583333333334     .         .         .  .
      "IQ1002291"            2018              57.21     .         .         .  .
      "IQ100231"             2011 51.083333333333336  57.4 1.2607083  12.09627 48
      "IQ100231"             2012 56.416666666666664 55.57  1.344063   12.1625 48
      "IQ100231"             2013 62.166666666666664 51.75 1.3092988 12.151797 48
      "IQ100231"             2014  65.08333333333333 54.65 1.2762823  12.06814 48
      "IQ100231"             2015              64.25 54.33 1.2078342  12.26288 48
      "IQ100231"             2016 62.359166666666674 56.52  1.331916 12.472953 48
      "IQ100231"             2017              66.19 57.28  1.253622 12.382952 48
      "IQ100231"             2018  67.79499999999999 55.73 1.0250493 12.243917 48
      "IQ1002679"            2011                  .  8.21  1.741335  3.914021 51
      "IQ1002679"            2012                  .  7.36  1.539241  3.797734 51
      "IQ1002679"            2013                  .   5.1 1.3345155 3.4781585 51
      "IQ1002679"            2014                  .   8.8  1.123041 3.4719665 51
      "IQ1002679"            2015                  . 12.34 1.0012181 3.1738784 51
      "IQ1002679"            2016                  . 19.45 1.0329324 3.3322046 51
      "IQ1002679"            2017                  . 20.85   .983365  3.214868 51
      "IQ1002679"            2018                  . 24.57  .8327345  2.985682 51
      "IQ10028455"           2011                  . 90.47  1.597513  5.525453 37
      "IQ10028455"           2012                  . 65.62   1.45141  5.443716 37
      "IQ10028455"           2013                  . 61.66 1.3482863  5.337057 37
      "IQ10028455"           2014                  . 74.16  1.297417  5.259576 37
      "IQ10028455"           2015                  . 72.83 1.0163226  4.416428 37
      "IQ10028455"           2016              44.51 66.15 1.2336217  5.122773 37
      "IQ10028455"           2017                  . 72.87  1.625463  5.792099 37
      "IQ10028455"           2018                  . 76.07  1.143494  5.171052 37
      "IQ10039433"           2011                  .  9.37         . 4.5020294 28
      "IQ10039433"           2012                  . 10.08         .  3.837299 28
      "IQ10039433"           2013                  . 19.71         . 4.3707128 28
      "IQ10039433"           2014                  . 73.14 1.1040114  5.533785 28
      "IQ10039433"           2015                  . 65.25  .7534756  5.314191 28
      "IQ10039433"           2016              52.58 61.71  .9917679  6.016645 28
      "IQ10039433"           2017                  . 80.01  .7450104  5.298817 28
      "IQ10039433"           2018                  .  47.4  .5757679  3.673766 28
      "IQ10040674"           2016              45.71     .         .         .  .
      "IQ1004106"            2015                 52     .         .         .  .
      "IQ1004106"            2016  53.51999999999999     .         .         .  .
      "IQ1004106"            2017  52.50999999999999     .         .         .  .
      "IQ1004106"            2018              52.14     .         .         .  .
      "IQ10050329"           2011                  . 24.14 1.0117568  5.104733 56
      "IQ10050329"           2012                  .  31.1  1.443689  5.465525 56
      "IQ10050329"           2013                  . 29.21         .  5.618587 56
      "IQ10050329"           2014                  . 30.02         .  5.122773 56
      "IQ10050329"           2015                  . 23.34         .  4.995183 56
      "IQ10050329"           2016                  . 21.29         .  4.985659 56
      "IQ10050329"           2017                  . 30.04         .  5.213304 56
      "IQ10050329"           2018                  .  39.1         .  5.212214 56
      "IQ1006484"            2016              46.11     .         .         .  .
      "IQ100787816"          2016              45.13     .         .         .  .
      "IQ100787816"          2017  48.35916666666666     .         .         .  .
      "IQ100787816"          2018 49.784000000000006     .         .         .  .
      "IQ10081196"           2011 45.166666666666664 66.66 1.0817988  8.982549 73
      "IQ10081196"           2012 49.916666666666664 77.26 1.2530625  9.230751 73
      "IQ10081196"           2013                 50 87.61 1.6348026  9.656819 73
      "IQ10081196"           2014 48.916666666666664 90.44  1.760186  9.778451 73
      "IQ10081196"           2015 48.666666666666664 88.41  1.388603  9.892396 73
      "IQ10081196"           2016  48.61916666666667 90.53 1.5826484   10.1197 73
      "IQ10081196"           2017  46.70666666666667 89.43 1.8773127    10.355 73
      "IQ10081196"           2018 47.126000000000005 91.31  1.985606 10.421803 73
      "IQ1008249"            2014                 56     .         .         .  .
      "IQ1008249"            2015               56.5     .         .         .  .
      "IQ1008249"            2016  57.61333333333334     .         .         .  .
      "IQ1008249"            2017 57.940000000000005     .         .         .  .
      "IQ1008249"            2018 61.077999999999996     .         .         .  .
      "IQ10082848"           2011                  .  45.7         .  5.811739 67
      "IQ10082848"           2012                  . 34.12         .  5.953243 67
      "IQ10082848"           2013                  . 31.84         .  5.958425 67
      "IQ10082848"           2014                  .  37.5         .  5.850189 67
      "IQ10082848"           2015                  . 45.88         .  5.798183 67
      "IQ10082848"           2016                  . 39.28         .  5.911338 67
      "IQ10082848"           2017                  .  50.4         .  5.841804 67
      "IQ10082848"           2018                  . 51.37         .  5.700444 67
      "IQ100846258"          2016                 70     .         .         .  .
      "IQ100846258"          2017  69.67999999999999     .         .         .  .
      "IQ100918"             2011                  . 72.55 1.0574446   7.31262 49
      "IQ100918"             2012                  . 70.34 1.0321288  7.273023 49
      "IQ100918"             2013                  .  71.1 1.0778832  7.433903 49
      "IQ100918"             2014                  . 67.74 1.1225846  7.696258 49
      "IQ100918"             2015                  . 69.81 1.1263708  7.697938 49
      "IQ100918"             2016              61.12 71.59 1.1496023  7.850376 49
      "IQ100918"             2017                  . 84.08  1.285128  8.106726 49
      "IQ100918"             2018                  . 79.06 1.1743793   7.93394 49
      end

      the industry data is "SICCodes", it's still not transform into a dummy variable.
      Please help Carlo, thanks.
      Last edited by Windy Indri; 13 Jun 2020, 15:40.

      Comment


      • #4
        several things about your question are unclear to me - what is your outcome variable; I assume the SIC codes are covariates, correct? there are 8 distinct codes; do you want 7 indicator (dummy) variables or do you want to collapse them in some way?, etc.

        if you want 7 indicator variables, just use factor variable notation to get them; see
        Code:
        help fvvarlist

        Comment


        • #5
          Windy:
          thanks for providing a -dataex- example.
          You may want to -label- -SICCodes- and use the resulting variable as a categorical predictor thanks to the -i.- prefix of -fvvarlist- notation, as per RIch's helpful advice.
          Kind regards,
          Carlo
          (Stata 19.0)

          Comment


          • #6
            Hi Rich and Carlo, thanks for the tip!

            I hope i understand it correctly, please review my regression command (attached):




            so i run 3 regressions, OLS without industry dummy, OLS with industry and year dummies, and FE estimator with year dummy.
            am i doing it right?
            Thank you very much.
            Attached Files
            Last edited by Windy Indri; 14 Jun 2020, 07:00.

            Comment


            • #7
              Windy:
              please do not attach images, screenshots and the like, but use CODE delimiters to share what you typed and what Stata gave you back (see the FAQ on this and other posting-related topics). Thanks.
              That said:
              1) if you a panel dataset, why starting off with -regress- instead of -xtreg-?
              2) if you're intetested in -fe- specification, I would focus on -xtreg,fe- with -i.time- (industry -fe- will be automatically captured by Stata via -xtset-) instead of using -regress-.
              Code:
              Code:
              While it'st rue that you'll obatin the same results as far as the predictors are concerned, -xtreg,fe- wlll give you more details about the evidence of the group-wise effect (if any):
              use "https://www.stata-press.com/data/r16/nlswork.dta" . reg ln_wage c.age##c.age i.idcode i.year if idcode<=4 Source | SS df MS Number of obs = 50 -------------+---------------------------------- F(19, 30) = 3.05 Model | 4.79792411 19 .252522321 Prob > F = 0.0031 Residual | 2.48217063 30 .082739021 R-squared = 0.6590 -------------+---------------------------------- Adj R-squared = 0.4431 Total | 7.28009473 49 .148573362 Root MSE = .28764 ------------------------------------------------------------------------------ ln_wage | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | .2364322 .3462718 0.68 0.500 -.4707491 .9436135 | c.age#c.age | -.0056102 .0011932 -4.70 0.000 -.0080471 -.0031734 | idcode | 2 | -.394374 .1194999 -3.30 0.002 -.6384254 -.1503226 3 | .1010589 2.101252 0.05 0.962 -4.19027 4.392388 4 | .5522662 2.131432 0.26 0.797 -3.800699 4.905231 | year | 69 | .224994 .5374605 0.42 0.678 -.8726469 1.322635 70 | .1649135 .7700301 0.21 0.832 -1.407698 1.737525 71 | .1712431 1.09509 0.16 0.877 -2.065229 2.407715 72 | .3136266 1.433598 0.22 0.828 -2.614171 3.241424 73 | .4386164 1.777538 0.25 0.807 -3.1916 4.068833 75 | .575746 2.473349 0.23 0.818 -4.475507 5.626999 77 | .651904 3.179439 0.21 0.839 -5.841377 7.145185 78 | .9314592 3.5311 0.26 0.794 -6.280009 8.142927 80 | .9700889 4.23391 0.23 0.820 -7.676708 9.616886 82 | 1.063147 4.948233 0.21 0.831 -9.042493 11.16879 83 | 1.379563 5.299088 0.26 0.796 -9.442619 12.20175 85 | 1.854813 6.012334 0.31 0.760 -10.42401 14.13364 87 | 2.153058 6.728032 0.32 0.751 -11.58742 15.89353 88 | 2.571633 7.355941 0.35 0.729 -12.4512 17.59447 | _cons | -1.093667 5.581352 -0.20 0.846 -12.49231 10.30497 ------------------------------------------------------------------------------ . xtreg ln_wage c.age##c.age i.year if idcode<=4, fe Fixed-effects (within) regression Number of obs = 50 Group variable: idcode Number of groups = 4 R-sq: Obs per group: within = 0.5342 min = 11 between = 0.0151 avg = 12.5 overall = 0.2227 max = 15 F(16,30) = 2.15 corr(u_i, Xb) = -0.6249 Prob > F = 0.0342 ------------------------------------------------------------------------------ ln_wage | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | .2364322 .3462718 0.68 0.500 -.4707491 .9436135 | c.age#c.age | -.0056102 .0011932 -4.70 0.000 -.0080471 -.0031734 | year | 69 | .224994 .5374605 0.42 0.678 -.8726469 1.322635 70 | .1649135 .7700301 0.21 0.832 -1.407698 1.737525 71 | .1712431 1.09509 0.16 0.877 -2.065229 2.407715 72 | .3136266 1.433598 0.22 0.828 -2.614171 3.241424 73 | .4386164 1.777538 0.25 0.807 -3.1916 4.068833 75 | .575746 2.473349 0.23 0.818 -4.475507 5.626999 77 | .651904 3.179439 0.21 0.839 -5.841377 7.145185 78 | .9314592 3.5311 0.26 0.794 -6.280009 8.142927 80 | .9700889 4.23391 0.23 0.820 -7.676708 9.616886 82 | 1.063147 4.948233 0.21 0.831 -9.042493 11.16879 83 | 1.379563 5.299088 0.26 0.796 -9.442619 12.20175 85 | 1.854813 6.012334 0.31 0.760 -10.42401 14.13364 87 | 2.153058 6.728032 0.32 0.751 -11.58742 15.89353 88 | 2.571633 7.355941 0.35 0.729 -12.4512 17.59447 | _cons | -1.036501 6.663463 -0.16 0.877 -14.64511 12.57211 -------------+---------------------------------------------------------------- sigma_u | .38900632 sigma_e | .28764391 rho | .64651254 (fraction of variance due to u_i) ------------------------------------------------------------------------------ F test that all u_i=0: F(3, 30) = 8.22 Prob > F = 0.0004 .
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Hi Carlo, okay, will do it for the next post.
                Thank you very much.

                I just want to compare the result since as far as i know, FE estimator is insufficient if reverse causality is present, which is likely to be the case for my model.
                you say:
                2) if you're intetested in -fe- specification, I would focus on -xtreg,fe- with -i.time- (industry -fe- will be automatically captured by Stata via -xtset-) instead of using -regress-.

                But i set my xtset as :companyid year, since my panel data is firm-year observation.
                will stata still able to capture the industry fe?
                Thank you very much.

                Comment


                • #9
                  Windy:
                  1) if you have a source of endogeneity, you can consider -xtivreg-;
                  2) assuming that the industry remain unchanged for each firm as time goes by, -industry- will be wiped out by the -fe- machinery, as time-invariant predictor.
                  As per you -xtset- procedure, -xtreg- will capture the -companyid- -fe-.
                  Kind regards,
                  Carlo
                  (Stata 19.0)

                  Comment


                  • #10
                    Hi Carlo, okay well noted.
                    Thank you very much.
                    I have another question about winsorization,
                    Command: winsor Tobin, gen(WTobin) p(0.005)

                    is this what it means by winsorize Tobin at the 1st and 99th percentiles?

                    Thank you very much.

                    Comment


                    • #11
                      Windy:
                      sorry, I am not familiar with the community-contributed programme (as per FAQ you're kindly requested to specify) -winsor- by @Nick Cox.
                      Kind regards,
                      Carlo
                      (Stata 19.0)

                      Comment

                      Working...
                      X