  • fixed effects model with dummies and sub-sample period

    Dear all,
    I am new to Stata, so I lack experience and hope to receive your help. I am trying to run a fixed effects model with binary industry and year dummies. My dependent variable is firm leverage, and the independent variables are firm size, profitability, tangibility, growth, non-debt tax shield, GDP growth rate, and inflation rate (the last two are pure time-series data). I have 1 country and 597 companies over a 15-year period. My problems are:

    1. My supervisor said that I can include both year and industry dummies in my model, but Stata always omits my industry dummies. I googled this, and some say that the fixed effects model already controls for individual effects, so industry dummies are not necessary. I do not know which is correct.
    2. I split my full sample into two sub-sample periods: before and after the financial crisis. The signs of the coefficients in both sub-samples contradict those of the full sample, and all of them are significant at 1%. Also, the coefficient in one sub-sample is unreasonably large compared with the others. So, did I make a mistake?

    I created the dummies in Excel.

    Please help me with those problems as I am kinda desperate. Many thanks in advance.

  • #2
    On 1): A fixed effects model will not estimate coefficients for variables that do not change over time, since they are collinear with the fixed effects.
    Jorge Eduardo Pérez Pérez
    www.jorgeperezperez.com



    • #3
      First, do not create dummies in Excel. Use factor-variable notation in Stata. It is much easier to debug and also means that you can use the margins command (which automatically handles interactions, for example, if they are specified with factor-variable notation). Likewise, instead of putting in the firm dummies manually, xtset your data and use xtreg.

      If you have set your data to be panel data where the panel is the firm, then xtreg and similar models will automatically put in fixed or random effects for firms. To restate what Jorge said in a different way, since firms almost never change industry, the firm effects also include any industry effects. [That is, if you just add up the dummies for the companies in the industry, you get a variable identical to the industry variable.] So, yes, you cannot estimate a model with firm and industry effects, but you don't need to do so.
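      A minimal sketch of this point on simulated data, not the poster's dataset. The variable names (firm, industry, year, size, leverage) and the data-generating process are assumptions made for illustration. Because industry never changes within a firm here, xtreg with the fe option should report the industry indicators as omitted because of collinearity, while the year indicators are estimated.

```stata
* Simulated panel: 50 hypothetical firms observed for 15 years
clear
set seed 12345
set obs 50
generate firm = _n
generate industry = mod(firm, 5) + 1      // industry never changes within a firm
expand 15
bysort firm: generate year = 1999 + _n    // years 2000-2014
generate size = rnormal()
generate leverage = 0.5*size + rnormal()

* Confirm that industry is time-invariant within each firm
bysort firm (year): assert industry == industry[1]

* Declare the panel and use factor-variable notation instead of hand-made dummies
xtset firm year
xtreg leverage size i.industry i.year, fe
```

      If the assert line fails on real data, some firms do change industry, and the industry indicators would then be identified only from those switchers.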

      Diagnosing unstable coefficients is much trickier. First, make sure you do it with xtreg and see if anything changes. Do your sample sizes for the two periods add up to the sample size for the full sample estimate? Do you have a variable that is zero for almost all the observations and then non-zero for a few? If it still doesn't make sense, then try a regression with dummies and ask for collinearity diagnostics. Maybe something collinear is producing the unstable effects.
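      Two of these checks could be sketched as follows, again on simulated data. The variables rare and tang are hypothetical: rare is built to be non-zero in only a few rows, and tang is built to be almost collinear with size. estat vif after regress is one collinearity diagnostic Stata offers; it is used here as an assumption about what "collinearity diagnostics" might mean in practice.

```stata
* Simulated panel with one rarely non-zero variable and one near-collinear pair
clear
set seed 2024
set obs 50
generate firm = _n
expand 15
bysort firm: generate year = 1999 + _n
generate size = rnormal()
generate tang = 0.9*size + 0.1*rnormal()   // deliberately almost collinear with size
generate rare = (runiform() < 0.01)        // hypothetical: non-zero in about 1% of rows
generate leverage = 0.5*size + rnormal()

* Check 1: how many observations carry a non-zero value?
count if rare != 0
display "non-zero share of rare: " r(N)/_N

* Check 2: variance inflation factors after an ordinary regression
regress leverage size tang rare
estat vif
```

      Large variance inflation factors for size and tang would point to the kind of near-collinearity that can make coefficients swing between samples.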

      If you cannot solve the problem, then upload the data and program following the recommendations in the FAQ for this list. Someone (maybe me) might take a look at it for you.



      • #4
        In reply to Phil Bromiley (#3):

        Many thanks to both of you. Now I understand the first problem.
        Dear Phil Bromiley,
        I tried generating the dummy variables in Stata (version 12.0), and the results do not change (I still get different results for the full sample and the subsamples). I don't have the type of variable that you mentioned (a variable that is zero for almost all the observations and then non-zero for a few). My two subsample periods (2000-2006 and 2007-2014) add up to my full sample (2000-2014).

        Here are the steps I followed:
        - Declare dataset to be panel data: . xtset COMPANY YEAR
        - Generate year dummies: . tabulate YEAR, gen(y)
        - . xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP1 y2 y3 y4 y5 y6 y7 y8 y9 y12 y13 y14, fe (more than one year dummy is omitted because two of my variables, GDP and inflation, do not vary across firms)

        Then I split the data into the two subsamples in Excel and used a similar procedure (as I don't know how to select a subsample in Stata).
        For 2000-2006: . xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP y2 y7
        For 2007-2014: . xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP y2 y4 y5 y6 y7

        I have uploaded my file and hope that someone can help me.
        Many thanks in advance.


        • #5
          I would let Stata add the years using "i.YEAR". Also, if you want to run regressions on just the years 2000-2006, you need to restrict the sample. You can do this with an -if- qualifier. For example, to run a regression using just the years 2000-2006, include -if YEAR<=2006- in the command, placed before the comma that introduces the options.
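          A short sketch of both suggestions on simulated data (the variable names and the data-generating process are assumptions, not the poster's data). It also stores the estimation sample size from each run, so the two subsample counts can be compared with the full-sample count.

```stata
* Simulated panel: 50 hypothetical firms, years 2000-2014
clear
set seed 777
set obs 50
generate firm = _n
expand 15
bysort firm: generate year = 1999 + _n
generate growth = rnormal()
generate leverage = 0.3*growth + rnormal()
xtset firm year

* Full sample, year effects via factor-variable notation
xtreg leverage L.growth i.year, fe
scalar n_full = e(N)

* 2000-2006 only: the if qualifier goes before the comma
xtreg leverage L.growth i.year if year <= 2006, fe
scalar n_early = e(N)

* 2007-2014 only
xtreg leverage L.growth i.year if year >= 2007, fe
scalar n_late = e(N)

display "full = " n_full "   early + late = " n_early + n_late
```

          With the if qualifier, Stata still takes the lag from the full xtset data, so 2007 stays in the later subsample. If the periods are instead split into separate files, each file loses its own first year to the lag, and the counts need not add up.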

          You also said that these data come from a single country. As such, I imagine that the GDP and Inflation variables only vary by year. Note that when you include any fixed effects, you cannot include other variables that do not vary within a fixed effect. Let's take the year fixed effects that you are wanting to add as an example. In order to mathematically calculate all of the slope coefficients, all other variables that are included must vary within each year. What I mean by this is that the other variables must take on different values during the same year. With GDP, this is not the case (I assume, anyway, since you said this is from a single country) because all of your observations for a year have the exact same value for GDP. Therefore, two of the coefficients will be dropped (two of the coefficients from among GDP, inflation, and the year dummies). In essence, including the year fixed effects already controls for GDP and inflation, as they are actually already included in the year fixed effects! If you want to include the GDP and inflation variables (by which I mean, if you want to be able to calculate their effects), you will have to drop year fixed effects.
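          The argument can be checked on simulated data. In this sketch the gdp series is hypothetical and is constructed to take one value per year; the other names are assumptions as well. Which term Stata reports as omitted in the first regression (gdp or one of the year indicators) may depend on the ordering of the variables, so the point is only that something must be dropped.

```stata
* Simulated panel with a macro variable that varies only by year
clear
set seed 4242
set obs 50
generate firm = _n
expand 15
bysort firm: generate year = 1999 + _n
generate gdp = 3 + sin(year)               // hypothetical series: one value per year
generate size = rnormal()
generate leverage = 0.5*size + 0.2*gdp + rnormal()

* gdp takes a single value within each year
bysort year (gdp): assert gdp[1] == gdp[_N]

sort firm year
xtset firm year

* With a full set of year effects, gdp is collinear with them and a term is omitted
xtreg leverage size gdp i.year, fe

* Without year effects, a coefficient on gdp can be estimated
xtreg leverage size gdp, fe
```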

          I hope this helps.



          • #6
            In reply to Joshua D Merfeld (#5):

            Yes, it did help me a lot. Thank you. I can run the subsamples with the if qualifier now. I now use only the entity (firm) fixed effects model (the results are the same as you described). Please check my PDF file for my results. As you can see, the coefficient of GROWTH in the full sample is positive, while in the two subsamples it is negative, and all of them are significant. The coefficient of Infl in the 2000-2006 subsample is too large compared with its values in the other samples.

            I really don't know if I have made a mistake.


            • #7
              You are still including year FE (the y2 y3... variables). Moreover, you seem not to be including all of them (y10 and y11?). I would drop them all. Also, can you please post the regressions you ran in the body of your text in your response to this? I would like to see the entire command (which I cannot because it gets cut off on the pdf file).



              • #8
                In reply to Joshua D Merfeld (#7):

                Dear Joshua,
                Despite dropping the year dummies, I still have problems with the growth and inflation variables.
                To be honest, my supervisor said that I can include both year and industry dummies. Since I could not include industry dummies, I at least want to try year dummies. Below are the regressions from my PDF file:
                - Full sample: . xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP y2 y3 y4 y5 y6 y7 y8 y9 y12 y13 y14,fe
                - 2000-2006: . xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP y2 y7 if YEAR<=2006, y2 y7, fe
                - 2007-2014: . xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP y9 y12 y13 y14 if YEAR>=2007, fe
                Many thanks



                • #9
                  Trang,

                  I need to slightly change my earlier response and say that if you try to add GDP and Inflation, only those two will be dropped. The reason is that they are collinear with the year FE. In your examples, the coefficients are not being dropped because you have omitted y10 and y11 from the regressions. However, in this case the coefficients on GDP and inflation are NOT the overall effect of GDP and inflation in your sample. It is not a matter of "it would be better not to include GDP and inflation" but rather it is simply impossible--mathematically--to include GDP, inflation, industry FE, and year FE.

                  This is just the way the math works out, unfortunately. If I run the following regression:

                  Code:
                  xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP i.YEAR, fe
                  then both GDP and inflation are dropped from the model. However, if I run

                  Code:
                  xtreg LEVTOTAL L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS L.Infl L.GDP, fe
                  then I get coefficients for both GDP and inflation. Unfortunately, it is simply not possible to include year FE and GDP/inflation. There is no way around this. Sorry!



                  • #10
                    In reply to Joshua D Merfeld (#9):

                    Dear Joshua,
                    Many thanks for your quick response. So, should I drop inflation and GDP rather than including them with some (not all) year dummies? Yes, I gave up on industry dummies. I will probably use random effects in that case.
                    One question: I tried running the regression without the macroeconomic variables, including just L.PROFITABILITY L.GROWTH L.SIZE L.TANG L.NDTS and year dummies. However, the coefficient of GROWTH in the full sample is still positive, while those in the two sub-samples are negative, and all of them are significant at 1%. That is not reasonable, is it?
                    Regards.
