
  • #16
    Originally posted by Minhaj uddin View Post
    Hello @Sebastian Kripfganz sir, based on your above suggestion, I have used the following model to estimate system GMM with an orthogonal option.

    1.)
    Could you please help me decide if it is appropriate or not?

    xtabond2 INT L.INT L2.INT ROA LNTA LnTASq DivNIITI MQOETA HHITAFinal Inflation GDP GC CO PV PB c.GC#c.PB c.GC#c.PV c.GC#c.LNTA c.CO#c.PB c.CO#c.PV c.CO#c.LNTA , gmm(INT, lag (1 3) eq (level)) gmm(INT, lag (2 4) collapse eq (diff)) gmm (ROA DivNIITI MQOETA LNTA LnTASq, lag(2 4) collapse ) iv(HHITAFinal Inflation GDP GC CO PV PB c.GC#c.PB c.GC#c.PV c.GC#c.LNTA c.CO#c.PB c.CO#c.PV c.CO#c.LNTA, eq(level)) twostep robust small orthogonal artests(3)


    Where :

    INT is a dependent variable (predetermined)
    ROA LNTA LnTASq DivNIITI MQOETA are firm-specific variables considered endogenous.
    HHITAFinal Inflation GDP are macroeconomic variables considered exogenous.
    GC is a dummy variable for two years (8th and 9th year, where it takes the value 1; and 0 otherwise)
    CO is a one-year dummy (21st year of data)
    PV is a dummy for Private sector firm
    PB is a dummy for the Public sector firm


    By the way, my data set is an unbalanced panel covering 23 years.

    2.) Sir, what is the logic of starting one lag earlier in the level and FOD equations than in the difference equation for the lagged dependent variable and for predetermined and endogenous variables?

    Let's say that the level form of the GMM equation is Yit = Yi,t-1 + Xit + Uit. Then what will Lag(1) of Y look like as an instrument for the level equation? Will it be (Yit - Yi,t-1) or (Yi,t-1 - Yi,t-2)?

    3. Is it because in the level equation the error term (Uit) is in level form, and the first lag of Yit is in difference (Yit - Yi,t-1)? And since (Y) is a predetermined variable none of the expressions in (Yit-Yi,t-1) will correlate with Uit?

    4. Furthermore, is it okay to have the same lag length for the difference/FOD equation and the level equation? Or is there a criterion to decide this?

    5. Regarding the 1st point of post #9, you mentioned that

    "Serial correlation in the error term is often a sign of misspecified dynamics. Adding a second lag of a dependent variable or adding lags of the independent variables as regressors aims to obtain a dynamically complete model, where all the dynamic effects are captured by the right-hand side variables. (Note: This is different from using higher-order lags as instruments. If there is evidence of second-order serial correlation in the first-differenced errors but no higher-order serial correlation, then the third lag onwards of the dependent variable qualifies as a valid instrument. However, this does not address the potential misspecification of the model dynamics and could lead to weak-instruments problems if those higher-order lags are insufficiently correlated with the regressors.)"

    5 B) "Will adding a second lag of a dependent variable or lags of the independent variables take care of misspecification in the model dynamics?"

    5 A) As mentioned by you above, if there's a second-order serial correlation in the first-differenced errors but no higher-order serial correlation, then the third lag onwards of the dependent variable qualifies as a valid instrument. Does this mean that for my model, the instruments for the dependent variable should be "gmm(INT, lag (3 5) eq (level))" and "gmm(INT, lag (3 5) collapse eq (diff))"?

    1. With unbalanced panel data, orthogonal deviations are typically recommended over first differencing because the latter retains less information if there are gaps in the data. (There is not much of a concern if the data set is only unbalanced because of missing observations at the beginning or end of a time series.)

    2. The first lag used as an instrument for the level model is Y(t-1) - Y(t-2).

    3. Y(t) - Y(t-1) cannot be a valid instrument because it includes Y(t), which is on the left side of your equation and therefore a function of U(t).

    4. There is no widely accepted criterion to determine the maximum lag order for the differenced model. For the level model, typically only the first lag is used without higher-order lags, although further lags are still valid. If you were to use all available lags in the differenced model without collapsing, then the additional lags in the level model would be redundant, which is usually the rationale for not using higher-order lags there.

    5.B. It might; this is certainly the researcher's hope, but there is no guarantee that simply adding the second lag takes care of all omitted dynamics.

    5.A. For the level model, you can start with lag 2. For the first-differenced model, you would indeed need to start with lag 3. This is because the first-differenced model has errors U(t) - U(t-1); the lagged error U(t-1) requires going one lag deeper with the instruments.
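
    The lag choices in points 2, 3, and 5.A can be checked numerically. The following is a minimal Python sketch, not part of the original exchange and not a replacement for the estimation in Stata. It assumes a simple AR(1) panel with a unit effect `eta` and an MA(1) error `u(t) = e(t) + theta*e(t-1)`; the function name `simulate_moments` and all parameter values are hypothetical. It reports sample averages of instrument times error for the first-differenced and the level model.

```python
import random


def simulate_moments(theta, n_units=40000, rho=0.5, T=6, seed=12345):
    """Average instrument-times-error products in a simulated AR(1) panel.

    Assumed (illustrative) model: y(t) = rho*y(t-1) + eta + u(t),
    with u(t) = e(t) + theta*e(t-1). theta = 0 means no serial correlation.
    """
    rng = random.Random(seed)
    sums = {"diff_lag2": 0.0, "diff_lag3": 0.0,
            "level_lag1": 0.0, "level_lag2": 0.0}
    for _ in range(n_units):
        eta = rng.gauss(0.0, 1.0)
        e_prev = rng.gauss(0.0, 1.0)
        # Start at the unit-specific long-run mean so that differences of y
        # carry no eta component (mean stationarity).
        y = [eta / (1.0 - rho)]
        u = [0.0]  # placeholder so that u[t] lines up with y[t]
        for _t in range(1, T + 1):
            e = rng.gauss(0.0, 1.0)
            u_t = e + theta * e_prev
            y.append(rho * y[-1] + eta + u_t)
            u.append(u_t)
            e_prev = e
        t = T
        d_u = u[t] - u[t - 1]    # error of the first-differenced model
        level_err = eta + u[t]   # error of the level model
        sums["diff_lag2"] += y[t - 2] * d_u                     # Y(t-2)
        sums["diff_lag3"] += y[t - 3] * d_u                     # Y(t-3)
        sums["level_lag1"] += (y[t - 1] - y[t - 2]) * level_err  # Y(t-1) - Y(t-2)
        sums["level_lag2"] += (y[t - 2] - y[t - 3]) * level_err  # Y(t-2) - Y(t-3)
    return {name: total / n_units for name, total in sums.items()}


if __name__ == "__main__":
    for theta in (0.0, 0.5):
        print(theta, simulate_moments(theta))
```

    Under these assumptions, all four averages should be close to zero when `theta = 0`. With `theta = 0.5` and unit-variance shocks, the analytical values are -0.5 for Y(t-2) in the first-differenced model and 0.5 for Y(t-1) - Y(t-2) in the level model, while Y(t-3) and Y(t-2) - Y(t-3) remain uncorrelated with the respective errors. This matches starting at lag 3 for the differenced model and lag 2 for the level model.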

    Originally posted by Minhaj uddin View Post
    1. My question is: if the tests for higher-order serial correlation give inconsistent results that vary considerably across orders, as in the following case, what could be the reason?

    2. In the following test, do we need both the excluding group and the difference to be insignificant?

    1. There could be different reasons for this. It could genuinely be the case that errors are only correlated over higher-order lags, although this becomes difficult to interpret. It could also be that the second-order test does not reject because of a lack of power or simply by random chance. (Remember that there are both type-1 and type-2 errors possible with statistical testing.) It could also be that the higher-order tests are less reliable due to a small sample size. Last but not least, model/estimator misspecification could also affect the reliability of tests.

    2. The difference test is only meaningful if the excluding test does not reject. So, yes, you would like both tests to be statistically insignificant. Some of the excluding tests may not be very reliable if the excluded instruments - i.e., those to be tested - are necessary to achieve identification; in other words, if the remaining instruments are weak after excluding strong instruments, the test might have poor properties.
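
    As a sketch of how the two reported statistics relate (the notation here is an assumption for illustration, not taken from the output above): let J_full be the Hansen statistic using all instruments and J_excl the statistic after excluding the group being tested, with q_full and q_excl their degrees of freedom.

```latex
D \;=\; J_{\text{full}} - J_{\text{excl}}
\;\overset{a}{\sim}\; \chi^2\!\left(q_{\text{full}} - q_{\text{excl}}\right)
\quad \text{under } H_0 \text{ (tested instruments valid)}
```

    The difference D is interpreted relative to the model without the tested instruments, which is why a rejection by J_excl undermines the interpretation of D.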
    https://twitter.com/Kripfganz



    • #17

      Thank you so much, sir, it has been incredibly helpful! I have a few more questions to ask, so please pardon the bother.

      1. In my case, the dependent variable is continuous and bounded between 0 and 1. So, can I apply GMM estimation directly, or do I need some sort of transformation (such as taking logs)?
      I am asking this because some studies have directly used OLS while others have applied Tobit or logit, but mainly in static models. Similarly, a few recent papers have applied GMM directly to such a dependent variable, while some have proposed a log transformation.

      What is your take on that?

      2. Related to your 2nd last answer.

      I have around 100 groups and 1600 observations with a time period of 23 years.

      Is it necessary to test for higher-order correlation at all? If so, in cases where some higher-order autoregressive terms are significant while others are not, what remedial measures could be taken to address this result?
      3. Related to your last answer. What should be done if the excluding test rejects the null hypothesis, as in my case?

      Thank you!

      Last edited by Minhaj uddin; 03 May 2024, 05:50.



      • #18
        1. Regarding the log transformation, please see https://www.statalist.org/forums/for...nal-regression. For dynamic models or models with endogenous regressors, estimating a nonlinear model can be challenging. While there are shortcomings of a linear model, it is probably the best starting point.
        2. Taken at face value, higher-order serial correlation still invalidates your instruments, and therefore would be a reason for concern. In practice, people rarely test for higher-order serial correlation. [This should not be seen as an endorsement from my side.] If you experience higher-order serial correlation, the only remedy I can see would be to add further lags or even other excluded variables to the model. Yet, there is no guarantee that this will eliminate the serial correlation. You might have to ask yourself if there is a theoretical reason for the serial correlation; then you could also search for external (rather than internal/lagged) instruments.
        3. A rejection of the excluding test signals misspecification. This is an overidentification test for the model without the questionable instruments. Thus, some of the remaining instruments might be invalid; or the regression equation might be misspecified. (Residual serial correlation could be such a form of misspecification.)
        https://twitter.com/Kripfganz
