  • Sargan Test is too sensitive

    Hello Dear Stata Forum Users,

    I have a panel dataset (14,545 observations) and have been using the Arellano-Bover/Blundell-Bond estimator via xtdpdsys. I have a problem with the Sargan test result and would like your opinion on it.

    Here is my code:
    Code:
    xtset id year
    qui xtdpdsys y, lags(1) twostep
    estat sargan
    Sargan Test Results
    Code:
    Sargan test of overidentifying restrictions
            H0: overidentifying restrictions are valid
    
            chi2(8)      =  9.864916
            Prob > chi2  =    0.2746
    As shown here, we cannot reject the H0.

    However, when I delete just one observation (the one with the highest value of y) out of 14,545 and re-estimate the model:
    Code:
    sort y
    drop in 14545
    xtset id year
    qui xtdpdsys y, lags(1) twostep
    estat sargan
    Sargan Test Results
    Code:
    Sargan test of overidentifying restrictions
            H0: overidentifying restrictions are valid
    
            chi2(8)      =  15.85598
            Prob > chi2  =    0.0445
    Now the Sargan test p-value drops sharply. In short, I am wondering why the Sargan test is so sensitive: I drop just one observation and the test result becomes statistically significant.

    Thank you so much in advance.

    P.S. Here is a link to the example dataset:
    http://www.alperdemirdogen.com/sargan-test-example.dta
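
    As background on why a single observation can matter: a Sargan-type statistic is a quadratic form in sample moment averages scaled by the sample size, so one extreme value that shifts those averages can move the statistic noticeably. A hedged sketch for gauging how extreme the dropped observation is (assuming the example dataset loads from the link above):

    ```stata
    * Sketch: quantify the leverage of the largest y before dropping it
    use http://www.alperdemirdogen.com/sargan-test-example.dta, clear
    summarize y, detail
    * compare the maximum to the rest of the distribution
    display "max = " r(max) ", p99 = " r(p99) ", sd = " r(sd)
    display "max in sd units: " (r(max) - r(mean)) / r(sd)
    ```

    If the maximum sits many standard deviations above the 99th percentile, its influence on the GMM moment conditions (and hence on the Sargan statistic) is plausible.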

  • #2
    Hi,
    I played around with your example dataset. I can't give a full explanation, only some observations. Adding year dummies and increasing the lags from 1 to 2 also changes the result of the Sargan test. Note also that you have at most 4 observations per group. If you use xtabond2 instead of xtdpdsys, you will see that the Sargan test is always rejected.
    My guess is that this result is specific to your example dataset. With other specifications, I cannot reproduce this behavior of the test. Given the distribution of your y-variable, it may be that the largest observation really does have a large impact under certain circumstances.
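
    A sketch of the alternative specifications mentioned above. The exact instrument choices for xtabond2 are illustrative assumptions; xtabond2 is a community-contributed command by David Roodman (install with ssc install xtabond2):

    ```stata
    * year dummies and a second lag with xtdpdsys
    qui xtdpdsys y i.year, lags(2) twostep
    estat sargan

    * a comparable system-GMM model via xtabond2, which reports
    * Sargan and Hansen tests directly; instrument sets are assumptions
    xtabond2 y L.y L2.y i.year, gmm(L.y) iv(i.year) twostep
    ```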



    • #3
      Dear Sven-Kristjan,

      Thank you for your response and effort. I have tried many different specifications on my full data. As you mentioned, using 2 lags and adding year dummies also changes the results.

      However, I still don't understand how just one observation out of almost 15 thousand can change the results so dramatically.



      • #4
        Alper:
        have you already checked that the "nasty" observation does not hide a mistaken data entry?
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Originally posted by Carlo Lazzaro View Post
          Alper:
          have you already checked that the "nasty" observation does not hide a mistaken data entry?
          Dear Carlo,

          To simplify the problem, I use only one variable, our dependent variable "y". In that case there is just one data entry per observation, and it looks okay to me. Maybe I am missing something? But the setup is too simple for that.



          • #6
            Alper:
            whenever something "weird" occurs with a given regression model, I usually go back to square one and use -codebook- or -summarize- to check whether something went wrong during data entry.
            That said, if you have already ruled out this possibility, the cause should rest elsewhere.
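
            A minimal version of these checks, as a sketch:

            ```stata
            * basic data-entry checks on the dependent variable
            codebook y
            summarize y, detail
            * inspect the record(s) holding the maximum value
            list id year y if y == r(max)
            ```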
            Kind regards,
            Carlo
            (Stata 19.0)
