
  • Different results in xtivreg first stage and xtreg only the first stage

    Dear all,

    I am currently running a regression with -xtivreg- to study a long-term effect with IVFE and a logged model. However, the first-stage results reported by xtivreg differ from those I get when I run only the first stage with xtreg. The results are as follows:
    xtivreg disease L1.age L1.age2 L1.inter1 L1.inter2 L1.marital (L1.x = L1.indicator) i.wave if L1.age>=55 & L1.age<=75 & labor_force==1, first fe vce(r)

    First-stage within regression

    Fixed-effects (within) regression Number of obs = 13,607
    Group variable: newid Number of groups = 9,824

    R-squared: Obs per group:
    Within = 0.1613 min = 1
    Between = 0.0201 avg = 1.4
    Overall = 0.0068 max = 2

    F(7,9823) = 55.07
    corr(u_i, Xb) = -0.3467 Prob > F = 0.0000

    (Std. err. adjusted for 9,824 clusters in newid)
    ------------------------------------------------------------------------------
    | Robust
    __000004 | Coefficient std. err. t P>|t| [95% conf. interval]
    -------------+----------------------------------------------------------------

    L1.age | .0313144 .0170861 1.83 0.067 -.0021778 .0648066
    |
    L1.age2 | .0042818 .001345 3.18 0.001 .0016453 .0069182
    |
    L1. inter1| -.0517112 .0138465 -3.73 0.000 -.0788532 -.0245691
    |
    L1.inter2 | -.0072432 .0014281 -5.07 0.000 -.0100425 -.0044438
    |
    L1. marital| -.0054045 .035118 -0.15 0.878 -.074243 .063434
    |
    wave |
    2 | 0 (empty)
    5 | 0 (empty)
    6 | -.1181368 .0220845 -5.35 0.000 -.1614269 -.0748466
    7 | 0 (omitted)
    |
    L1.indicator | .1362696 .0235653 5.78 0.000 .0900768 .1824625
    |
    _cons | .843435 .047912 17.60 0.000 .7495177 .9373523
    -------------+----------------------------------------------------------------
    sigma_u | .48187339
    sigma_e | .18851021
    rho | .86727289 (fraction of variance due to u_i)
    ------------------------------------------------------------------------------

    Fixed-effects (within) IV regression Number of obs = 41,102
    Group variable: newid Number of groups = 24,212

    R-squared: Obs per group:
    Within = 0.0580 min = 1
    Between = 0.0105 avg = 1.7
    Overall = 0.0168 max = 4


    Wald chi2(9) = 6035.37
    corr(u_i, Xb) = 0.0066 Prob > chi2 = 0.0000

    (Std. err. adjusted for 24,212 clusters in newid)
    ------------------------------------------------------------------------------
    | Robust
    disease | Coefficient std. err. z P>|z| [95% conf. interval]
    -------------+----------------------------------------------------------------
    x |
    L1. | .0696582 .03052 2.28 0.022 .0098401 .1294764
    |
    age |
    L1. | -.0023588 .00477 -0.49 0.621 -.0117078 .0069903
    |
    age2 |
    L1. | .0001673 .0003024 0.55 0.580 -.0004254 .0007599
    |
    inter1 |
    L1. | -.0002404 .003623 -0.07 0.947 -.0073413 .0068605
    |
    inter2 |
    L1. | .0009104 .0005197 1.75 0.080 -.0001083 .0019291
    |
    marital |
    L1. | -.0050013 .0111639 -0.45 0.654 -.0268821 .0168796
    |
    wave |
    5 | .0400923 .0202437 1.98 0.048 .0004152 .0797693
    6 | .0575173 .0257137 2.24 0.025 .0071193 .1079152
    7 | .0726076 .0313591 2.32 0.021 .0111449 .1340703
    |
    _cons | .0428386 .0272541 1.57 0.116 -.0105785 .0962558
    -------------+----------------------------------------------------------------
    sigma_u | .3393069
    sigma_e | .14958866
    rho | .8372669 (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    Instrumented: L.x
    Instruments: L.age L.age2 L.inter1 L.inter2 L.marital 5.wave
    6.wave 7.wave L.indicator





    If I run only the first stage with xtreg, the result is as follows:
    xtreg L1.x L1.age L1.age2 L1.inter1 L1.inter2 L1.marital L1.indicator i.wave if L1.age>=55 & L1.age<=75 & labor_force==1,fe vce(r)

    Fixed-effects (within) regression Number of obs = 41,124
    Group variable: newid Number of groups = 24,222

    R-squared: Obs per group:
    Within = 0.4045 min = 1
    Between = 0.4217 avg = 1.7
    Overall = 0.4143 max = 4

    F(9,24221) = 669.45
    corr(u_i, Xb) = 0.2598 Prob > F = 0.0000

    (Std. err. adjusted for 24,222 clusters in newid)
    ------------------------------------------------------------------------------
    | Robust
    L.x | Coefficient std. err. t P>|t| [95% conf. interval]
    -------------+----------------------------------------------------------------
    age |
    L1. | .0655449 .0070952 9.24 0.000 .0516379 .0794519
    |
    age2 |
    L1. | .0037481 .0004888 7.67 0.000 .00279 .0047062
    |
    inter1 |
    L1. | -.0500509 .0062653 -7.99 0.000 -.0623313 -.0377705
    |
    inter2 |
    L1. | -.0067018 .0005848 -11.46 0.000 -.0078481 -.0055555
    |
    marital |
    L1. | .0092198 .0173753 0.53 0.596 -.0248368 .0432764
    |
    indicator |
    L1. | .1974026 .0129819 15.21 0.000 .1719572 .2228481
    |
    wave |
    5 | .1459084 .0324698 4.49 0.000 .0822656 .2095512
    6 | .2091353 .0407499 5.13 0.000 .129263 .2890076
    7 | .2750966 .0494694 5.56 0.000 .1781335 .3720597
    |
    _cons | .4403559 .0359106 12.26 0.000 .3699688 .510743
    -------------+----------------------------------------------------------------
    sigma_u | .36985621
    sigma_e | .25082088
    rho | .68497935 (fraction of variance due to u_i)
    ------------------------------------------------------------------------------



    I really cannot figure out why my sample size changes so much and why the results differ. I would appreciate it if anyone could help me with this. Thank you so much!

  • #2
    The outcome enters the second stage but not the first, so missing values in the outcome affect the estimation sample that xtivreg uses in its first stage. You should therefore run xtivreg and use its estimation sample when replicating the first stage with xtreg.

    Code:
    quietly xtivreg ... , fe
    xtreg ... if e(sample), fe
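
As a rough illustration of this advice (a sketch on simulated data, not the poster's dataset; the variables id, t, y, x, w, z and ivsample are all made up for the example), the following sets the outcome to missing in some observations and shows that xtreg on the first-stage variables uses more observations unless it is restricted to the xtivreg estimation sample. Copying e(sample) into a variable first is a precaution, because each new estimation command overwrites e(sample).

```stata
clear
set seed 12345
set obs 200
generate id = ceil(_n/4)           // 50 hypothetical panels, 4 periods each
bysort id: generate t = _n
xtset id t

generate z = rnormal()             // instrument
generate w = rnormal()             // exogenous control
generate u = rnormal()             // shared error term, makes x endogenous
generate x = 0.8*z + 0.5*w + u + rnormal()
generate y = 1 + 0.5*x + 0.3*w + u + rnormal()
replace y = . if mod(_n, 10) == 0  // outcome missing in 20 observations

* IV regression: observations with a missing outcome are dropped from both stages
xtivreg y w (x = z), fe first
scalar n_iv = e(N)
generate byte ivsample = e(sample) // keep a copy before e(sample) is overwritten
count if ivsample
scalar n_flag = r(N)

* First stage by hand, without and with the IV estimation sample
xtreg x w z, fe
scalar n_all = e(N)
xtreg x w z if ivsample, fe
scalar n_restricted = e(N)

display "xtivreg N = " n_iv "; xtreg N = " n_all "; restricted xtreg N = " n_restricted
```

Note that nothing is lagged in this sketch, so restricting with -if- is enough to match the sample here; it does not cover the lagged case in #1.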
    Last edited by Andrew Musau; 26 Oct 2023, 08:28.



    • #3
      Andrew, thank you so much for your reply! Yes, I tried what you suggested with
      xtreg ... if e(sample), fe

      but the results still differ from the xtivreg first stage. The results are:


      quietly xtivreg disease L1.age L1.age2 L1.inter1 L1.inter2 L1.marital (L1.x = L1.indicator) i.wave if L1.age>=55 & L1.age<=75 & labor_force==1, first fe vce(r)

      xtreg L1.x L1.age L1.age2 L1.inter1 L1.inter2 L1.marital L1.indicator i.wave if e(sample),fe vce(r)

      Fixed-effects (within) regression Number of obs = 41,102
      Group variable: newid Number of groups = 24,212

      R-squared: Obs per group:
      Within = 0.4045 min = 1
      Between = 0.4262 avg = 1.7
      Overall = 0.4174 max = 4

      F(9,24211) = 668.84
      corr(u_i, Xb) = 0.2614 Prob > F = 0.0000

      (Std. err. adjusted for 24,212 clusters in newid)
      ------------------------------------------------------------------------------
      | Robust
      L.retired | Coefficient std. err. t P>|t| [95% conf. interval]
      -------------+----------------------------------------------------------------
      age|
      L1. | .0658998 .007089 9.30 0.000 .0520048 .0797947
      |
      age2 |
      L1. | .0037603 .0004887 7.70 0.000 .0028025 .0047181
      |
      inter1 |
      L1. | -.0500942 .0062661 -7.99 0.000 -.0623762 -.0378122
      |
      inter2 |
      L1. | -.0067139 .0005847 -11.48 0.000 -.00786 -.0055677
      |
      marital |
      L1. | .0092167 .0173765 0.53 0.596 -.0248424 .0432758
      |
      indicator |
      L1. | .1975562 .0129951 15.20 0.000 .172085 .2230274
      |
      wave |
      5 | .1437076 .0324503 4.43 0.000 .0801029 .2073123
      6 | .2064885 .0407266 5.07 0.000 .1266618 .2863151
      7 | .2718594 .0494435 5.50 0.000 .174947 .3687718
      |
      _cons | .4421824 .0358922 12.32 0.000 .3718315 .5125333
      -------------+----------------------------------------------------------------
      sigma_u | .36870559
      sigma_e | .25082651
      rho | .6836234 (fraction of variance due to u_i)
      ------------------------------------------------------------------------------



      Also, I am not sure which missing values in the outcome you mean. Do you mean missing values of disease or of x? I am sorry if this is a stupid question, and thank you for replying.



      • #4
        The IV regression further restricts the sample, which your xtreg command does not. Try

        Code:
        quietly xtivreg disease L1.age L1.age2 L1.inter1 L1.inter2 L1.marital (L1.x = L1.indicator) i.wave if L1.age>=55 & L1.age<=75 & labor_force==1, first fe vce(r)
        
        xtreg L1.x L1.age L1.age2 L1.inter1 L1.inter2 L1.marital L1.indicator i.wave if e(sample) & L1.age>=55 & L1.age<=75 & labor_force==1,fe vce(r)



        • #5
          Thank you, Andrew.
          I am not sure which missing values in the outcome you mean. Do you mean missing values of disease or of x? I am sorry if this is a stupid question, and thank you for replying.

          I also tried adding the restrictions to xtreg, but the result is still different:
          xtreg L1.x L1.age L1.age2 L1.inter1 L1.inter2 L1.marital L1.indicator i.wave if e(sample) & L1.age>=55 & L1.age<=75 & labor_force==1,fe vce(r)

          Fixed-effects (within) regression Number of obs = 41,102
          Group variable: newid Number of groups = 24,212

          R-squared: Obs per group:
          Within = 0.4045 min = 1
          Between = 0.4262 avg = 1.7
          Overall = 0.4174 max = 4

          F(9,24211) = 668.84
          corr(u_i, Xb) = 0.2614 Prob > F = 0.0000

          (Std. err. adjusted for 24,212 clusters in newid)
          ------------------------------------------------------------------------------
          | Robust
          L.x | Coefficient std. err. t P>|t| [95% conf. interval]
          -------------+----------------------------------------------------------------
          age |
          L1. | .0658998 .007089 9.30 0.000 .0520048 .0797947
          |
          age2 |
          L1. | .0037603 .0004887 7.70 0.000 .0028025 .0047181
          |
          inter1 |
          L1. | -.0500942 .0062661 -7.99 0.000 -.0623762 -.0378122
          |
          inter2 |
          L1. | -.0067139 .0005847 -11.48 0.000 -.00786 -.0055677
          |
          marital |
          L1. | .0092167 .0173765 0.53 0.596 -.0248424 .0432758
          |
          indicator |
          L1. | .1975562 .0129951 15.20 0.000 .172085 .2230274
          |
          wave |
          5 | .1437076 .0324503 4.43 0.000 .0801029 .2073123
          6 | .2064885 .0407266 5.07 0.000 .1266618 .2863151
          7 | .2718594 .0494435 5.50 0.000 .174947 .3687718
          |
          _cons | .4421824 .0358922 12.32 0.000 .3718315 .5125333
          -------------+----------------------------------------------------------------
          sigma_u | .36870559
          sigma_e | .25082651
          rho | .6836234 (fraction of variance due to u_i)
          ------------------------------------------------------------------------------



          • #6
            Dear All
            I get this error whenever I run the Mean Group estimation technique (the data are panel data):
            xtdcce2 d.INF L.INF GOV GDP MS2 FPI IQI EXR INT, reportc cr(d.INF GOV GDP MS2 FPI IQI EXR INT) cr_lags(3)
            Units (id) to be removed due to insufficient numbers of observations: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

            No observations left.
            r(2001);

            Can anyone help me?
            Khethang



            • #7
              Originally posted by Eve Liu
              I am not sure the missing values in outcome you mentioned.
              You are correct. There is a logical error on my part in #4, as the estimation sample does take into account -if- conditions in the IV regression. I have looked at your output in #1 carefully and I now see what is happening. As your endogenous variable is lagged, and you have a condition that restricts the sample, there is a difference in xtreg's estimation sample when using

              Code:
              xtreg ... if e(sample)
              and

              Code:
              keep if e(sample)
              xtreg ...
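              xtivreg in the first stage uses the latter. First, it takes only the observations in the estimation sample, and then it creates the lags before running xtreg. Any observation whose previous wave is not in the estimation sample therefore has a missing lag and drops out, which is why the first stage in #1 reports 13,607 observations rather than 41,102. So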

              Code:
              quietly xtivreg disease L1.age L1.age2 L1.inter1 L1.inter2 L1.marital (L1.x = L1.indicator) i.wave if L1.age>=55 & L1.age<=75 & labor_force==1, first fe vce(r)
              preserve
              keep if e(sample)
              xtreg L1.x L1.age L1.age2 L1.inter1 L1.inter2 L1.marital L1.indicator i.wave,fe vce(r)
              restore
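
To see the mechanism in isolation, here is a minimal sketch on a made-up six-row panel (the variables id, wave, x and insample are hypothetical, and insample merely stands in for e(sample)). It counts how many observations have a usable lag when the sample is restricted with -if- compared with when the other observations are dropped first.

```stata
clear
input id wave x
1 1 10
1 2 11
1 3 12
2 1 20
2 2 21
2 3 22
end
xtset id wave

* Hypothetical sample flag: true where the lag exists (waves 2 and 3),
* standing in for e(sample) after a command with an -if L1. ...- condition
generate byte insample = !missing(L.x)

* (a) -if- restriction: L.x is looked up in the full data
count if insample & !missing(L.x)
scalar n_if = r(N)

* (b) -keep- first: wave 1 is gone, so L.x at wave 2 is now missing
preserve
keep if insample
count if !missing(L.x)
scalar n_keep = r(N)
restore

display "usable with -if-: " n_if "; usable after -keep-: " n_keep
```

In this toy panel the -if- version keeps waves 2 and 3 of both units, while the -keep- version retains only wave 3, the same kind of shrinkage seen in the first-stage output in #1.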
              Last edited by Andrew Musau; 28 Oct 2023, 03:30.



              • #8
                Originally posted by Khethang Mokoena
                Dear All
                I get this error whenever I run the Mean Group estimation technique (the data are panel data):
                xtdcce2 d.INF L.INF GOV GDP MS2 FPI IQI EXR INT, reportc cr(d.INF GOV GDP MS2 FPI IQI EXR INT) cr_lags(3)
                Units (id) to be removed due to insufficient numbers of observations: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27

                No observations left.
                r(2001);

                Can anyone help me?
                Khethang
                Please start a new thread as your question differs from the topic addressed in this thread.



                • #9
                  Originally posted by Andrew Musau

                  You are correct. There is a logical error on my part in #4, as the estimation sample does take into account -if- conditions in the IV regression. I have looked at your output in #1 carefully and I now see what is happening. As your endogenous variable is lagged, and you have a condition that restricts the sample, there is a difference in xtreg's estimation sample when using

                  Code:
                  xtreg ... if e(sample)
                  and

                  Code:
                  keep if e(sample)
                  xtreg ...
                  xtivreg in the first stage uses the latter. First, it takes only observations in the estimation sample and then creates the lags before running xtreg. So

                  Code:
                  quietly xtivreg disease L1.age L1.age2 L1.inter1 L1.inter2 L1.marital (L1.x = L1.indicator) i.wave if L1.age>=55 & L1.age<=75 & labor_force==1, first fe vce(r)
                  preserve
                  keep if e(sample)
                  xtreg L1.x L1.age L1.age2 L1.inter1 L1.inter2 L1.marital L1.indicator i.wave,fe vce(r)
                  restore
                  Dear Andrew, thanks for the reply. It works now.

