ivreg2 partialing out for DHS cross sectional data for 2017-18

Chanda Moon

Join Date: Apr 2021

Posts: 39
#1

30 Jun 2022, 02:37

Hello everyone.

I am using Stata 14. I am estimating a 2SLS model with three instruments (distance to flood at the village/cluster level, slope of the cluster, and distance from the household to the water source), with standard errors clustered at the enumeration block/cluster level. I also include district fixed effects.
The issue is that when I include the district fixed effects, I get the warning message shown below. How can I resolve this, given that I need fixed effects at the district level? The other control variables did not cause any problem. I am new to Stata and do not know much about the models and statistics.

ivreg2 child_haz (imp_sanitation = dist_wtrsource_min hubdist_flood1 slope) i.ch_age_mo child_female i.child_brthsiz ch_food_div head_age head_gender head_edu mo_age mo_educ mo_working woman_height rural hh_size unimp_water wtr_treat ch_feces_disp livestock i.wealthq i.district, cluster(v_cluster)

------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              175.058
                         (Kleibergen-Paap rk Wald F statistic):        132.605
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    13.91
                                         10% maximal IV relative bias     9.08
                                         20% maximal IV relative bias     6.46
                                         30% maximal IV relative bias     5.39
                                         10% maximal IV size             22.30
                                         15% maximal IV size             12.83
                                         20% maximal IV size              9.54
                                         25% maximal IV size              7.80
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Warning: estimated covariance matrix of moment conditions not of full rank.
         overidentification statistic not reported, and standard errors and
         model tests should be interpreted with caution.
Possible causes:
         number of clusters insufficient to calculate robust covariance matrix
         singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.


Andrew Musau

Join Date: Oct 2014
Posts: 10481

30 Jun 2022, 12:33

ivreg2 is from SSC (FAQ Advice #12).

The issue is when i use the district fix effects i get this big warning message.

xtset the data specifying district as the panel identifier to include district fixed effects.

Code:

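* ivreg2 itself does not use the xtset panel structure to remove fixed effects;
* xtivreg2 (SSC) with -fe- applies the within transformation by district.
* xi: expands the i. terms in case xtivreg2 rejects factor-variable notation.
xtset district
xi: xtivreg2 child_haz (imp_sanitation = dist_wtrsource_min hubdist_flood1 slope) ///
i.ch_age_mo child_female i.child_brthsiz ch_food_div head_age head_gender head_edu ///
mo_age mo_educ mo_working woman_height rural hh_size unimp_water wtr_treat ///
ch_feces_disp livestock i.wealthq, fe cluster(v_cluster)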

Alternatively, include the -partial- option as below:

Code:

ivreg2 child_haz (imp_sanitation = dist_wtrsource_min hubdist_flood1 slope) i.ch_age_mo ///
child_female i.child_brthsiz ch_food_div head_age head_gender head_edu mo_age ///
mo_educ mo_working woman_height rural hh_size unimp_water wtr_treat ch_feces_disp ///
livestock i.wealthq i.district, cluster(v_cluster) partial(i.district)

A third option is to install ivreghdfe from SSC and absorb the district indicators:

Code:

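* ivreghdfe relies on ivreg2, reghdfe and ftools, all available from SSC
ssc install ftools, replace
ssc install reghdfe, replace
ssc install ivreghdfe, replace
ivreghdfe child_haz (imp_sanitation = dist_wtrsource_min hubdist_flood1 slope) ///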
i.ch_age_mo child_female i.child_brthsiz ch_food_div head_age head_gender ///
head_edu mo_age mo_educ mo_working woman_height rural hh_size unimp_water ///
wtr_treat ch_feces_disp livestock i.wealthq, absorb(district) cluster(v_cluster)

Last edited by Andrew Musau; 30 Jun 2022, 12:47.


Chanda Moon

Join Date: Apr 2021

Posts: 39
#3

01 Jul 2022, 20:12

Thank you very much, Andrew, for your detailed solution and great support.
I do not understand why we use absorb and partial.

I tried the partial option and the R-squared decreased from 0.65 to 0.17; the RSS and TSS changed as well. Is it sensible to use these to judge the strength of the model?

I am also not sure what ivreg2 and ivreghdfe are. And I could not follow "xtset the data specifying district as the panel identifier to include district fixed effects", because this is cross-sectional data, not a panel.

Last edited by Chanda Moon; 01 Jul 2022, 21:07.

Andrew Musau

Join Date: Oct 2014
Posts: 10481

02 Jul 2022, 05:11

Originally posted by Chanda Moon

I am failing to understand why we use this absorb and partial.

The warning message in #1 indicates that you do not have enough degrees of freedom to estimate the district parameters explicitly. Absorbing the district indicators or partialling them out addresses this.
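
The warning lists two possible causes: too few clusters and singleton dummies. The cluster-robust covariance matrix cannot have rank greater than the number of clusters, so it helps to compare that number with the number of indicators in the model. A rough sketch of such a check, assuming the variable names from #1 (v_cluster, district); the helper variable names are made up for illustration:

```stata
* Sketch only: v_cluster and district are the variable names used in #1;
* tag_cluster, tag_district and n_district are hypothetical helper names.

* Number of distinct clusters
egen byte tag_cluster = tag(v_cluster)
count if tag_cluster == 1

* Number of distinct districts (one indicator each, minus the base level)
egen byte tag_district = tag(district)
count if tag_district == 1

* Districts with a single observation would produce singleton dummies
bysort district: generate long n_district = _N
count if n_district == 1
```

These counts use the full dataset. If observations are dropped because of missing values, the counts in the estimation sample may be smaller, so it may be worth repeating the check on the estimation sample only.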

I am not sure what is ivreg2 and ivreghdfe

ivreg2 is the command that you are using and ivreghdfe is a similar command to ivreg2. Both are from SSC. Read the documentation of the commands once installed.

Code:

help ivreg2
help ivreghdfe

"xtset the data specifying district as the panel identifier to include district fixed effects" I could not get it as this is cross sectional data no panel.

True, you do not have panel data, but you can use a panel data command and forgo including the district dummies in the regression. You do this by xtsetting your data using district as the identifier. The rationale is explained in Frisch and Waugh's (1933) Econometrica paper. Below, the census dataset is cross-sectional. Instead of including regional dummies using regress, I can xtset using region and use xtreg with the -fe- option to estimate the same model.

Code:

sysuse census, clear
regress pop marriage death i.region
xtset region
xtreg pop marriage death, fe

Res.:

Code:

. regress pop marriage death i.region

      Source |       SS           df       MS      Number of obs   =        50
-------------+----------------------------------   F(5, 44)        =    532.98
       Model |  1.0717e+15         5  2.1433e+14   Prob > F        =    0.0000
    Residual |  1.7694e+13        44  4.0214e+11   R-squared       =    0.9838
-------------+----------------------------------   Adj R-squared   =    0.9819
       Total |  1.0893e+15        49  2.2232e+13   Root MSE        =    6.3e+05

------------------------------------------------------------------------------
         pop |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    marriage |   15.36562   4.962127     3.10   0.003     5.365109    25.36613
       death |   97.87795     5.4753    17.88   0.000     86.84321    108.9127
             |
      region |
    N Cntrl  |   348467.4     287475     1.21   0.232    -230900.5    927835.2
      South  |   261687.3   288426.3     0.91   0.369    -319597.8    842972.4
       West  |   635699.1     308422     2.06   0.045     14115.36     1257283
             |
       _cons |  -411125.5   246198.3    -1.67   0.102    -907305.5    85054.51
------------------------------------------------------------------------------

. xtset region
       panel variable:  region (unbalanced)

. xtreg pop marriage death, fe

Fixed-effects (within) regression               Number of obs     =         50
Group variable: region                          Number of groups  =          4

R-sq:                                           Obs per group:
     within  = 0.9833                                         min =          9
     between = 0.9946                                         avg =       12.5
     overall = 0.9818                                         max =         16

                                                F(2,44)           =    1296.66
corr(u_i, Xb)  = -0.1991                        Prob > F          =     0.0000

------------------------------------------------------------------------------
         pop |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    marriage |   15.36562   4.962127     3.10   0.003     5.365109    25.36613
       death |   97.87795     5.4753    17.88   0.000     86.84321    108.9127
       _cons |  -78471.64   131674.9    -0.60   0.554    -343844.9    186901.6
-------------+----------------------------------------------------------------
     sigma_u |  262033.82
     sigma_e |  634143.64
         rho |  .14584057   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3, 44) = 1.58                       Prob > F = 0.2072

.
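
The same result can be reproduced by hand, which may make the partialling-out idea more concrete. The following is a minimal sketch on the same census data: each variable is regressed on the region indicators, and the residuals are then regressed on each other. The `_r` variable names are made up for illustration.

```stata
sysuse census, clear

* Residualize the outcome and each regressor on the region indicators
foreach v of varlist pop marriage death {
    quietly regress `v' i.region
    predict double `v'_r, residuals
}

* Regress residuals on residuals; the slopes on marriage_r and death_r
* reproduce the coefficients on marriage and death shown above
regress pop_r marriage_r death_r, noconstant
```

The slope coefficients match those from regress and xtreg, fe. The standard errors differ slightly because this last regression does not account for the degrees of freedom used up by the region indicators. The R-squared is computed on the residualized variables, so it describes only the variation left after the region indicators have been removed.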


Chanda Moon

Join Date: Apr 2021

Posts: 39
#5

03 Jul 2022, 10:06

Thank you very much for the detailed answer. I am wondering about the R-squared: after using partial/absorb, it fell from 0.62 to 0.17. Do I need to explain and justify the partial/absorb step and the lower R-squared in the paper, or do I just report the coefficients from ivreg2/ivreghdfe?
Andrew Musau

Join Date: Oct 2014

Posts: 10481
#6

04 Jul 2022, 00:32

Read #2: https://stats.stackexchange.com/ques...les-estimation
Chanda Moon

Join Date: Apr 2021

Posts: 39
#7

03 Aug 2022, 04:00

Thank you very much, Andrew, for your kind help. I added the partial option to the model, and sometimes I need to partial out two or more variables. I am new to econometrics. I tried to read the paper you referred to above, but I am not sure how to justify the partial option and the resulting low R-squared in my manuscript. I could not understand it in simple terms.

Your help would be highly appreciated, as it would allow me to make progress.
Andrew Musau

Join Date: Oct 2014

Posts: 10481
#8

03 Aug 2022, 05:29

Just do not report the R2 statistic. The following Stata FAQ is citable if anyone is interested in the omission: https://www.stata.com/support/faqs/s...least-squares/. You can either partial out or absorb the indicators, the reason being that you do not have enough degrees of freedom to explicitly estimate them. That does not need to be justified because one can observe the degrees of freedom and compare this with the number of indicators.
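
To see that partialling out changes the R2 but not the coefficients of interest, one can compare the two versions on a small example. This is only a sketch: it assumes ivreg2 is installed from SSC, and the auto data and this specification are purely illustrative, not a sensible model.

```stata
* Sketch only: requires ivreg2 from SSC. Dataset and specification are
* illustrative stand-ins, not the DHS model from #1.
sysuse auto, clear

* Indicators estimated explicitly
ivreg2 price (mpg = weight length) headroom i.rep78
estimates store explicit

* Same model with the rep78 indicators partialled out
ivreg2 price (mpg = weight length) headroom i.rep78, partial(i.rep78)
estimates store partialled

* Compare the coefficients on mpg and headroom across the two runs
estimates table explicit partialled, keep(mpg headroom) b se
```

The coefficients on the remaining regressors should agree across the two runs. The R2 reported with partial is computed on the residualized variables, which is why it is not comparable with the R2 from the first run.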
Chanda Moon

Join Date: Apr 2021

Posts: 39
#9

03 Aug 2022, 06:25

Thank you very much, Andrew. My main concern is whether I need to mention in the paper that the district variable was partialled out, with a supporting reference such as Frisch and Waugh's Econometrica paper, or whether I do not need to mention it at all.
Secondly, I have been asked to report the R2 statistic. What is the best way to support this model?
Chanda Moon

Join Date: Apr 2021

Posts: 39
#10

03 Aug 2022, 06:29

Moreover, when the partial option is used, the constant also disappears. How can we explain this convincingly, given that we have to report a results table?
Andrew Musau

Join Date: Oct 2014

Posts: 10481
#11

03 Aug 2022, 07:14

On the constant: partialling out the indicators or absorbing them makes your model a fixed effects (FE) model, as I explained in #2. Intercepts are meaningless in FE models; see https://www.stata.com/support/faqs/s...effects-model/. On #9: you do not have to reference Frisch and Waugh, as their theorem is well known. I have said all I need to say about the R2 statistic. If someone is asking you to include it, explain to them why it is not a good idea. But my advice is optional; you do not have to follow it.
Chanda Moon

Join Date: Apr 2021

Posts: 39
#12

03 Aug 2022, 08:39

Thank you for the guidance about how to deal with paper.