Applying inverse probability weights to random effects models in panel data

John Adler


05 Mar 2018, 10:07

Dear all,

I have questionnaire data across three waves: year 0, year 5, and year 10. At most, 1,124 mothers responded to a questionnaire on their health. I have merged in a separate dataset on local area unemployment by manually entering each woman's local unemployment rate into the Excel file that holds the questionnaire data. I import this into Stata and would like to analyze it as panel data, so I do the following:

Code:


reshape long health_y current_county_y psum_unemployed_total_cont_y own_education_y binmartatus_y medical_card_y, i(id) j(year)
 
. reshape long health_y current_county_y binary_health_y /*has_questionnaire_y*/ bmi_y binbmi_overweight_y binbmi_underweight_y binbmi_obese_y ord_bmi_y own_education_
> y medical_card_y employment_y binary_employment_y maritalstatus_y binmartatus_y age_y ord_age_y psum_unemployed_total_cont_y, i(id) j(year)
(note: j = 0 5 10)
 
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                     1787   ->    5361
Number of variables                1181   ->    1148
j variable (3 values)                     ->   year
xij variables:
         health_y0 health_y5 health_y10   ->   health_y
current_county_y0 current_county_y5 current_county_y10->current_county_y
binary_health_y0 binary_health_y5 binary_health_y10->binary_health_y
                  bmi_y0 bmi_y5 bmi_y10   ->   bmi_y
binbmi_overweight_y0 binbmi_overweight_y5 binbmi_overweight_y10->binbmi_overweight_y
binbmi_underweight_y0 binbmi_underweight_y5 binbmi_underweight_y10->binbmi_underweight_y
binbmi_obese_y0 binbmi_obese_y5 binbmi_obese_y10->binbmi_obese_y
      ord_bmi_y0 ord_bmi_y5 ord_bmi_y10   ->   ord_bmi_y
own_education_y0 own_education_y5 own_education_y10->own_education_y
medical_card_y0 medical_card_y5 medical_card_y10->medical_card_y
employment_y0 employment_y5 employment_y10->   employment_y
binary_employment_y0 binary_employment_y5 binary_employment_y10->binary_employment_y
maritalstatus_y0 maritalstatus_y5 maritalstatus_y10->maritalstatus_y
binmartatus_y0 binmartatus_y5 binmartatus_y10->binmartatus_y
                  age_y0 age_y5 age_y10   ->   age_y
      ord_age_y0 ord_age_y5 ord_age_y10   ->   ord_age_y
psum_unemployed_total_cont_y0 psum_unemployed_total_cont_y5 psum_unemployed_total_cont_y10->psum_unemployed_total_cont_y
-----------------------------------------------------------------------------
 
.
. xtset id year
       panel variable:  id (strongly balanced)
        time variable:  year, 0 to 10, but with gaps
                delta:  1 unit
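
The xtset output above reports a delta of 1 unit "with gaps" because the waves are five years apart. A minimal sketch of one way to declare the spacing and inspect response patterns follows. It assumes that a missing binary_health_y marks a wave without a completed questionnaire, which the data may or may not satisfy.

```stata
* Declare the five-year spacing between waves (years 0, 5, 10)
xtset id year, delta(5)

* reshape creates a row for every id-year pair, so the panel looks
* strongly balanced; restrict to rows with an observed outcome to see
* the actual participation patterns (assumption: missing health means
* no questionnaire in that wave)
xtdescribe if !missing(binary_health_y)
```

With delta(5) declared, time-series operators such as L. refer to the previous wave rather than the previous calendar year.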

I have each woman's ID and the ID of the county (geographic area) she lives in. The women are also nested in family groups, for which I have a family group ID. However, because I drop everyone in a family group who is not a mother, each family group now contains only the mother.

In my analysis I tested for attrition by creating a variable equal to one if a mother had left the sample, defined as having no questionnaire in either wave 2 or wave 3:

Code:

. drop if gender==1
(1,980 observations deleted)
 

* Total attrition left sample:
 
 
. generate leftsamp=.
(3,381 missing values generated)

. replace leftsamp = 1 if has_y5_questionnaire == 0 & has_y10_questionnaire == 0 
(1,530 real changes made)

. replace leftsamp = 0 if has_y5_questionnaire == 1 | has_y10_questionnaire == 1 
(1,851 real changes made)


.
.
.

. tab leftsamp

   leftsamp |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      1,851       54.75       54.75
          1 |      1,530       45.25      100.00
------------+-----------------------------------
      Total |      3,381      100.00

. tab has_y0_questionnaire 

has_y0_ques |
  tionnaire |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          9        0.27        0.27
          1 |      3,372       99.73      100.00
------------+-----------------------------------
      Total |      3,381      100.00

. tab binary_health_y if leftsamp == 1


.
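
The flag above is set on every person-wave row and does not condition on a wave 1 response, so the tabulation counts rows rather than women. As a hedged sketch of a stricter definition (the names leftsamp2 and onerow are hypothetical, introduced only for illustration), one could restrict the flag to baseline respondents and tabulate one row per woman:

```stata
* Attrition flag defined only for women who responded at baseline
generate byte leftsamp2 = .
replace leftsamp2 = 1 if has_y0_questionnaire == 1 & ///
    has_y5_questionnaire == 0 & has_y10_questionnaire == 0
replace leftsamp2 = 0 if has_y0_questionnaire == 1 & ///
    (has_y5_questionnaire == 1 | has_y10_questionnaire == 1)

* Tag one row per woman so that frequencies count persons, not rows
egen byte onerow = tag(id)
tabulate leftsamp2 if onerow
```

This assumes the has_y*_questionnaire indicators are constant within id in the long data, as the tabulations above suggest.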

Following this I look at the differences between the sample stayers and the sample leavers, and whether this difference is significant:

Code:


. tab binary_health_y

binary_heal |
       th_y |      Freq.     Percent        Cum.
------------+-----------------------------------
        Bad |        595       28.19       28.19
       Good |      1,516       71.81      100.00
------------+-----------------------------------
      Total |      2,111      100.00

.
.
.
. tab binary_health_y if leftsamp == 1

binary_heal |
       th_y |      Freq.     Percent        Cum.
------------+-----------------------------------
        Bad |        185       37.22       37.22
       Good |        312       62.78      100.00
------------+-----------------------------------
      Total |        497      100.00

 
.
.
.
 tab binary_health_y leftsamp, column row nokey chi2 lrchi2 V exact gamma taub

binary_hea |       leftsamp
     lth_y |         0          1 |     Total
-----------+----------------------+----------
       Bad |       410        185 |       595 
           |     68.91      31.09 |    100.00 
           |     25.40      37.22 |     28.19 
-----------+----------------------+----------
      Good |     1,204        312 |     1,516 
           |     79.42      20.58 |    100.00 
           |     74.60      62.78 |     71.81 
-----------+----------------------+----------
     Total |     1,614        497 |     2,111 
           |     76.46      23.54 |    100.00 
           |    100.00     100.00 |    100.00 

          Pearson chi2(1) =  26.2308   Pr = 0.000
 likelihood-ratio chi2(1) =  25.2839   Pr = 0.000
               Cramér's V =  -0.1115
                    gamma =  -0.2704  ASE = 0.051
          Kendall's tau-b =  -0.1115  ASE = 0.023
           Fisher's exact =                 0.000
   1-sided Fisher's exact =                 0.000

.
.

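The results suggest that health differs between leavers and stayers: 37.2% of observations for leavers report bad health, compared with 25.4% for stayers, and the association between leaving the sample and health is statistically significant (Pearson chi2(1) = 26.23, p < 0.001).

I therefore wanted to do something to deal with this attrition bias.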

Searching the forums I followed the advice from this post to consider inverse probability of attrition weighting: https://www.statalist.org/forums/for...istrative-data

And followed the steps linked to here:

http://www.chronicpoverty.org/upload...N-revfinal.pdf

I cloned the health variable from earlier as cbinary_health and created a variable A (for attrition) equal to 1 if binary health was missing in both waves 2 and 3, and 0 otherwise. I also generated a lagged health variable, although I am not sure I did this correctly: the study is measured at years 0, 5 and 10, so the lag may need to be constructed differently.

Code:

 
gen lcbinary_health_y0 = (cbinary_health_y0 +1)
 
. gen A=1 if cbinary_health_y5==.& cbinary_health_y10==.
(3,510 missing values generated)
 
.
. replace A=0 if A!=1
(3,510 real changes made)
 
.
. tab A
 
          A |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      1,848       54.66       54.66
          1 |      1,533       45.34      100.00
------------+-----------------------------------
      Total |      3,381      100.00
 

. tab binary_health_y
 
binary_heal |
       th_y |      Freq.     Percent        Cum.
------------+-----------------------------------
        Bad |        595       28.19       28.19
       Good |      1,516       71.81      100.00
------------+-----------------------------------
      Total |      2,111      100.00
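
Note that the gen line above adds 1 to baseline health rather than lagging it. For the baseline attrition probit that may be harmless, since wave 1 health already precedes any attrition. If a true previous-wave value is wanted in the long data, a minimal sketch (the name lag_binary_health is hypothetical) could look like this:

```stata
* Declare the five-year spacing so that L. means "previous wave"
xtset id year, delta(5)

* Health at the previous wave; missing at year 0 by construction
generate lag_binary_health = L.binary_health_y

* Eyeball the first few panels to confirm the lag lines up
list id year binary_health_y lag_binary_health in 1/9, sepby(id)
```

Without delta(5), L. would look for year - 1, which never exists in this design, so the lag would be missing everywhere.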

As guided by the attached document, I estimate a probit of attrition on the baseline variables that I think may predict it: education, BMI, medical card holding (a form of social health insurance), employment status, marital status, age, the local area unemployment rate, and lagged binary health.

Code:

 
 
**** BINARY HEALTH
 
* Calculate unrestricted attrition probit
 
* Binary health Attrition:
 
* Vars that might affect health
 
 
 
. xi: probit A cbmi_y0 i.cown_education_y0 i.cmedical_card_y0 i.cemployment_y0 i.cmaritalstatus_y0 cage_y0 cpsum_unemployed_total_cont_y0 lcbinary_health_y0, robust clus
> ter(current_county_y)
i.cown_ed~on_y0   _Icown_educ_1-6     (naturally coded; _Icown_educ_1 omitted)
i.cmedical_c~y0   _Icmedical__0-1     (naturally coded; _Icmedical__0 omitted)
i.cemploymen~y0   _Icemployme_1-8     (naturally coded; _Icemployme_1 omitted)
i.cmaritalst~y0   _Icmaritals_1-6     (naturally coded; _Icmaritals_1 omitted)
 
note: _Icemployme_5 != 0 predicts success perfectly
      _Icemployme_5 dropped and 6 obs not used
 
Iteration 0:   log pseudolikelihood =  -1600.524 
Iteration 1:   log pseudolikelihood = -1495.6619 
Iteration 2:   log pseudolikelihood = -1495.0598 
Iteration 3:   log pseudolikelihood = -1495.0051 
Iteration 4:   log pseudolikelihood = -1494.9972 
Iteration 5:   log pseudolikelihood = -1494.9961 
Iteration 6:   log pseudolikelihood = -1494.9959 
Iteration 7:   log pseudolikelihood = -1494.9959 
 
Probit regression                               Number of obs     =      2,376
                                                Wald chi2(19)     =    4469.68
                                                Prob > chi2       =     0.0000
Log pseudolikelihood = -1494.9959               Pseudo R2         =     0.0659
 
                                        (Std. Err. adjusted for 30 clusters in current_county_y)
------------------------------------------------------------------------------------------------
                               |               Robust
                             A |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------------------+----------------------------------------------------------------
                       cbmi_y0 |  -.0009359   .0110034    -0.09   0.932    -.0225023    .0206304
                 _Icown_educ_2 |  -3.405426    .638231    -5.34   0.000    -4.656336   -2.154517
                 _Icown_educ_3 |  -4.045781     .37698   -10.73   0.000    -4.784649   -3.306914
                 _Icown_educ_4 |  -4.026321   .3782838   -10.64   0.000    -4.767743   -3.284898
                 _Icown_educ_5 |  -3.972752   .4498496    -8.83   0.000    -4.854441   -3.091063
                 _Icown_educ_6 |  -4.165562   .3707544   -11.24   0.000    -4.892227   -3.438897
                 _Icmedical__1 |   .0902858   .1105417     0.82   0.414     -.126372    .3069436
                 _Icemployme_2 |   .0486977   .3572737     0.14   0.892    -.6515459    .7489412
                 _Icemployme_3 |  -.1855261   .4400484    -0.42   0.673    -1.048005    .6769528
                 _Icemployme_4 |  -.4687752   .2495238    -1.88   0.060    -.9578329    .0202825
                 _Icemployme_5 |          0  (omitted)
                 _Icemployme_7 |  -.2044312   .0910754    -2.24   0.025    -.3829358   -.0259267
                 _Icemployme_8 |  -.2936189   .2797146    -1.05   0.294    -.8418495    .2546116
                 _Icmaritals_2 |   .0404128    .214149     0.19   0.850    -.3793115    .4601371
                 _Icmaritals_4 |   .2333091   .6966846     0.33   0.738    -1.132168    1.598786
                 _Icmaritals_5 |   1.065994   .8066381     1.32   0.186    -.5149877    2.646976
                 _Icmaritals_6 |   .0826954   .1384134     0.60   0.550    -.1885898    .3539807
                       cage_y0 |  -.0436179   .0093682    -4.66   0.000    -.0619793   -.0252565
cpsum_unemployed_total_cont_y0 |   .0430288   .0257745     1.67   0.095    -.0074883    .0935459
            lcbinary_health_y0 |  -.2567768   .0923911    -2.78   0.005    -.4378599   -.0756936
                         _cons |   5.376694   .6043276     8.90   0.000     4.192233    6.561154
------------------------------------------------------------------------------------------------

.

I then use Wald tests to check whether attrition is random with respect to the variables that were significant in this probit, first jointly and then group by group:

Code:

 
 
. test _Icown_educ_2 _Icown_educ_3 _Icown_educ_4 _Icown_educ_5 _Icown_educ_6 _Icemployme_2 _Icemployme_3 _Icemployme_4 _Icemployme_5 _Icemployme_7 _Icemployme_8 cage_y0
> lcbinary_health_y0
 
 ( 1)  [A]_Icown_educ_2 = 0
 ( 2)  [A]_Icown_educ_3 = 0
 ( 3)  [A]_Icown_educ_4 = 0
 ( 4)  [A]_Icown_educ_5 = 0
 ( 5)  [A]_Icown_educ_6 = 0
 ( 6)  [A]_Icemployme_2 = 0
 ( 7)  [A]_Icemployme_3 = 0
 ( 8)  [A]_Icemployme_4 = 0
 ( 9)  [A]o._Icemployme_5 = 0
 (10)  [A]_Icemployme_7 = 0
 (11)  [A]_Icemployme_8 = 0
 (12)  [A]cage_y0 = 0
 (13)  [A]lcbinary_health_y0 = 0
       Constraint 9 dropped
 
           chi2( 12) = 2513.77
         Prob > chi2 =    0.0000
 
 
. * Below we test if any of the above groups of variables are individually different from zero:
.
. test _Icemployme_2 _Icemployme_3 _Icemployme_4 _Icemployme_5 _Icemployme_7 _Icemployme_8 
 
 ( 1)  [A]_Icemployme_2 = 0
 ( 2)  [A]_Icemployme_3 = 0
 ( 3)  [A]_Icemployme_4 = 0
 ( 4)  [A]o._Icemployme_5 = 0
 ( 5)  [A]_Icemployme_7 = 0
 ( 6)  [A]_Icemployme_8 = 0
       Constraint 4 dropped
 
           chi2(  5) =    8.88
         Prob > chi2 =    0.1139
 
. test _Icown_educ_2 _Icown_educ_3 _Icown_educ_4 _Icown_educ_5 _Icown_educ_6
 
 ( 1)  [A]_Icown_educ_2 = 0
 ( 2)  [A]_Icown_educ_3 = 0
 ( 3)  [A]_Icown_educ_4 = 0
 ( 4)  [A]_Icown_educ_5 = 0
 ( 5)  [A]_Icown_educ_6 = 0
 
           chi2(  5) =  176.33
         Prob > chi2 =    0.0000
 
. test cage_y0
 
 ( 1)  [A]cage_y0 = 0
 
           chi2(  1) =   21.68
         Prob > chi2 =    0.0000
 
. test lcbinary_health_y0
 
 ( 1)  [A]lcbinary_health_y0 = 0
 
           chi2(  1) =    7.72
         Prob > chi2 =    0.0054
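
The group-wise tests above can also be written more compactly with testparm, which accepts wildcards and skips omitted terms. This is only a sketch of an equivalent way to run the same tests; it refits the unrestricted probit quietly so that the block stands on its own.

```stata
* Refit the unrestricted attrition probit (output suppressed)
quietly xi: probit A cbmi_y0 i.cown_education_y0 i.cmedical_card_y0 ///
    i.cemployment_y0 i.cmaritalstatus_y0 cage_y0                    ///
    cpsum_unemployed_total_cont_y0 lcbinary_health_y0,              ///
    vce(cluster current_county_y)

* Joint Wald tests by variable group, using the xi-generated names
testparm _Icown_educ_*
testparm _Icemployme_*
testparm cage_y0
testparm lcbinary_health_y0
```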

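The results suggest that i.cown_education_y0, cage_y0 and lcbinary_health_y0 are significant predictors of attrition. The employment dummies are not jointly significant (chi2(5) = 8.88, p = 0.114), although _Icemployme_7 is individually significant (p = 0.025), so I treat i.cemployment_y0 as a predictor of attrition as well.

So when I calculate the inverse probability weights below, I drop these four sets of variables from the restricted probit.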

Code:

 
. * Calculate inverse probability weights
 
 
* First do the regression with everything in from before
 
 
.
.
. xi: probit A cbmi_y0 i.cown_education_y0 i.cmedical_card_y0 i.cemployment_y0 i.cmaritalstatus_y0 cage_y0 cpsum_unemployed_total_cont_y0 lcbinary_health_y0, robust clus
> ter(current_county_y)
i.cown_ed~on_y0   _Icown_educ_1-6     (naturally coded; _Icown_educ_1 omitted)
i.cmedical_c~y0   _Icmedical__0-1     (naturally coded; _Icmedical__0 omitted)
i.cemploymen~y0   _Icemployme_1-8     (naturally coded; _Icemployme_1 omitted)
i.cmaritalst~y0   _Icmaritals_1-6     (naturally coded; _Icmaritals_1 omitted)
 
note: _Icemployme_5 != 0 predicts success perfectly
      _Icemployme_5 dropped and 6 obs not used
 
Iteration 0:   log pseudolikelihood =  -1600.524 
Iteration 1:   log pseudolikelihood = -1495.6619 
Iteration 2:   log pseudolikelihood = -1495.0598 
Iteration 3:   log pseudolikelihood = -1495.0051 
Iteration 4:   log pseudolikelihood = -1494.9972 
Iteration 5:   log pseudolikelihood = -1494.9961 
Iteration 6:   log pseudolikelihood = -1494.9959 
Iteration 7:   log pseudolikelihood = -1494.9959 
 
Probit regression                               Number of obs     =      2,376
                                                Wald chi2(19)     =    4435.22
                                                Prob > chi2       =     0.0000
Log pseudolikelihood = -1494.9959               Pseudo R2         =     0.0659
 
                                        (Std. Err. adjusted for 30 clusters in current_county_y)
------------------------------------------------------------------------------------------------
                               |               Robust
                             A |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------------------+----------------------------------------------------------------
                       cbmi_y0 |  -.0009359   .0110034    -0.09   0.932    -.0225023    .0206304
                 _Icown_educ_2 |  -3.405426   .6382683    -5.34   0.000    -4.656409   -2.154444
                 _Icown_educ_3 |  -4.045781   .3770116   -10.73   0.000    -4.784711   -3.306852
                 _Icown_educ_4 |  -4.026321   .3783153   -10.64   0.000    -4.767805   -3.284836
                 _Icown_educ_5 |  -3.972752   .4498364    -8.83   0.000    -4.854415   -3.091089
                 _Icown_educ_6 |  -4.165562   .3708327   -11.23   0.000    -4.892381   -3.438743
                 _Icmedical__1 |   .0902858   .1105417     0.82   0.414     -.126372    .3069436
                 _Icemployme_2 |   .0486977   .3572737     0.14   0.892    -.6515459    .7489412
                 _Icemployme_3 |  -.1855261   .4400484    -0.42   0.673    -1.048005    .6769528
                 _Icemployme_4 |  -.4687752   .2495238    -1.88   0.060    -.9578329    .0202825
                 _Icemployme_5 |          0  (omitted)
                 _Icemployme_7 |  -.2044312   .0910754    -2.24   0.025    -.3829358   -.0259267
                 _Icemployme_8 |  -.2936189   .2797146    -1.05   0.294    -.8418495    .2546116
                 _Icmaritals_2 |   .0404128    .214149     0.19   0.850    -.3793115    .4601371
                 _Icmaritals_4 |   .2333091   .6966846     0.33   0.738    -1.132168    1.598786
                 _Icmaritals_5 |   1.065994   .8066381     1.32   0.186    -.5149877    2.646976
                 _Icmaritals_6 |   .0826954   .1384134     0.60   0.550    -.1885898    .3539807
                       cage_y0 |  -.0436179   .0093682    -4.66   0.000    -.0619793   -.0252565
cpsum_unemployed_total_cont_y0 |   .0430288   .0257745     1.67   0.095    -.0074883    .0935459
            lcbinary_health_y0 |  -.2567768   .0923911    -2.78   0.005    -.4378599   -.0756936
                         _cons |   5.376694   .6043547     8.90   0.000      4.19218    6.561207
------------------------------------------------------------------------------------------------
 
.
.
. gen sample=e(sample)
 
. predict pxav
(option pr assumed; Pr(A))
(1005 missing values generated)
 
.
* Repeat this regression excluding those things that cause attrition:
.
.
. xi: probit A cbmi_y0 i.cmedical_card_y0 i.cmaritalstatus_y0 cpsum_unemployed_total_cont_y0, robust cluster(current_county_y)
i.cmedical_c~y0   _Icmedical__0-1     (naturally coded; _Icmedical__0 omitted)
i.cmaritalst~y0   _Icmaritals_1-6     (naturally coded; _Icmaritals_1 omitted)
 
Iteration 0:   log pseudolikelihood = -1796.0272 
Iteration 1:   log pseudolikelihood = -1730.1673 
Iteration 2:   log pseudolikelihood = -1729.9723 
Iteration 3:   log pseudolikelihood =  -1729.972 
Iteration 4:   log pseudolikelihood =  -1729.972 
 
Probit regression                               Number of obs     =      2,643
                                                Wald chi2(7)      =     301.88
                                                Prob > chi2       =     0.0000
Log pseudolikelihood =  -1729.972               Pseudo R2         =     0.0368
 
                                        (Std. Err. adjusted for 30 clusters in current_county_y)
------------------------------------------------------------------------------------------------
                               |               Robust
                             A |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------------------------+----------------------------------------------------------------
                       cbmi_y0 |   .0013239   .0107957     0.12   0.902    -.0198353    .0224831
                 _Icmedical__1 |   .2700192   .1163751     2.32   0.020     .0419282    .4981101
                 _Icmaritals_2 |   .2895128   .1267742     2.28   0.022     .0410399    .5379856
                 _Icmaritals_4 |   .5160599   .4218635     1.22   0.221    -.3107773    1.342897
                 _Icmaritals_5 |   1.258782   .6736031     1.87   0.062    -.0614555     2.57902
                 _Icmaritals_6 |    .509274   .1101539     4.62   0.000     .2933764    .7251717
cpsum_unemployed_total_cont_y0 |   .0336633   .0306106     1.10   0.271    -.0263324     .093659
                         _cons |  -.6886433    .351835    -1.96   0.050    -1.378227    .0009407
------------------------------------------------------------------------------------------------
 
.
. predict pxres
(option pr assumed; Pr(A))
(738 missing values generated)
 
* After calculating the predicted probabilities from the restricted attrition probit, the inverse probability weights are calculated straightforwardly by taking the ratio of the restricted to unrestricted probabilities.
 
. gen attwght=pxres/pxav
(1,005 missing values generated)
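
Two details of the weight calculation above may be worth checking. First, the two probits are fitted on different samples (2,376 versus 2,643 observations). Second, predict returns Pr(A = 1), the probability of attrition, whereas attrition weights in the style of Fitzgerald, Gottschalk and Moffitt are usually described as a ratio of retention probabilities. The sketch below assumes that the retention-based ratio is what is intended; the names ipw_sample, p_att_u, p_att_r and attwght_stay are hypothetical.

```stata
* Unrestricted probit; keep its estimation sample
quietly xi: probit A cbmi_y0 i.cown_education_y0 i.cmedical_card_y0 ///
    i.cemployment_y0 i.cmaritalstatus_y0 cage_y0                    ///
    cpsum_unemployed_total_cont_y0 lcbinary_health_y0,              ///
    vce(cluster current_county_y)
generate byte ipw_sample = e(sample)
predict double p_att_u if ipw_sample, pr

* Restricted probit on the same observations
quietly xi: probit A cbmi_y0 i.cmedical_card_y0 i.cmaritalstatus_y0 ///
    cpsum_unemployed_total_cont_y0 if ipw_sample,                   ///
    vce(cluster current_county_y)
predict double p_att_r if ipw_sample, pr

* Assumed definition: restricted over unrestricted RETENTION probability
generate double attwght_stay = (1 - p_att_r) / (1 - p_att_u)
summarize attwght_stay if A == 0, detail
```

Summarizing the weight among stayers is a quick way to spot extreme values before using it in a regression.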

When I initially did my analysis I used random effects linear probability models, clustering at the county level.

Having created the weights, I would like to apply them to my random effects regressions in this panel data as follows.

I regress binary_health_y, which is health across all waves of the panel (i.e. the long health variable), on the percentage unemployed and other covariates. The other variables in the model are likewise the long versions, e.g. age_y rather than age_y0, age_y5 and age_y10.

My analysis without the weights runs fine, as you can see.

Code:

 
 
 
** Health regressions (without and with attrition weights)
 
. *without inverse probability weights
. xtreg binary_health_y psum_unemployed_total_cont_y i.own_education_y i.maritalstatus_y i.medical_card_y i.employment_y age_y if gender==0 & sample==1, re robust cluste
> r(current_county_y)
 
Random-effects GLS regression                   Number of obs     =      1,546
Group variable: id                              Number of groups  =        792
 
R-sq:                                           Obs per group:
     within  = 0.0375                                         min =          1
     between = 0.0871                                         avg =        2.0
     overall = 0.0753                                         max =          3
 
                                                Wald chi2(19)     =          .
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .
 
                                                                     (Std. Err. adjusted for 30 clusters in current_county_y)
-----------------------------------------------------------------------------------------------------------------------------
                                                            |               Robust
                                            binary_health_y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------------------------------------+----------------------------------------------------------------
                               psum_unemployed_total_cont_y |   .0027172   .0025494     1.07   0.287    -.0022797     .007714
                                                            |
                                            own_education_y |
                                  Primary school education  |   .3478043   .1273368     2.73   0.006     .0982288    .5973797
                                     Some secondary school  |   .6342217   .0702333     9.03   0.000     .4965669    .7718764
                              Complete secondary education  |   .6390553   .0491846    12.99   0.000     .5426552    .7354555
    Some third level education at college, university, RTC  |   .6315469   .0668039     9.45   0.000     .5006137      .76248
Complete third level education at college, university, RTC  |   .7517092   .0704716    10.67   0.000     .6135874    .8898309
                                                            |
                                            maritalstatus_y |
                                                Cohabiting  |  -.0624296   .0361448    -1.73   0.084     -.133272    .0084129
                                                 Separated  |  -.1085929   .1341099    -0.81   0.418    -.3714435    .1542578
                                                  Divorced  |  -.0742946   .1289035    -0.58   0.564    -.3269409    .1783516
                                                   Widowed  |  -.2019116   .1486542    -1.36   0.174    -.4932684    .0894452
                                      Single/Never married  |  -.0849537   .0381968    -2.22   0.026    -.1598181   -.0100893
                                                            |
                                             medical_card_y |
                                                       Yes  |  -.1122467   .0331333    -3.39   0.001    -.1771867   -.0473066
                                                            |
                                               employment_y |
                                                Unemployed  |  -.0217951   .0447618    -0.49   0.626    -.1095266    .0659364
  Unable to work owing to permanent sickness or disability  |   -.613174   .0479992   -12.77   0.000    -.7072507   -.5190973
                                         At school/student  |  -.1256232   .0587738    -2.14   0.033    -.2408176   -.0104288
                           Seeking work for the first time  |  -.1833912   .0404457    -4.53   0.000    -.2626634   -.1041191
                                                  Employed  |   -.016472   .0243922    -0.68   0.499    -.0642799    .0313359
                                             Self Employed  |   .0020492   .0499899     0.04   0.967    -.0959291    .1000276
                             Wholly retired from paid work  |   .0638361   .0266037     2.40   0.016     .0116938    .1159783
                                                            |
                                                      age_y |  -.0023342   .0024538    -0.95   0.341    -.0071435    .0024751
                                                      _cons |    .165668   .0709221     2.34   0.019     .0266633    .3046728
------------------------------------------------------------+----------------------------------------------------------------
                                                    sigma_u |  .26997013
                                                    sigma_e |  .34561966
                                                        rho |  .37893873   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------------------------------------------------
 
.

But when I apply the weights I get the following error:

Code:

 
. *with inverse probability weights
. xtreg binary_health_y psum_unemployed_total_cont_y i.own_education_y i.maritalstatus_y i.medical_card_y i.employment_y age_y [pw=attwght] if gender==0 & sample==1, re
> robust cluster(current_county_y)
pweight not allowed with between-effects and random-effects models

My question is basically: what can I do? I would really like to stick with a random effects linear probability model for this panel data, as this is what I have been working on for the past several months. Is there another approach I should be taking here, or another way I can make this work? GLLAMM had popped into my head, but I don't know what implications it would have here or how to apply it. I could really do with some advice.
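
One possible route, offered only as a sketch, is a random-intercept model fitted with mixed, which accepts sampling weights at the panel level through the pweight() option of the random-effects equation. This assumes attwght is constant within id and that a maximum likelihood random-intercept model is an acceptable stand-in for the GLS random effects estimator. It does not reproduce the county-level clustering used above: with pweights, mixed reports robust standard errors clustered on the top level, here id.

```stata
* The level-2 weight must not vary within a woman; stop if it does
bysort id (year): assert attwght == attwght[1]

* Random-intercept linear probability model with a person-level weight
mixed binary_health_y psum_unemployed_total_cont_y i.own_education_y ///
    i.maritalstatus_y i.medical_card_y i.employment_y age_y          ///
    if gender == 0 & sample == 1 || id:, pweight(attwght)
```

Comparing the unweighted mixed estimates with the xtreg, re results first would show how much of any change comes from the estimator rather than from the weights.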

Tags: missing, panel data, Random-effects, regression, syntax

John Adler


06 Mar 2018, 11:14

To update on this:

I've decided a better approach may be a fixed effects linear probability model. This allows me to use inverse probability weights while still employing panel data methods, and I do not lose the large number of observations that a fixed effects logit model would drop because the outcome does not vary within many women.

Code:

. *with inverse probability weights
. xtreg binary_health_y psum_unemployed_total_cont_y i.own_education_y i.maritalstatus_y i.medical_card_y i.employment_y age_y [pw=attwght] if gender==0 & sample==1, fe 
> robust cluster(current_county_y)
note: 2.own_education_y omitted because of collinearity
note: 3.own_education_y omitted because of collinearity
note: 4.own_education_y omitted because of collinearity
note: 5.own_education_y omitted because of collinearity
note: 6.own_education_y omitted because of collinearity

Fixed-effects (within) regression               Number of obs      =      1439
Group variable: id                              Number of groups   =       722

R-sq:  within  = 0.0555                         Obs per group: min =         1
       between = 0.0000                                        avg =       2.0
       overall = 0.0049                                        max =         3

                                                F(14,28)           =         .
corr(u_i, Xb)  = -0.2075                        Prob > F           =         .

                                                                     (Std. Err. adjusted for 29 clusters in current_county_y)
-----------------------------------------------------------------------------------------------------------------------------
                                                            |               Robust
                                            binary_health_y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------------------------------------------+----------------------------------------------------------------
                               psum_unemployed_total_cont_y |   .0073465   .0032407     2.27   0.031     .0007082    .0139849
                                                            |
                                            own_education_y |
                                  Primary school education  |          0  (omitted)
                                     Some secondary school  |          0  (omitted)
                              Complete secondary education  |          0  (omitted)
    Some third level education at college, university, RTC  |          0  (omitted)
Complete third level education at college, university, RTC  |          0  (omitted)
                                                            |
                                            maritalstatus_y |
                                                Cohabiting  |   .1169182   .0450686     2.59   0.015     .0245995     .209237
                                                 Separated  |   -.037324   .0626287    -0.60   0.556     -.165613     .090965
                                                  Divorced  |   .0092487   .1472833     0.06   0.950    -.2924474    .3109449
                                                   Widowed  |   -.261783   .1968107    -1.33   0.194    -.6649314    .1413654
                                      Single/Never married  |   .0152127   .0713941     0.21   0.833    -.1310316    .1614569
                                                            |
                                             medical_card_y |
                                                       Yes  |  -.0713404   .0465643    -1.53   0.137     -.166723    .0240422
                                                            |
                                               employment_y |
                                                Unemployed  |  -.0388137   .0760116    -0.51   0.614    -.1945165    .1168891
  Unable to work owing to permanent sickness or disability  |  -.6618086   .1202496    -5.50   0.000    -.9081288   -.4154885
                                         At school/student  |  -.1283945   .0981452    -1.31   0.201    -.3294359     .072647
                           Seeking work for the first time  |  -.0835546   .0404353    -2.07   0.048    -.1663826   -.0007266
                                                  Employed  |  -.0362627   .0333763    -1.09   0.287     -.104631    .0321055
                                             Self Employed  |  -.0343689   .0580827    -0.59   0.559    -.1533459     .084608
                             Wholly retired from paid work  |    .005686   .0272892     0.21   0.836    -.0502133    .0615853
                                                            |
                                                      age_y |  -.0124721   .0053686    -2.32   0.028    -.0234693    -.001475
                                                      _cons |   1.187451   .1812941     6.55   0.000     .8160865    1.558815
------------------------------------------------------------+----------------------------------------------------------------
                                                    sigma_u |  .40830239
                                                    sigma_e |  .33676672
                                                        rho |  .59513514   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------------------------------------------------

This was informed by further trawling of the forums, particularly by reading some of the links provided by @Richard Williams in his response to a similar problem here: https://www.statalist.org/forums/for...-panel-dataset.

However, I have reached a stumbling block in following the steps for calculating inverse probability weights linked to here: http://www.chronicpoverty.org/upload...N-revfinal.pdf. It may seem like a very simple question, but when estimating the probit regressions, I don't know whether the predictors I include should be predictors of attrition or general predictors of the outcome of interest. Likewise, does it matter if they predict both?

I would be very grateful for an outside perspective.

Attached Files

Inverse probability weights in Stata.pdf (185.2 KB, 1 view)
