I’m having difficulty with a mixed effects model I’m trying to run. I have a data set looking at exposure to Lassa fever, and infectious disease endemic to west Africa. I am interested in building a model that describes correlates of exposure to Lassa fever (i.e., what factors are significantly associated with a positive ELISA antibody test for Lassa fever (LASV) exposure). Our data suggests a high degree of clustering within households – if one person shows evidence of exposure to Lassa virus, the ‘secondary attack rate’ or likelihood that a second person in the house was also exposed was high. We expected this, and anticipated needing to use a hierarchical modeling structure in our multivariable modeling.
I had selected the Stata command mepoisson, with household (the variable hh_new in our data set) as the random effects variable. I ran that model, then out of curiosity, I ran the model without hh_new as a random effects variable, and got nearly the same outcome. I was surprised by this, and I assume I have miscoded my command, so that Stata doesn’t properly consider the hh_new variable as a random effects variable.
I am relatively new to this type of model, can you see any obvious errors here? Did I miss some option to properly tell Stata that household should be considered as a random effect?
Here’s my output:
. // Set up the hierarchical model with single independent variables as a fixed effect
. // household is random effects
. // using a Poisson link
. tab district lasv_pos , row
+----------------+
| Key |
|----------------|
| frequency |
| row percentage |
+----------------+
2.7 | LASV ELISA status
District | Negative Positive | Total
-----------+----------------------+----------
Kenema | 1,775 1,405 | 3,180
| 55.82 44.18 | 100.00
-----------+----------------------+----------
Port Loko | 975 70 | 1,045
| 93.30 6.70 | 100.00
-----------+----------------------+----------
Tonkolili | 3,081 837 | 3,918
| 78.64 21.36 | 100.00
-----------+----------------------+----------
Total | 5,831 2,312 | 8,143
| 71.61 28.39 | 100.00
. mepoisson lasv_pos district_ken district_ton || hh_new:, irr
Fitting fixed-effects model:
Iteration 0: log likelihood = -6537.6159
Iteration 1: log likelihood = -4975.6947
Iteration 2: log likelihood = -4940.9195
Iteration 3: log likelihood = -4940.8151
Iteration 4: log likelihood = -4940.815
Refining starting values:
Grid node 0: log likelihood = -5133.1424
Fitting full model:
Iteration 0: log likelihood = -5133.1424 (not concave)
Iteration 1: log likelihood = -5012.8385 (not concave)
Iteration 2: log likelihood = -4940.5265
Iteration 3: log likelihood = -4936.5665
Iteration 4: log likelihood = -4936.5525
Iteration 5: log likelihood = -4936.5525
Mixed-effects Poisson regression Number of obs = 8,143
Group variable: hh_new Number of groups = 831
Obs per group:
min = 1
avg = 9.8
max = 20
Integration method: mvaghermite Integration pts. = 7
Wald chi2(2) = 406.15
Log likelihood = -4936.5525 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
lasv_pos | IRR Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
district_ken | 6.614448 .8310112 15.04 0.000 5.170735 8.461258
district_ton | 3.180607 .4058815 9.07 0.000 2.476778 4.084445
_cons | .0653874 .0080086 -22.27 0.000 .0514328 .0831282
-------------+----------------------------------------------------------------
hh_new |
var(_cons)| .0502412 .0190299 .0239139 .1055528
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation to incidence-rate ratios.
Note: _cons estimates baseline incidence rate (conditional on zero random effects).
LR test vs. Poisson model: chibar2(01) = 8.53 Prob >= chibar2 = 0.0018
. mepoisson lasv_pos district_ken district_ton , irr /* removing the hierarchical structure changes very little. This is not expected. */
Iteration 0: log likelihood = -6537.6159
Iteration 1: log likelihood = -4975.6947
Iteration 2: log likelihood = -4940.9195
Iteration 3: log likelihood = -4940.8151
Iteration 4: log likelihood = -4940.815
Poisson regression Number of obs = 8,143
Wald chi2(2) = 454.05
Log likelihood = -4940.815 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
lasv_pos | IRR Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
district_ken | 6.595794 .8077476 15.40 0.000 5.188305 8.385108
district_ton | 3.189179 .396799 9.32 0.000 2.499033 4.069919
_cons | .0669857 .0080063 -22.62 0.000 .0529961 .0846682
------------------------------------------------------------------------------
Note: _cons estimates baseline incidence rate (conditional on zero random effects).
Thank you,
Matt
I had selected the Stata command mepoisson, with household (the variable hh_new in our data set) as the random effects variable. I ran that model, then out of curiosity, I ran the model without hh_new as a random effects variable, and got nearly the same outcome. I was surprised by this, and I assume I have miscoded my command, so that Stata doesn’t properly consider the hh_new variable as a random effects variable.
I am relatively new to this type of model, can you see any obvious errors here? Did I miss some option to properly tell Stata that household should be considered as a random effect?
Here’s my output:
. // Set up the hierarchical model with single independent variables as a fixed effect
. // household is random effects
. // using a Poisson link
. tab district lasv_pos , row
+----------------+
| Key |
|----------------|
| frequency |
| row percentage |
+----------------+
2.7 | LASV ELISA status
District | Negative Positive | Total
-----------+----------------------+----------
Kenema | 1,775 1,405 | 3,180
| 55.82 44.18 | 100.00
-----------+----------------------+----------
Port Loko | 975 70 | 1,045
| 93.30 6.70 | 100.00
-----------+----------------------+----------
Tonkolili | 3,081 837 | 3,918
| 78.64 21.36 | 100.00
-----------+----------------------+----------
Total | 5,831 2,312 | 8,143
| 71.61 28.39 | 100.00
. mepoisson lasv_pos district_ken district_ton || hh_new:, irr
Fitting fixed-effects model:
Iteration 0: log likelihood = -6537.6159
Iteration 1: log likelihood = -4975.6947
Iteration 2: log likelihood = -4940.9195
Iteration 3: log likelihood = -4940.8151
Iteration 4: log likelihood = -4940.815
Refining starting values:
Grid node 0: log likelihood = -5133.1424
Fitting full model:
Iteration 0: log likelihood = -5133.1424 (not concave)
Iteration 1: log likelihood = -5012.8385 (not concave)
Iteration 2: log likelihood = -4940.5265
Iteration 3: log likelihood = -4936.5665
Iteration 4: log likelihood = -4936.5525
Iteration 5: log likelihood = -4936.5525
Mixed-effects Poisson regression Number of obs = 8,143
Group variable: hh_new Number of groups = 831
Obs per group:
min = 1
avg = 9.8
max = 20
Integration method: mvaghermite Integration pts. = 7
Wald chi2(2) = 406.15
Log likelihood = -4936.5525 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
lasv_pos | IRR Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
district_ken | 6.614448 .8310112 15.04 0.000 5.170735 8.461258
district_ton | 3.180607 .4058815 9.07 0.000 2.476778 4.084445
_cons | .0653874 .0080086 -22.27 0.000 .0514328 .0831282
-------------+----------------------------------------------------------------
hh_new |
var(_cons)| .0502412 .0190299 .0239139 .1055528
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation to incidence-rate ratios.
Note: _cons estimates baseline incidence rate (conditional on zero random effects).
LR test vs. Poisson model: chibar2(01) = 8.53 Prob >= chibar2 = 0.0018
. mepoisson lasv_pos district_ken district_ton , irr /* removing the hierarchical structure changes very little. This is not expected. */
Iteration 0: log likelihood = -6537.6159
Iteration 1: log likelihood = -4975.6947
Iteration 2: log likelihood = -4940.9195
Iteration 3: log likelihood = -4940.8151
Iteration 4: log likelihood = -4940.815
Poisson regression Number of obs = 8,143
Wald chi2(2) = 454.05
Log likelihood = -4940.815 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
lasv_pos | IRR Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
district_ken | 6.595794 .8077476 15.40 0.000 5.188305 8.385108
district_ton | 3.189179 .396799 9.32 0.000 2.499033 4.069919
_cons | .0669857 .0080063 -22.62 0.000 .0529961 .0846682
------------------------------------------------------------------------------
Note: _cons estimates baseline incidence rate (conditional on zero random effects).
Thank you,
Matt
Comment