Time-dependent covariates. Is a random intercept model enough?

Gianfranco Di Gennaro

Join Date: Oct 2020
Posts: 140


13 Jun 2025, 13:46

Dear all, I have a dataset with data from different European countries, measured repeatedly over the years.
My outcome ("vaccin") is the percentage of vaccinated people.
Adjustment variables are the gross domestic product ("gdpx"), the percentage of women with a high level of education ("highfemale") and the level of trust in government ("trustingov").
My interest is in tracing how the percentage of vaccinated people changes with each year that passes since the start of the vaccination campaign ("yfromi", years from initial campaign).

I wonder whether a random intercept model is enough to answer my research question.

In the past, I have decomposed time-dependent covariates into their within- and between-country variability for a similar analysis. But I would like to avoid complicating matters.

Code:

mixed vaccin highfemale gdpx trustingov i.yfromi || Country:
margins i.yfromi
marginsplot

Eventually I might consider investigating the need for an autoregressive correlation matrix.
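A minimal sketch of how that check could look, assuming yfromi is integer-valued and unique within each country. The stored-estimate names ri and ar1 are illustrative, and the models are the one above with and without an AR(1) residual structure:

```stata
* Random-intercept model with independent residuals (the model above)
mixed vaccin highfemale gdpx trustingov i.yfromi || Country:
estimates store ri

* Same model with AR(1) within-country residuals, ordered by yfromi
mixed vaccin highfemale gdpx trustingov i.yfromi || Country:, ///
    residuals(ar 1, t(yfromi))
estimates store ar1

* Likelihood-ratio test of the AR(1) structure against independence
lrtest ri ar1
```

With only 15 countries and many year indicators, the AR(1) model may be slow to converge or fail to converge, so this is a starting point rather than a recipe.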

I have data from 15 countries. In the dataex below I show only four of them.

I thank you in advance.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str51 Country float yfromi byte vaccin float gdpx double highfemale int trustingov
"Germany"  .  .  .26205033    .  .
"Germany"  .  .   .2664895    .  .
"Germany"  .  .   .2862479    .  .
"Germany"  .  .    .303689    . 49
"Germany"  .  .   .3156603    . 36
"Germany"  .  .  .30757955    . 40
"Germany"  3 27   .3249038    . 32
"Germany"  4 27  .34838015    . 32
"Germany"  5 27  .35571855    . 39
"Germany"  6 29  .37007475    . 38
"Germany"  7 31   .3882204    . 48
"Germany"  8 33   .3966575 21.3 38
"Germany"  9 37   .4246713 22.1 39
"Germany" 10 40   .4472591 22.6 59
"Germany" 11 43  .46653605 23.2 54
"Germany" 12 43    .492498   24 45
"Germany" 13 43   .4880368 25.4 61
"Germany" 14 47   .5202314 26.3 50
"Germany" 15 53  .56638926 26.5 49
"Germany" 16 54   .5763466 27.7 44
"Germany"  .  .          .   29 34
"Germany"  .  .          .    .  .
"Italy"    .  .  .17108352    .  .
"Italy"    .  .   .1753017    .  .
"Italy"    .  .  .18956916    .  .
"Italy"    .  .  .20041035    . 37
"Italy"    .  .   .2097399    . 15
"Italy"    .  .  .20536943    . 26
"Italy"    1 44   .2092363    . 25
"Italy"    2 52   .2184514    . 23
"Italy"    3 54   .2183805    . 11
"Italy"    4 55   .2198919    . 10
"Italy"    5 50     .22117    . 18
"Italy"    6 56   .2252253 17.6 16
"Italy"    7 53  .24336183 17.9 15
"Italy"    8 50  .25286356 19.1 17
"Italy"    9 40  .26096517 19.8 15
"Italy"   10 53   .2787073 20.1 30
"Italy"   11 31  .26412037 20.6 29
"Italy"   12 32  .29463336 20.8 38
"Italy"   13 39   .3317676 21.1 33
"Italy"   14 45   .3409903 22.2 36
"Italy"    .  .          .   23 33
"Italy"    .  .          .    .  .
"Serbia"   .  . .006505113    .  .
"Serbia"   .  .  .00699257    .  .
"Serbia"   .  . .007756601    .  .
"Serbia"   .  . .008625991    .  .
"Serbia"   .  . .009645332    .  .
"Serbia"   .  . .009539871    .  .
"Serbia"   0  2 .017489076 25.5 48
"Serbia"   1  3  .01904064 26.6 43
"Serbia"   .  .          . 28.6 49
"Serbia"   .  .          .    .  .
"Spain"    .  .  .11229786    .  .
"Spain"    .  .   .1206124    .  .
"Spain"    .  .  .13654529    .  .
"Spain"    .  .    .147156    . 52
"Spain"    .  .   .1532567    . 55
"Spain"    .  .  .14929554    . 29
"Spain"    3 64  .14815281    . 20
"Spain"    4 66  .14964943    . 21
"Spain"    5 71   .1490647    . 13
"Spain"    6 75   .1519935    .  9
"Spain"    7 73   .1568559    . 11
"Spain"    8 79  .16358323 34.6 14
"Spain"    9 78   .1747417 35.4 14
"Spain"   10 82   .1854711 35.8 18
"Spain"   11 76  .19187716 36.6 17
"Spain"   12 79  .20752287 37.9 25
"Spain"   13 79  .18625443   39 25
"Spain"   14 77   .2099468 39.4 22
"Spain"   15 81  .24116737 39.9 23
"Spain"   16 75  .25735554 40.4 28
"Spain"    .  .          . 40.7 27
"Spain"    .  .          .    .  .
end

Tags: panel data, random intercept, time dependent covariates

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17707

14 Jun 2025, 00:54

Gianfranco:
as your research question seems to focus on within-panel variation, why not go with -xtreg,fe- instead?:

Code:

encode Country, g( Country_num )
xtset Country_num
xtreg vaccin highfemale gdpx trustingov i.yfromi , fe
note: 16.yfromi omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =         29
Group variable: Country_num                     Number of groups  =          4

R-squared:                                      Obs per group:
     Within  = 0.8836                                         min =          2
     Between = 0.8523                                         avg =        7.2
     Overall = 0.6821                                         max =          9

                                                F(14, 11)         =       5.96
corr(u_i, Xb) = -0.9275                         Prob > F          =     0.0026

------------------------------------------------------------------------------
      vaccin | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
  highfemale |    6.77488   1.936685     3.50   0.005     2.512264     11.0375
        gdpx |    242.287   48.67189     4.98   0.000     135.1609    349.4131
  trustingov |   -.071712   .1892228    -0.38   0.712    -.4881887    .3447646
             |
      yfromi |
          1  |  -7.186852   5.265909    -1.36   0.200    -18.77704    4.403336
          6  |   91.97459   18.53344     4.96   0.000     51.18277    132.7664
          7  |   82.47617   17.56674     4.70   0.001     43.81204    121.1403
          8  |   66.55434    15.7383     4.23   0.001     31.91457    101.1941
          9  |   55.18508   13.98233     3.95   0.002     24.41017    85.95999
         10  |   55.95038   11.68448     4.79   0.001     30.23302    81.66775
         11  |   42.26282   10.67608     3.96   0.002     18.76493     65.7607
         12  |   32.76868   8.496611     3.86   0.003     14.06776    51.46959
         13  |   28.12067   6.742842     4.17   0.002     13.27977    42.96157
         14  |   19.84608   5.157656     3.85   0.003     8.494154      31.198
         15  |     11.426   3.935979     2.90   0.014     2.762972    20.08904
         16  |          0  (omitted)
             |
       _cons |  -240.8148   69.26923    -3.48   0.005    -393.2753   -88.35423
-------------+----------------------------------------------------------------
     sigma_u |  45.840571
     sigma_e |  3.4352204
         rho |  .99441559   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3, 11) = 8.20                       Prob > F = 0.0038

.

Off topic: your model may suffer from latent variable-led endogeneity, as individual ability was not measured. Put differently, other things being equal, smarter human beings are, on average, more likely to pursue higher education levels and to be in favour of vaccinations, being more informed about their pros and cons.

Kind regards,
Carlo
(Stata 19.0)


Gianfranco Di Gennaro

Join Date: Oct 2020

Posts: 140
#3

14 Jun 2025, 16:23

Dear Carlo,

thank you very much for your valuable feedback and for raising the endogeneity issue.

Among other things, I tried implementing the Mundlak approach (essentially a middle ground between fixed and random effects) to incorporate between-country variation as well:

Code:

mixed vaccin i.yfromi trustingov highfemale gdpx mean_gdpx mean_highfemale mean_trustingov || Country:

Essentially, it gives me very similar results to the model you proposed:

Code:

xtreg vaccin highfemale gdpx trustingov i.yfromi, fe

However, when testing the means (Mundlak test):

Code:

test mean_highfemale mean_gdpx mean_trustingov

I get a p-value = 0.6281, indicating no evidence of correlation between random effects and predictors. This suggests the random effects model is appropriate and that, at least for time-invariant omitted variables (culture, healthcare system and so on), the endogeneity issue is mitigated.
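The thread does not show how the mean_* variables were built. A minimal sketch of one possible construction, starting from the data as posted and assuming the country means are taken over complete rows only, so that they match the estimation sample (nmiss is an illustrative helper name):

```stata
* Count missing values among the model variables in each row
egen byte nmiss = rowmiss(vaccin yfromi highfemale gdpx trustingov)

* Country means of each time-varying covariate, using complete rows only
foreach v of varlist highfemale gdpx trustingov {
    bysort Country: egen double mean_`v' = mean(cond(nmiss == 0, `v', .))
}

* Mundlak-type model and joint Wald test of the country means
mixed vaccin i.yfromi trustingov highfemale gdpx ///
    mean_gdpx mean_highfemale mean_trustingov || Country:
test mean_highfemale mean_gdpx mean_trustingov
```

Computing the means over all available rows instead would also run, but the means would then describe a different set of years than the one the model is fitted on.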

Unfortunately, for other sources of endogeneity (unobserved time-varying variables such as local information campaigns, etc.), I currently lack adequate instrumental variables. (I'd welcome any suggestions on how to address this aspect.)

Anyway...I have a question for you:

when using an unadjusted model

Code:

xtreg vaccin i.yfromi, fe
margins i.yfromi, atmeans
marginsplot

the marginsplot for yfromi shows a plateau after 2-3 years (as if there's a "hard core" of people unwilling to vaccinate).

However, in the adjusted models (both yours and the Mundlak version) I see a slightly continuing upward trend.

How would you interpret these different patterns? What do they tell me?
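One thing that might be worth ruling out first is a difference in estimation samples. In the four countries shown, highfemale is missing for the early years, so the adjusted model uses 29 observations (see the -xtreg, fe- output above), whereas 44 rows have non-missing yfromi and vaccin. A minimal sketch of a like-for-like comparison, assuming Country_num has been created with -encode- as in Carlo's reply (insample and the graph names are illustrative):

```stata
* Assumes Country_num was created with -encode- as in Carlo's reply
xtset Country_num

* Adjusted model first, so that its estimation sample can be flagged
xtreg vaccin highfemale gdpx trustingov i.yfromi, fe
generate byte insample = e(sample)   // insample is an illustrative name
margins yfromi, atmeans
marginsplot, name(adjusted, replace) title("Adjusted")

* Unadjusted model restricted to the same observations
xtreg vaccin i.yfromi if insample, fe
margins yfromi
marginsplot, name(unadjusted, replace) title("Unadjusted, same sample")

* Side-by-side comparison on a common y-axis
graph combine unadjusted adjusted, ycommon
```

If the plateau disappears once both models use the same rows, the difference between the plots would reflect which years enter the model rather than the adjustment itself.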

Of course, my deepest gratitude for your insights and your time.

Gianfranco
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#4

15 Jun 2025, 00:42

Gianfranco:
1) if patients' age is available, is the predictor -c.age##c.age- informative? If it were, there may be a turning point somewhere in the probability of undergoing vaccination.
2) Endogeneity: a possible instrument for higher education is the patient's proximity to academic institutions (Card D. Using Geographic Variation in College Proximity to Estimate the Return to Schooling. NBER Working Paper 4483. DOI 10.3386/w4483. Issue date October 1993). As usual, this instrument should pass the usual tests to be reliable.

Kind regards,
Carlo
(Stata 19.0)
Joseph Coveney

Join Date: Apr 2014

Posts: 4406
#5

15 Jun 2025, 01:49

Originally posted by Gianfranco Di Gennaro

. . . when using an unadjusted model . . . the marginsplot for yfromi shows a plateau after 2-3 years (as if there's a "hard core" of people unwilling to vaccinate).

However, in the adjusted models . . . I see a slightly continuing upward trend.

How would you interpret these different patterns? What do they tell me?

At least in the four countries for which you show data, there doesn't seem to be much of a plateau except perhaps for Spain and that I'd be inclined to attribute to a ceiling effect. And as far as adjusting for these particular covariates, no consistent pattern of temporal covariation emerges in the time-course plots (first figure below). Do you see something obvious in the other eleven countries' data?

Looking at coarsened summaries (nationwise means), the only possibility hinted at from the scatter plots (second figure below) seems to be for the trust-in-government variable and that a negative association—counterintuitive if these are government-sponsored campaigns.

Do-file for the figures and its log file are attached. To prepare them I assumed that your data listing is in order of years relative to the start of each nation's vaccination campaign regardless of whether the rows' yfromi is missing-valued, and that the GDP values have been standardized across countries so that they are comparable.
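The attached do-file is not reproduced in the thread. As a rough sketch only, and not the attached code, figures of this kind could be drawn from the data as posted along these lines, under the same row-order assumption (row and seq are illustrative names):

```stata
* Keep the listing order, then index rows within each country
generate long row = _n
bysort Country (row): generate int seq = _n

* Time-course plots of the outcome and two covariates, one panel per country
line vaccin trustingov highfemale seq, by(Country) ///
    xtitle("Row order within country")

* Nationwise means: outcome against trust in government
preserve
collapse (mean) vaccin trustingov highfemale gdpx, by(Country)
scatter vaccin trustingov, mlabel(Country)
restore
```

Sorting on row inside -bysort- matters here, because a plain sort on Country does not guarantee that the original order is kept within each country.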

Attached Files

Vaccine Uptake.smcl (9.0 KB, 1 view)

Vaccine Uptake.do (5.2 KB, 1 view)