Longitudinal MLM estimates by year

Jeremiah Jaggers

#1

26 Aug 2024, 12:28

I am trying to estimate a longitudinal model using the mixed command. I would like to get estimates by year, but so far I have only been able to get that information by using the residual structure.

mixed DV time || SUBJECT:, residuals(independent, by(time))

Using the above command, I cannot get the ICC. I get the following warning:

estat icc not allowed after random-effects models with residual structures other than the
default independent structure

How can I get estimates by year and ICC using the independent structure?
Tags: syntax
Clyde Schechter

#2

26 Aug 2024, 12:43

The ICC is not definable when you request time-specific residuals. The definition of the ICC is V_subject/(V_subject + V_residual). When you specify -residuals(independent, by(time))-, a single V_residual no longer exists. Instead there is a separate V_t for each value t of the time variable. So you are not getting the ICC because it does not exist. The error message is somewhat unclear. It's not a matter of some rule enforced in the code of -estat icc-; it is the mathematics itself that prevents the calculation of an ICC for this model.
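
In symbols, the point can be sketched as follows. The notation is an assumption added for illustration: \sigma^2_u stands for V_subject, \sigma^2_e for V_residual, and \sigma^2_t for the time-specific V_t.

```latex
% Default residual structure: one residual variance, hence one ICC
\mathrm{ICC} = \frac{\sigma^2_{u}}{\sigma^2_{u} + \sigma^2_{e}}

% With residuals(independent, by(time)): one residual variance per time point
\mathrm{Var}(e_{it}) = \sigma^2_{t}, \qquad t = 1, \dots, T

% No single \sigma^2_{e} remains for the denominator,
% so the ratio above has no counterpart in this model.
```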

I should also add that, although your -mixed- command is legal, it somewhat contradicts itself about the nature of the time variable. On the one hand, you are specifying time as a continuous variable and estimating a single coefficient for a linear relationship between DV and time. On the other hand, in -residuals(independent, by(time))- you are treating it as a discrete variable used to stratify the analysis of residuals. This can be an appropriate model if there is reason to suspect that DV and time are linearly related but with residual heteroscedasticity due to effects of time. Just be certain that this is what you have in mind, and not an error in your code.

What specific estimates do you want by year? Do you want year-specific coefficients? If so, you need to specify time as discrete in the fixed-effects part of the model: i.time, not naked time. Learn more about Stata's factor-variable notation at -help fvvarlist-.
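
A minimal sketch of that suggestion, reusing the variable names from #1 (DV, time, SUBJECT) and leaving the residual structure at its default so that -estat icc- remains available. The -margins- and -testparm- lines are illustrative additions, not something requested in the thread, and the ICC reported after a model with covariates is conditional on those covariates.

```stata
* Year-specific fixed effects via factor-variable notation;
* the residual structure is left at the default (independent, constant variance)
mixed DV i.time || SUBJECT:

* Allowed here because no non-default residual structure was requested
estat icc

* Predicted mean of DV at each level of time
margins time

* Joint test that the time indicators are all zero
testparm i.time
```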
Jeremiah Jaggers

#3

26 Aug 2024, 13:03

Thank you for the thorough response! As a follow-up, since I am treating the relationship between time and DV as linear, and I do not expect any problems with residual heteroscedasticity, then it would be inappropriate to use the residual structure - am I understanding that correctly? I will try using i.time in the fixed effects.
Clyde Schechter

#4

26 Aug 2024, 13:19

"As a follow-up, since I am treating the relationship between time and DV as linear, and I do not expect any problems with residual heteroscedasticity, then it would be inappropriate to use the residual structure - am I understanding that correctly?"

Well, inappropriate is perhaps too strong a word, as it really applies only to your specific situation. If there is no time-based heteroscedasticity, then using -residuals(independent, by(time))- will produce similar estimated residual variances for each level of time (with some small amount of sampling variation). It would increase the computational burden (whether you would notice that depends on the size of your data set) and does you no good in the absence of heteroscedasticity, but otherwise wouldn't be a problem. It only becomes problematic because it prevents you from calculating an ICC. If you didn't want to do that, it would merely be inefficient, not inappropriate.

Erik Ruzek

26 Aug 2024, 14:31

Just adding to Clyde's informative responses, it is not uncommon in other modeling approaches, such as structural equation models, to treat time as continuous and allow each time point to have its own residual variance. The default in mixed is to treat the residual variance as constant, but you can certainly use model testing, such as a likelihood-ratio test, to determine whether the constant-variance model is appropriate for your data. Using the pig weight data, here is how you would do that:

Code:

webuse pig, clear
* Constant residual variance
mixed weight week || id:
estimates store constant

* Heterogeneous residual variance
mixed weight week || id: , residuals(independent, by(week))
estimates store hetero

* Test more complex heterogeneous variance model against simpler constant variance model
lrtest hetero constant, stats

For these data, at least, the heterogeneous residual variance model provides a superior fit. Output of the lrtest:

Code:

Likelihood-ratio test                                 LR chi2(8)  =    115.64
(Assumption: constant nested in hetero)               Prob > chi2 =    0.0000

Note: The reported degrees of freedom assumes the null hypothesis is not on the boundary of the parameter space.
      If this is not true, then the reported test is conservative.

Akaike's information criterion and Bayesian information criterion

-----------------------------------------------------------------------------
       Model |          N   ll(null)  ll(model)      df        AIC        BIC
-------------+---------------------------------------------------------------
      cons~t |        432          .  -1014.927       4   2037.854   2054.127
      hetero |        432          .  -957.1081      12   1938.216   1987.037
-----------------------------------------------------------------------------
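
As a check on the output above, the LR statistic and its degrees of freedom can be reproduced by hand from the ll(model) and df columns of the table. The symbols \ell_hetero and \ell_constant are just labels for those two log likelihoods. The 8 extra parameters presumably reflect one residual variance per week in place of a single one, assuming the 9 weeks used in the margins call.

```latex
\mathrm{LR} = 2\,(\ell_{\mathrm{hetero}} - \ell_{\mathrm{constant}})
            = 2\,\bigl(-957.1081 - (-1014.927)\bigr)
            = 2 \times 57.8189 \approx 115.64

\mathrm{df} = 12 - 4 = 8
```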

I also wanted to note that if you decide to keep week continuous, you can get the predicted mean weight at each week by using margins. This works no matter how you specify the residual variance structure:

Code:

margins, at(week = (1(1)9))

Last edited by Erik Ruzek; 26 Aug 2024, 14:38. Reason: Added webuse code
