Graph of cluster specific regression lines after mixed looks like scribbles

Jeremy Reynolds

Join Date: Aug 2014

Posts: 55
#1


09 May 2025, 11:09

Dear Stata Experts,

I would like to graph cluster-specific regression lines after estimating a 2-level model using the mixed command. I am using data where individuals are clustered within countries (from the International Social Survey Programme).
I have followed the example at https://grodri.github.io/multilevel/lang2. The code below works nicely with the snijders data where students are clustered within schools.

Code:

use https://grodri.github.io/datasets/snijders, clear
sum iq_verb
gen iqvc = iq_verb - r(mean)
mixed langpost iqvc || schoolnr: iqvc, mle covariance(unstructured)
predict yhat2, fitted
sort schoolnr iqvc
line yhat2 iqvc, connect(ascending)

However, when I do this with my data, the graph I get looks like scribbles (see below). I think this may be an issue with sorting the data properly before graphing, but I haven't been able to fix it. Does anyone have any suggestions?

Thanks,

Jeremy

Code:

center tgid_7ns, gen(tgid_7ns_c)
mixed jobsat c.tgid_7ns_c $personalcontrolsfv $jobcontrolsfv $countrycontrolsfv || country: c.tgid_7ns_c if analysis==1 & woman==1, stddev
capture drop prjobsat
predict prjobsat if e(sample), fitted
sort country tgid_7ns_c
line prjobsat tgid_7ns_c if e(sample), connect(ascending)

Jeremy Reynolds

Join Date: Aug 2014

Posts: 55
#2

12 May 2025, 12:14

Hello again,

I just figured out that the problem comes from calculating predictions with the predict command when the multilevel regression contains multiple independent variables (IVs).

When there are multiple IVs in the regression, the code "predict yhat, fitted" calculates different predictions for each individual based on their values on all the IVs (as it should). This makes a messy graph.
This was not a problem in the examples I found because all those regressions only had one IV.

With multiple IVs, however, to effectively show the random intercept and slope for each cluster (as opposed to the predictions for every person), the predictions have to be calculated without incorporating individual-level differences (other than those on the focal IV).
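To make the distinction concrete, here is a sketch of the two quantities for the Snijders example with sex as the extra IV. The notation is my own rather than anything taken from Rodriguez's page: i indexes students, j indexes schools, and u_0j and u_1j are the predicted random intercept and random slope for school j.

```latex
% What "predict, fitted" returns: one value per student, using all IVs
\hat{y}^{\text{fitted}}_{ij} = (\hat{\beta}_0 + \hat{u}_{0j})
    + (\hat{\beta}_1 + \hat{u}_{1j})\,\mathrm{iqvc}_{ij}
    + \hat{\beta}_2\,\mathrm{sex}_{ij}

% What a cluster-specific line needs: only the focal IV varies
\hat{y}^{\text{line}}_{ij} = (\hat{\beta}_0 + \hat{u}_{0j})
    + (\hat{\beta}_1 + \hat{u}_{1j})\,\mathrm{iqvc}_{ij}
```

In the first expression, two students in the same school with the same iqvc but different sex get different predictions, so connecting the sorted points zigzags. The second expression is a single straight line per school; dropping the last term amounts to evaluating the line at sex = 0.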

Below are two examples with the Snijders data. The first example shows the problem that occurs when using the "predict, fitted" approach when there are multiple IVs in the regression.
The second example shows a solution drawing on code that Rodriguez posted at https://grodri.github.io/multilevel/lang2.

Hopefully this will be useful for someone else too.

Jeremy

Code:

use https://grodri.github.io/datasets/snijders, clear
sum iq_verb
gen iqvc = iq_verb - r(mean)
mixed langpost iqvc sex || schoolnr: iqvc, mle covariance(unstructured)
predict yhat, fitted
sort schoolnr iqvc
line yhat iqvc, connect(ascending)

Code:

use https://grodri.github.io/datasets/snijders, clear
sum iq_verb
gen iqvc = iq_verb - r(mean)
mixed langpost iqvc sex || schoolnr: iqvc, mle covariance(unstructured)
* get the random slope (r1) and random intercept (r2) for each school
predict r* if e(sample), reffects
* calculate cluster-level predictions manually (other IVs, here sex, held at 0)
gen pr_langpost = (_b[_cons] + r2) + (_b[iqvc] + r1)*iqvc
sort schoolnr iqvc
line pr_langpost iqvc, connect(ascending)
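One caveat about the manual formula: it leaves out the sex term, so each school's line is drawn for sex = 0. If it seems more natural to draw the lines at the sample mean of the other IV, a minimal sketch might look like the following. The names re_slope, re_cons, sexbar, and pr_atmean are my own, and I am assuming predict returns the random slope first and the random intercept second, as it does with r1 and r2 in the code above.

```stata
use https://grodri.github.io/datasets/snijders, clear
sum iq_verb
gen iqvc = iq_verb - r(mean)
mixed langpost iqvc sex || schoolnr: iqvc, mle covariance(unstructured)

* random slope and random intercept for each school (same order as r1, r2 above)
predict re_slope re_cons if e(sample), reffects

* store the estimation-sample mean of the non-focal IV
sum sex if e(sample), meanonly
scalar sexbar = r(mean)

* school-specific lines with sex held at its mean instead of 0
gen pr_atmean = (_b[_cons] + re_cons) + (_b[iqvc] + re_slope)*iqvc + _b[sex]*sexbar if e(sample)

sort schoolnr iqvc
line pr_atmean iqvc, connect(ascending)
```

The slopes are identical to those in the second example; every line is simply shifted vertically by _b[sex] times the mean of sex. With factor-variable controls (as in my ISSP model), leaving those terms out of the formula would similarly place every control at zero or its base level, so it is worth deciding deliberately which values the lines should represent.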