
  • Graph of cluster-specific regression lines after mixed looks like scribbles

    Dear Stata Experts,

    I would like to graph cluster-specific regression lines after estimating a 2-level model using the mixed command. I am using data where individuals are clustered within countries (from the International Social Survey Programme).
    I have followed the example at https://grodri.github.io/multilevel/lang2. The code below works nicely with the snijders data where students are clustered within schools.

    Code:
    use https://grodri.github.io/datasets/snijders, clear
    sum iq_verb
    gen iqvc = iq_verb - r(mean)
    mixed langpost iqvc || schoolnr: iqvc, mle covariance(unstructured)
    predict yhat2, fitted
    sort schoolnr iqvc
    line yhat2 iqvc, connect(ascending)
    [Image: Graph.png]

    However, when I do this with my data, the graph I get looks like scribbles (see below). I think this may be an issue with sorting the data properly before graphing, but I haven't been able to fix it. Does anyone have any suggestions?

    Thanks,

    Jeremy

    Code:
    center tgid_7ns, gen(tgid_7ns_c)
    mixed jobsat c.tgid_7ns_c $personalcontrolsfv $jobcontrolsfv $countrycontrolsfv || country: c.tgid_7ns_c if analysis==1 & woman==1, stddev
    capture drop prjobsat
    predict prjobsat if e(sample), fitted
    sort country tgid_7ns_c
    line prjobsat tgid_7ns_c if e(sample), connect(ascending)
    [Image: Screenshot 2025-05-09 130036.png]

    Last edited by Jeremy Reynolds; 09 May 2025, 11:12.

  • #2
    Hello again,

    I just figured out that the problem comes from how the predict command calculates predictions when the multilevel regression contains more than one independent variable (IV).

    When there are multiple IVs in the regression, the code "predict yhat, fitted" calculates a separate prediction for each individual based on that individual's values on all the IVs (as it should). Within a cluster, two people with the same value on the focal IV can therefore have different fitted values, so the connected line zigzags and the graph looks messy.
    This was not a problem in the examples I found because each of those regressions had only one IV.

    With multiple IVs, to show the random intercept and slope for each cluster (as opposed to the predictions for every person), the predictions have to be calculated without incorporating individual-level differences other than those on the focal IV.

    Below are two examples with the Snijders data. The first example shows the problem that occurs when using the "predict, fitted" approach when there are multiple IVs in the regression.
    The second example shows a solution drawing on code that Rodriguez posted at https://grodri.github.io/multilevel/lang2.

    Hopefully this will be useful for someone else too.

    Jeremy


    Code:
    use https://grodri.github.io/datasets/snijders, clear
    
    sum iq_verb
    gen iqvc = iq_verb - r(mean)
    
    mixed langpost iqvc sex || schoolnr: iqvc, mle covariance(unstructured)
    
    predict yhat, fitted
    
    sort schoolnr iqvc
    line yhat iqvc, connect(ascending)
    [Image: Graph.png]
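
    One quick way to check this diagnosis (a minimal sketch, not part of the code from Rodriguez's page) is to draw the fitted lines for a single category of sex. Within one category the fitted values depend only on the school and on iqvc, so the lines should come out straight. The variable name yhat_chk and the local sexmin are made up for illustration, and r(min) is used so that the code does not assume how sex is coded.

    Code:
    use https://grodri.github.io/datasets/snijders, clear

    summarize iq_verb, meanonly
    generate iqvc = iq_verb - r(mean)

    mixed langpost iqvc sex || schoolnr: iqvc, mle covariance(unstructured)
    predict yhat_chk if e(sample), fitted

    *restrict the plot to one category of sex (its lowest value, whatever the coding)
    summarize sex if e(sample), meanonly
    local sexmin = r(min)

    sort schoolnr iqvc
    line yhat_chk iqvc if sex == `sexmin' & e(sample), connect(ascending)

    If this graph shows clean lines while the unrestricted one does not, the scribbles come from the extra IV rather than from the sort order.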

    Code:
    use https://grodri.github.io/datasets/snijders, clear
    
    sum iq_verb
    gen iqvc = iq_verb - r(mean)
    
    mixed langpost iqvc sex || schoolnr: iqvc, mle covariance(unstructured)
    
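    *get random slope and intercept for each school
    *(mixed returns the slope first: r1 = random slope for iqvc, r2 = random intercept)
    predict r* if e(sample), reffects
    
    *calculate cluster-level predictions manually
    *(sex is left out, so each line is the school's prediction at sex = 0)
    gen pr_langpost = (_b[_cons] + r2) + (_b[iqvc] + r1)*iqvc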
    
    sort schoolnr iqvc
    line pr_langpost iqvc, connect(ascending)
    [Image: Graph2.png]
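
    If predictions at sex = 0 are not the most useful reference point, the other IV can be held at its sample mean instead. The following is only a sketch under that assumption; sexbar and pr_mean are illustrative names, and the describe line is there to confirm from the variable labels which of r1 and r2 is the slope and which is the intercept. With more IVs in the model, each one would add its own coefficient-times-mean term to the intercept.

    Code:
    use https://grodri.github.io/datasets/snijders, clear

    summarize iq_verb, meanonly
    generate iqvc = iq_verb - r(mean)

    mixed langpost iqvc sex || schoolnr: iqvc, mle covariance(unstructured)

    *random effects for each school; check the labels to see which is which
    predict r* if e(sample), reffects
    describe r1 r2

    *hold sex at its mean in the estimation sample
    summarize sex if e(sample), meanonly
    scalar sexbar = r(mean)

    *school-specific line with sex fixed at its mean (assumes r1 = slope, r2 = intercept)
    generate pr_mean = (_b[_cons] + _b[sex]*sexbar + r2) + (_b[iqvc] + r1)*iqvc if e(sample)

    sort schoolnr iqvc
    line pr_mean iqvc, connect(ascending)

    Compared with the graph above, every line is shifted by the same constant, _b[sex]*sexbar, and the slopes are unchanged.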

