Overlapping clusters

Ben Littenberg

Join Date: Apr 2014

Posts: 13
#1

13 Nov 2018, 11:31

Hello, Statalist:

I am collecting functional status data (using the PROMIS-29 survey) from a random sample of patients from 40 different practices. Each practice contributes ~75 patients. I also have some practice-level descriptors that are the opinions (our practice is good/poor/indifferent at this or that function using the PIP survey) of a convenience sample of practice staff. Each practice contributes about 4 opinions in numeric form. I am interested in the relationship between the PIP scores (N=160) and the PROMIS scores (N=3000) adjusting for the clustering within practice (N=40). I am using Stata 15.1. I think a linear model is reasonable.

One thought is to aggregate the PIP scores within practice (perhaps using the mean or median of the practice-level PIP) as the predictor, the PROMIS as the outcome, and adjust for clustering within practice:

Code:

regress promis pip_median, cluster(practice)

but this assumes there is no variance in the PIP scores within practice.

Is there a better way to specify the model, perhaps hierarchically, so that the overlapping clusters are respected?

Thanks for your help!

Ben Littenberg
University of Vermont
David Benson

Join Date: Oct 2018

Posts: 489
#2

19 Nov 2018, 13:32

Hi Ben,

Welcome to Statalist!

So given your title I thought the issue would be that there was a hierarchy of practices owned / controlled by the same hospital? (I'm thinking of the classic case of classrooms in a school and you would want to cluster at both the classroom and school level).

After reading your post, it sounds like your concern is that you have 4 opinions per practice, and there may be disagreement among the raters over their self-reported proficiency. Personally, I would think of this as a form of measurement error: the practice has some "true" level of proficiency, and the raters differ in their experience within the practice, their experience with other practices, who they are thinking of as the reference group, and how hard a grader they are, all of which can lead to a lack of inter-rater reliability.

In practice, however, it sounds like this is a 3- or 4- point scale, so there may not be much variation. (i.e. How many respondents rate their own practice as poor?) So, I would go with what you suggested and take the mean of the practice-level PIP (use median as a robustness check, I would be *really* surprised if it changed anything.) If you want to convince yourself, you could also run your regressions with PIP_min and PIP_max (using the min or max of the 4 scores). If you had more raters per practice, and they rated themselves on a 7 or 10-pt scale, it might be worthwhile to calculate some measure of the "spread" of the opinion of the raters rating the practice (i.e. PIP_variance), which would give you some measure of the uncertainty around the opinion.

In my mind, the most interesting part about your study would be matching how well practices rate themselves vs how good they are based on outcomes. I could imagine a Lake Wobegon effect where all of the practices believed they were above average. See the Wikipedia page on "Illusory superiority" for many examples. My favorite is the famous 1981 survey that found that 93% of American drivers considered themselves "above average" drivers. And, in a 1977 survey of faculty at University of Nebraska–Lincoln, 90% of faculty rated themselves as above average teachers (apparently 68% put themselves in the top 25%). (Both are cited in the Wikipedia article).

Given that you have ~75 patients per practice in your sample, you might also run it with fixed effects using -xtreg-.

Code:

regress promis pip_mean, vce(cluster practice_id)
regress promis pip_mean i.practice_id, vce(cluster practice_id)
xtreg promis pip_mean, fe i(practice_id) vce(cluster practice_id)

Last edited by David Benson; 19 Nov 2018, 13:40.

Ben Littenberg

Join Date: Apr 2014
Posts: 13

26 Nov 2018, 19:58

Thank you, David.

Your second interpretation is correct: it is a measurement error problem (or at least a variance issue). The scale has much more range, however, as it is made up of 30 5-point Likert questions. And, yes, we are very interested in seeing if the ratings are associated with the outcomes.

Looking at your code, the first line is just what I was planning to do (so good).

Code:

. reg soc_tscore total_mean, vce(cluster practice)

Linear regression                               Number of obs     =      1,705
                                                F(1, 27)          =       3.10
                                                Prob > F          =     0.0896
                                                R-squared         =     0.0045
                                                Root MSE          =     9.9216

                              (Std. Err. adjusted for 28 clusters in practice)
------------------------------------------------------------------------------
             |               Robust
  soc_tscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  total_mean |   .0477538   .0271175     1.76   0.090    -.0078866    .1033942
       _cons |   45.87862   1.626045    28.21   0.000     42.54226    49.21499
------------------------------------------------------------------------------

The second version adds an indicator variable for each practice, which is also the clustering variable. I have often wondered if I am violating an assumption when I let the same characteristic serve as both a covariate and a clustering factor. Is there any bad side effect from doing this?

Code:

. reg soc_tscore total_mean i.practice, vce(cluster practice) nofvlabel
note: 3899.practice omitted because of collinearity

Linear regression                               Number of obs     =      1,705
                                                F(0, 27)          =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0492
                                                Root MSE          =     9.7713

                              (Std. Err. adjusted for 28 clusters in practice)
------------------------------------------------------------------------------
             |               Robust
  soc_tscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  total_mean |   .0854458   7.83e-14  1.1e+12   0.000     .0854458    .0854458
             |
    practice |
       1356  |   -3.63802   4.73e-13 -7.7e+12   0.000     -3.63802    -3.63802
       1357  |  -2.154061   1.82e-12 -1.2e+12   0.000    -2.154061   -2.154061
       1358  |   -2.05219   2.23e-12 -9.2e+11   0.000     -2.05219    -2.05219
       1359  |  -1.893544   4.85e-13 -3.9e+12   0.000    -1.893544   -1.893544
       1360  |  -.2766991   8.83e-13 -3.1e+11   0.000    -.2766991   -.2766991
       1866  |  -1.686026   1.81e-12 -9.3e+11   0.000    -1.686026   -1.686026
       1867  |  -4.486688   2.97e-12 -1.5e+12   0.000    -4.486688   -4.486688
       1868  |  -4.405869   1.36e-12 -3.2e+12   0.000    -4.405869   -4.405869
       1869  |  -2.296521   2.22e-12 -1.0e+12   0.000    -2.296521   -2.296521
       2074  |    .942721   5.59e-13  1.7e+12   0.000      .942721     .942721
       2175  |  -.9318264   2.83e-12 -3.3e+11   0.000    -.9318264   -.9318264
       2176  |   1.613394   1.12e-12  1.4e+12   0.000     1.613394    1.613394
       2177  |   3.900384   9.92e-13  3.9e+12   0.000     3.900384    3.900384
       2178  |   .8307847   1.27e-12  6.6e+11   0.000     .8307847    .8307847
       2179  |   .7154492   6.50e-13  1.1e+12   0.000     .7154492    .7154492
       2280  |  -1.253113   1.78e-12 -7.0e+11   0.000    -1.253113   -1.253113
       2482  |  -1.258662   2.73e-13 -4.6e+12   0.000    -1.258662   -1.258662
       2684  |  -4.719069   1.59e-12 -3.0e+12   0.000    -4.719069   -4.719069
       2987  |   -.880823   3.53e-13 -2.5e+12   0.000     -.880823    -.880823
       3290  |   1.839087   7.51e-13  2.4e+12   0.000     1.839087    1.839087
       3391  |  -3.527661   1.14e-12 -3.1e+12   0.000    -3.527661   -3.527661
       3492  |   5.089015   2.86e-12  1.8e+12   0.000     5.089015    5.089015
       3593  |  -7.954374   1.67e-12 -4.8e+12   0.000    -7.954374   -7.954374
       3896  |  -3.607874   1.17e-12 -3.1e+12   0.000    -3.607874   -3.607874
       3897  |  -2.917258   2.75e-13 -1.1e+13   0.000    -2.917258   -2.917258
       3898  |  -3.651017   6.71e-13 -5.4e+12   0.000    -3.651017   -3.651017
       3899  |          0  (omitted)
             |
       _cons |   45.68639   3.53e-12  1.3e+13   0.000     45.68639    45.68639
------------------------------------------------------------------------------

I am surprised that the coefficient on the total PIP got bigger (0.05 to 0.09) and much more significant even while every one of the practices was significantly different from each other. This sure looks like a mis-specification of some sort. I'm even more confused by the output of the last line:

Code:

. xtreg soc_tscore total_mean, fe i(practice) vce(cluster practice)
note: total_mean omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      1,705
Group variable: practice                        Number of groups  =         28

R-sq:                                           Obs per group:
     within  =      .                                         min =          4
     between = 0.2945                                         avg =       60.9
     overall =      .                                         max =        134

                                                F(0,27)           =          .
corr(u_i, Xb)  =      .                         Prob > F          =          .

                              (Std. Err. adjusted for 28 clusters in practice)
------------------------------------------------------------------------------
             |               Robust
  soc_tscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  total_mean |          0  (omitted)
       _cons |    48.6278   5.85e-10  8.3e+10   0.000      48.6278     48.6278
-------------+----------------------------------------------------------------
     sigma_u |  3.0260467
     sigma_e |  9.7712946
         rho |  .08751321   (fraction of variance due to u_i)
------------------------------------------------------------------------------

What gives?


David Benson

Join Date: Oct 2018

Posts: 489
#4

29 Nov 2018, 00:03

"The second version adds an indicator variable for each practice, which is also the clustering variable. I have often wondered if I am violating an assumption when I let the same characteristic serve as both a covariate and a clustering factor. Is there any bad side effect from doing this?"

No, this is very common. In fact, with ~75 observations per practice, if you *didn't* cluster your standard errors within a practice, I would ask you about it (as a reviewer or if you were presenting)

"I am surprised that the coefficient on the total PIP got bigger (0.05 to 0.09) and much more significant even while every one of the practices was significantly different from each other."

This is my guess now, given that I don't know how soc_tscore or total_mean is scored:
I'm actually not surprised, mainly because of the "everybody thinks they are above average" reasoning I gave in #2. One implication is that each practice is a poor judge of how good it actually is, so the total_mean self-evaluation has a relatively weak correlation with the functional status scores (soc_tscore). Hence it is significant only at about p = .09 in the 1st model.

If you plot a simple scatterplot (scatter soc_tscore total_mean) I suspect you will find a relatively weak correlation. (In fact, you might plot it in Excel, because it is really easy to get the regression equation and the r2 overlaid on the graph. You can do it in Stata, but it is harder. See here)
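For what it is worth, the overlay is not much work in Stata either. Here is a minimal sketch using simulated data in place of the real file; the variable names follow the thread, but every number is invented and the note() layout is just one possible choice:

```stata
* Simulated stand-in for the patient-level file (all values are invented)
clear
set seed 12345
set obs 28
generate practice = _n
generate total_mean = 40 + 30*runiform()      // one PIP mean per practice
expand 60                                     // 60 patients per practice
generate soc_tscore = 45 + 0.05*total_mean + rnormal(0, 10)

* Fit the line first so the equation and R-squared can be shown in a note
regress soc_tscore total_mean
local b0 : display %5.2f _b[_cons]
local b1 : display %5.3f _b[total_mean]
local r2 : display %5.3f e(r2)

* Scatter plot with the fitted line overlaid
twoway (scatter soc_tscore total_mean, msize(tiny)) ///
       (lfit soc_tscore total_mean),                ///
       note("soc_tscore = `b0' + `b1'*total_mean, R-squared = `r2'")
```

Because total_mean takes a single value per practice, the points will line up in vertical strips, one strip per practice.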

In #2, once you take into account the (significant) practice-specific effects, the total_mean score becomes a far better predictor (i.e. almost double in magnitude, and the std errors are much, much smaller). I would also plot your scatter plot with just practice==3492 & 3593 (one of the best and one of the worst) and use the by(practice) so you can see the difference.

For model #3, I don't know why I didn't think of this, but since you don't really have panel data, when you ran "xtreg, fe" the total_mean was wiped out because it does not vary within a practice (the fixed-effects estimator uses only within-practice variation).
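A quick way to see why -xtreg, fe- drops total_mean is to confirm that it never varies inside a practice. The same fact is worth keeping in mind for model #2: a variable that is constant within practice is an exact linear combination of the i.practice indicators, so -regress- has to omit one extra indicator in order to report it, and the total_mean coefficient may then reflect which indicators happen to be left out rather than a separately identified effect. The sketch below uses simulated data (values invented, names borrowed from the thread) and fits the same model with two different base categories:

```stata
* Simulated stand-in data (values invented; names follow the thread)
clear
set seed 2018
set obs 6
generate practice = _n
generate total_mean = 50 + 10*rnormal()     // one value per practice
expand 30                                   // 30 patients per practice
generate soc_tscore = 45 + rnormal(0, 10)   // unrelated to total_mean by construction

* total_mean has no spread inside any practice
bysort practice: egen within_sd = sd(total_mean)
summarize within_sd
scalar max_within_sd = r(max)

* Same model, two different base categories for the practice indicators
regress soc_tscore total_mean i.practice
scalar b_base1  = _b[total_mean]
scalar r2_base1 = e(r2)

regress soc_tscore total_mean ib3.practice
scalar b_base3  = _b[total_mean]
scalar r2_base3 = e(r2)

* Compare overall fit and the reported total_mean coefficient
display "R-squared:     " r2_base1 " vs " r2_base3
display "b[total_mean]: " b_base1  " vs " b_base3
```

If the two runs show the same R-squared but different total_mean coefficients, that would suggest the coefficient in the i.practice model should not be read as an estimate of the PIP effect.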

Three other quick questions for you:
1) How is soc_tscore scored? (And what is the range that shows up in your data?)

2) Up top you indicated that you had data from 40 different practices, but in your models you have 28 clusters, is that right? (You may have only elected to share a regression with a subset of your data and that is fine)

3) Also, I may have mis-interpreted you. Somehow I was thinking you had 4 respondents rating each practice, but that they only had 1-2 questions of interest rating the practice (i.e. "overall, how would you rate the quality of the practice"). But if each of the 4 answered 30 5-point Likert questions, are you taking the mean of all 4 * 30 questions? Or, have you selected the questions that are most relevant? If you have that much data, it might be useful to include some measure of rater variance.

Anyway, hope that helps!
--David
Ben Littenberg

Join Date: Apr 2014

Posts: 13
#5

29 Nov 2018, 07:20

1) soc_tscore is a weighted sum of 4 Likert items for each respondent/patient.
2) So far, I have data from only 28 clusters - I am still optimistic about the others coming through...
3) Each staff respondent answers the 30 questions which produce a total for the response. total_mean is the average of the ~4 total within each practice. How would I go about including a measure of rater variance at the staff within clinic level when the records are all at the patient within clinic level? This is the question that really motivated my query. Thanks for helping me get to clarity!
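One common way to get staff-level summaries onto patient-level records is to collapse the staff file to one row per practice and then attach it with -merge m:1-. A minimal sketch, assuming the staff and patient data sit in two separate datasets keyed by practice; the toy values and the names staff_total, total_sd, and practice_pip are made up for illustration:

```stata
* Toy staff-level file: one row per staff respondent (values invented)
clear
input int practice float staff_total
1356  98
1356 110
1356 104
1356 120
1357  75
1357  90
1357  81
1357  86
end

* One row per practice: mean and SD of the staff totals
collapse (mean) total_mean = staff_total (sd) total_sd = staff_total, by(practice)
tempfile practice_pip
save `practice_pip'

* Toy patient-level file: one row per patient (values invented)
clear
input int practice float soc_tscore
1356 48.2
1356 51.0
1356 39.5
1357 55.1
1357 44.7
end

* Many patients to one practice row; stop if any practice fails to match
merge m:1 practice using `practice_pip', assert(match) nogenerate
list, sepby(practice) noobs
```

After the merge, every patient row carries its practice's total_mean and total_sd, so both can be used as practice-level predictors in a patient-level model.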

David Benson

Join Date: Oct 2018
Posts: 489

11 Dec 2018, 20:14

Ben Littenberg

So I'm not quite sure of the best way to handle this.

I don’t do surveys, but in briefly looking at it, it looks like most survey-based research with multiple raters mainly wants to show that inter-rater reliability is high, and so uses Kappa or survey measures to show that that is the case.

So, articles discussing various ways to measure inter-rater reliability:

Multiple Raters in Survey-Based Operations Management Research: A Review and Tutorial gives a nice overview of the various measures, as well as describing their strengths & weaknesses
This article does as well (in a medical setting)
This set of slides gives a nice overview
This Statalist topic discusses Fleiss kappa or ICC for interrater agreement (and the appropriate Stata command)

You actually want to do something else. My hypothesis would be that the practice_score would be less predictive as inter-rater-variance increased. Which I think would mean an interaction effect between total_mean and inter-rater-variance (as opposed to just adding inter-rater-variance as a control). (Hopefully someone else here can chime in or you could talk to others who do survey research). So I would expect that the coefficient on total_mean would become smaller, and the std errors would become larger once the interaction term has been added.

To implement this in Stata:

Code:

* I created some toy data for the ratings (I only did 5 questions, you've got 30)
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(practice rater) float score byte(q1 q2 q3 q4 q5)
1 1 4.4 4 5 5 4 4
1 2 2.4 2 3 2 2 3
1 3 3.6 5 3 5 4 1
1 4 4.4 4 4 5 4 5
2 1   3 2 1 3 4 5
2 2   3 3 2 4 5 1
2 3   3 3 5 1 2 4
2 4   3 1 2 3 5 4
3 1 4.8 5 5 5 5 4
3 2 2.4 3 2 2 3 2
3 3 2.2 2 2 2 2 3
3 4 2.6 2 2 2 3 4
4 1 2.2 3 3 2 2 1
4 2 4.4 4 4 4 5 5
4 3   3 3 4 1 2 5
4 4 3.2 5 2 3 5 1
end

Code:

* Creating 3 measures of rater spread
bysort practice: egen score_avg = mean(score)
bysort practice: egen score_sd = sd(score)
bysort practice: egen score_iqr = iqr(score)

* Just labeling the variables
label var score_avg "Avg of 4 rater scores per practice"
label var score_iqr "IQR of 4 rater scores per practice (p75 - p25)"
label var score_sd  "SD of 4 rater scores per practice"

* NOTE: score is the avg of the rater's score on q1 to q5 (same as "egen rater_avg = rowmean(q1-q5)")
* Also note: practice #2 & #3 both have avg==3. But in practice #2, all raters gave it a 3 (so sd==0), whereas prac#3 varied from 4.8 down to 2.2
. list, sepby(practice) noobs abbrev(12)

  +--------------------------------------------------------------------------------------+
  | practice   rater   score   q1   q2   q3   q4   q5   score_avg   score_sd   score_iqr |
  |--------------------------------------------------------------------------------------|
  |        1       1     4.4    4    5    5    4    4         3.7   .9451631         1.4 |
  |        1       2     2.4    2    3    2    2    3         3.7   .9451631         1.4 |
  |        1       3     3.6    5    3    5    4    1         3.7   .9451631         1.4 |
  |        1       4     4.4    4    4    5    4    5         3.7   .9451631         1.4 |
  |--------------------------------------------------------------------------------------|
  |        2       1       3    2    1    3    4    5           3          0           0 |
  |        2       2       3    3    2    4    5    1           3          0           0 |
  |        2       3       3    3    5    1    2    4           3          0           0 |
  |        2       4       3    1    2    3    5    4           3          0           0 |
  |--------------------------------------------------------------------------------------|
  |        3       1     4.8    5    5    5    5    4           3    1.21106         1.4 |
  |        3       2     2.4    3    2    2    3    2           3    1.21106         1.4 |
  |        3       3     2.2    2    2    2    2    3           3    1.21106         1.4 |
  |        3       4     2.6    2    2    2    3    4           3    1.21106         1.4 |
  |--------------------------------------------------------------------------------------|
  |        4       1     2.2    3    3    2    2    1         3.2   .9092121         1.2 |
  |        4       2     4.4    4    4    4    5    5         3.2   .9092121         1.2 |
  |        4       3       3    3    4    1    2    5         3.2   .9092121         1.2 |
  |        4       4     3.2    5    2    3    5    1         3.2   .9092121         1.2 |
  +--------------------------------------------------------------------------------------+


. tabstat score, stats(mean p25 median p75 iqr sd var min max) by( practice)

Summary for variables: score
     by categories of: practice (Practice)

practice |      mean       p25       p50       p75       iqr        sd  variance       min       max
---------+------------------------------------------------------------------------------------------
       1 |       3.7         3         4       4.4       1.4  .9451631  .8933333       2.4       4.4
       2 |         3         3         3         3         0         0         0         3         3
       3 |         3       2.3       2.5       3.7       1.4   1.21106  1.466667       2.2       4.8
       4 |       3.2       2.6       3.1       3.8       1.2  .9092121  .8266667       2.2       4.4
---------+------------------------------------------------------------------------------------------
   Total |     3.225       2.5         3         4       1.5  .8512736  .7246667       2.2       4.8
----------------------------------------------------------------------------------------------------


* You could then collapse it down and include score_sd or score_iqr in your regressions
* It probably wouldn't matter which because the two will be so highly correlated
. corr score_sd score_iqr
(obs=16)

             | score_sd score_~r
-------------+------------------
    score_sd |   1.0000
   score_iqr |   0.9786   1.0000



* To collapse it down to 1 obs per practice (so you can merge in and use with your regressions)
* NOTE: *SAVE* your data before this because collapse DELETES data!
preserve
collapse (mean) practice_avg = score (sd) practice_sd = score (iqr) practice_iqr = score, by(practice)
* can restore if desired

. list, noobs abbrev(12)

  +------------------------------------------------------+
  | practice   practice_avg   practice_sd   practice_iqr |
  |------------------------------------------------------|
  |        1            3.7       .945163            1.4 |
  |        2              3             0              0 |
  |        3              3       1.21106            1.4 |
  |        4            3.2       .909212            1.2 |
  +------------------------------------------------------+

The regression code would look something like this:

Code:

* I've done it for practice_sd; could also do for practice_iqr
reg soc_tscore practice_avg i.practice, vce(cluster practice)
reg soc_tscore practice_avg practice_sd i.practice, vce(cluster practice)
reg soc_tscore c.practice_avg c.practice_sd c.practice_avg#c.practice_sd i.practice, vce(cluster practice)
* In Stata's factor-variable notation, c.practice_avg##c.practice_sd is the same as "c.practice_avg c.practice_sd c.practice_avg#c.practice_sd"
* See -help fvvarlist-


Ben Littenberg

Join Date: Apr 2014

Posts: 13
#7

13 Dec 2018, 13:02

q

Ben Littenberg

Join Date: Apr 2014
Posts: 13

13 Dec 2018, 13:11

David Benson Thanks again for your ideas. It seems to blow up when I add the interaction term.

Code:

. reg soc_tscore total_mean i.practice, vce(cluster practice) nofvlab
note: 3899.practice omitted because of collinearity

Linear regression                               Number of obs     =      1,806
                                                F(0, 27)          =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0454
                                                Root MSE          =     9.7995

                              (Std. Err. adjusted for 28 clusters in practice)
------------------------------------------------------------------------------
             |               Robust
  soc_tscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  total_mean |   .1040647   1.40e-14  7.4e+12   0.000     .1040647    .1040647
             |
    practice |
       1356  |  -3.678077   1.02e-13 -3.6e+13   0.000    -3.678077   -3.678077
       1357  |  -2.547678   3.50e-13 -7.3e+12   0.000    -2.547678   -2.547678
       1358  |  -2.578228   4.25e-13 -6.1e+12   0.000    -2.578228   -2.578228
       1359  |  -1.746636   5.80e-14 -3.0e+13   0.000    -1.746636   -1.746636
       1360  |   -.450906   1.81e-13 -2.5e+12   0.000     -.450906    -.450906
       1866  |  -2.060447   3.49e-13 -5.9e+12   0.000    -2.060447   -2.060447
       1867  |  -5.377343   5.56e-13 -9.7e+12   0.000    -5.377343   -5.377343
       1868  |  -4.897112   2.67e-13 -1.8e+13   0.000    -4.897112   -4.897112
       1869  |  -2.761332   4.22e-13 -6.5e+12   0.000    -2.761332   -2.761332
       2074  |     .16633   1.21e-13  1.4e+12   0.000       .16633      .16633
       2175  |  -1.345215   5.32e-13 -2.5e+12   0.000    -1.345215   -1.345215
       2176  |    1.40794   2.23e-13  6.3e+12   0.000      1.40794     1.40794
       2177  |    3.72534   2.01e-13  1.9e+13   0.000      3.72534     3.72534
       2178  |   .5887131   2.50e-13  2.4e+12   0.000     .5887131    .5887131
       2179  |   .6853404   8.64e-14  7.9e+12   0.000     .6853404    .6853404
       2280  |  -1.978135   3.42e-13 -5.8e+12   0.000    -1.978135   -1.978135
       2482  |  -1.206401   4.56e-14 -2.6e+13   0.000    -1.206401   -1.206401
       2684  |  -5.038409   3.08e-13 -1.6e+13   0.000    -5.038409   -5.038409
       2987  |  -1.136636   7.63e-14 -1.5e+13   0.000    -1.136636   -1.136636
       3290  |   1.456408   1.56e-13  9.3e+12   0.000     1.456408    1.456408
       3391  |  -4.107456   2.28e-13 -1.8e+13   0.000    -4.107456   -4.107456
       3492  |  -.0428019   5.37e-13 -8.0e+10   0.000    -.0428019   -.0428019
       3593  |  -7.159404   3.24e-13 -2.2e+13   0.000    -7.159404   -7.159404
       3896  |   -3.37344   1.79e-13 -1.9e+13   0.000     -3.37344    -3.37344
       3897  |  -2.434986   4.33e-14 -5.6e+13   0.000    -2.434986   -2.434986
       3898  |  -3.821044   9.03e-14 -4.2e+13   0.000    -3.821044   -3.821044
       3899  |          0  (omitted)
             |
       _cons |   44.79706   6.03e-13  7.4e+13   0.000     44.79706    44.79706
------------------------------------------------------------------------------

. reg soc_tscore total_mean total_sd i.practice, vce(cluster practice) nofvlab
note: 3898.practice omitted because of collinearity
note: 3899.practice omitted because of collinearity

Linear regression                               Number of obs     =      1,806
                                                F(0, 27)          =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0454
                                                Root MSE          =     9.7995

                              (Std. Err. adjusted for 28 clusters in practice)
------------------------------------------------------------------------------
             |               Robust
  soc_tscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  total_mean |  -.2425313   4.42e-14 -5.5e+12   0.000    -.2425313   -.2425313
    total_sd |  -.6160847   3.73e-14 -1.7e+13   0.000    -.6160847   -.6160847
             |
    practice |
       1356  |    6.53514   7.04e-13  9.3e+12   0.000      6.53514     6.53514
       1357  |   12.09988   1.39e-12  8.7e+12   0.000     12.09988    12.09988
       1358  |   9.044391   1.33e-12  6.8e+12   0.000     9.044391    9.044391
       1359  |   7.556618   4.19e-13  1.8e+13   0.000     7.556618    7.556618
       1360  |   12.26937   9.90e-13  1.2e+13   0.000     12.26937    12.26937
       1866  |   9.023379   1.18e-12  7.7e+12   0.000     9.023379    9.023379
       1867  |    12.9634   1.96e-12  6.6e+12   0.000      12.9634     12.9634
       1868  |  -.9025613   6.17e-13 -1.5e+12   0.000    -.9025613   -.9025613
       1869  |   8.284219   1.30e-12  6.4e+12   0.000     8.284219    8.284219
       2074  |   7.760429   5.77e-13  1.3e+13   0.000     7.760429    7.760429
       2175  |    19.1465   2.04e-12  9.4e+12   0.000      19.1465     19.1465
       2176  |   13.35044   1.01e-12  1.3e+13   0.000     13.35044    13.35044
       2177  |   16.30195   1.01e-12  1.6e+13   0.000     16.30195    16.30195
       2178  |   9.530781   8.82e-13  1.1e+13   0.000     9.530781    9.530781
       2179  |   1.458946   1.59e-13  9.2e+12   0.000     1.458946    1.458946
       2280  |   9.110557   1.16e-12  7.8e+12   0.000     9.110557    9.110557
       2482  |   .1296931   7.13e-14  1.8e+12   0.000     .1296931    .1296931
       2684  |   11.60157   1.44e-12  8.1e+12   0.000     11.60157    11.60157
       2987  |   4.248536   3.66e-13  1.2e+13   0.000     4.248536    4.248536
       3290  |    6.00781   4.59e-13  1.3e+13   0.000      6.00781     6.00781
       3391  |   7.871955   1.02e-12  7.7e+12   0.000     7.871955    7.871955
       3492  |   13.17939   1.62e-12  8.1e+12   0.000     13.17939    13.17939
       3593  |   10.93666   1.55e-12  7.1e+12   0.000     10.93666    10.93666
       3896  |   5.183875   1.78e-13  2.9e+13   0.000     5.183875    5.183875
       3897  |   2.631541   2.74e-13  9.6e+12   0.000     2.631541    2.631541
       3898  |          0  (omitted)
       3899  |          0  (omitted)
             |
       _cons |    63.5131   2.20e-12  2.9e+13   0.000      63.5131     63.5131
------------------------------------------------------------------------------

. reg soc_tscore c.total_mea##c.total_sd i.practice, vce(cluster practice) nofvlab
note: c.total_mean#c.total_sd omitted because of collinearity
note: 3898.practice omitted because of collinearity
note: 3899.practice omitted because of collinearity

Linear regression                               Number of obs     =      1,806
                                                F(0, 27)          =          .
                                                Prob > F          =          .
                                                R-squared         =     0.0454
                                                Root MSE          =     9.7995

                                         (Std. Err. adjusted for 28 clusters in practice)
-----------------------------------------------------------------------------------------
                        |               Robust
             soc_tscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
             total_mean |  -.2425313   4.42e-14 -5.5e+12   0.000    -.2425313   -.2425313
               total_sd |  -.6160847   3.73e-14 -1.7e+13   0.000    -.6160847   -.6160847
                        |
c.total_mean#c.total_sd |          0  (omitted)
                        |
               practice |
                  1356  |    6.53514   7.04e-13  9.3e+12   0.000      6.53514     6.53514
                  1357  |   12.09988   1.39e-12  8.7e+12   0.000     12.09988    12.09988
                  1358  |   9.044391   1.33e-12  6.8e+12   0.000     9.044391    9.044391
                  1359  |   7.556618   4.19e-13  1.8e+13   0.000     7.556618    7.556618
                  1360  |   12.26937   9.90e-13  1.2e+13   0.000     12.26937    12.26937
                  1866  |   9.023379   1.18e-12  7.7e+12   0.000     9.023379    9.023379
                  1867  |    12.9634   1.96e-12  6.6e+12   0.000      12.9634     12.9634
                  1868  |  -.9025613   6.17e-13 -1.5e+12   0.000    -.9025613   -.9025613
                  1869  |   8.284219   1.30e-12  6.4e+12   0.000     8.284219    8.284219
                  2074  |   7.760429   5.77e-13  1.3e+13   0.000     7.760429    7.760429
                  2175  |    19.1465   2.04e-12  9.4e+12   0.000      19.1465     19.1465
                  2176  |   13.35044   1.01e-12  1.3e+13   0.000     13.35044    13.35044
                  2177  |   16.30195   1.01e-12  1.6e+13   0.000     16.30195    16.30195
                  2178  |   9.530781   8.82e-13  1.1e+13   0.000     9.530781    9.530781
                  2179  |   1.458946   1.59e-13  9.2e+12   0.000     1.458946    1.458946
                  2280  |   9.110557   1.16e-12  7.8e+12   0.000     9.110557    9.110557
                  2482  |   .1296931   7.13e-14  1.8e+12   0.000     .1296931    .1296931
                  2684  |   11.60157   1.44e-12  8.1e+12   0.000     11.60157    11.60157
                  2987  |   4.248536   3.66e-13  1.2e+13   0.000     4.248536    4.248536
                  3290  |    6.00781   4.59e-13  1.3e+13   0.000      6.00781     6.00781
                  3391  |   7.871955   1.02e-12  7.7e+12   0.000     7.871955    7.871955
                  3492  |   13.17939   1.62e-12  8.1e+12   0.000     13.17939    13.17939
                  3593  |   10.93666   1.55e-12  7.1e+12   0.000     10.93666    10.93666
                  3896  |   5.183875   1.78e-13  2.9e+13   0.000     5.183875    5.183875
                  3897  |   2.631541   2.74e-13  9.6e+12   0.000     2.631541    2.631541
                  3898  |          0  (omitted)
                  3899  |          0  (omitted)
                        |
                  _cons |    63.5131   2.20e-12  2.9e+13   0.000      63.5131     63.5131
----------------------------------------------------------------------------------------

Same thing happens if I use the range instead of the sd. (With only 4 responses per practice, the IQR seems a bit silly.)

The tiny standard errors seem unreasonable to me.

What does the failure to calculate the interaction term mean?
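One reading of the omitted interaction: total_mean, total_sd, and their product are all constant within practice, so each is an exact linear combination of the i.practice indicators, and -regress- has nothing left to estimate them from. The near-zero standard errors appear to be the same symptom, since once the indicators absorb each practice's mean, the within-practice residual sums that vce(cluster practice) relies on are zero up to rounding. Below is a hedged sketch of two alternatives that leave the practice indicators out: cluster-robust OLS, and a random-intercept model via -mixed- (the hierarchical specification asked about in #1). The data are simulated and the effect sizes are invented, so only the pattern of what is and is not estimable carries over:

```stata
* Simulated stand-in data (values and effect sizes are invented)
clear
set seed 4321
set obs 28
generate practice   = _n
generate total_mean = 100 + 15*rnormal()     // practice-level mean of staff totals
generate total_sd   = 5 + 10*runiform()      // practice-level SD of staff totals
generate u          = rnormal(0, 3)          // practice-level random intercept
expand 60                                    // 60 patients per practice
generate soc_tscore = 45 + 0.05*total_mean - 0.1*total_sd + u + rnormal(0, 10)

* With practice indicators: practice-level terms are collinear with them,
* so expect "omitted because of collinearity" notes
regress soc_tscore c.total_mean##c.total_sd i.practice, vce(cluster practice)

* Without practice indicators: the interaction can be estimated
regress soc_tscore c.total_mean##c.total_sd, vce(cluster practice)
scalar se_ols = _se[c.total_mean#c.total_sd]

* Random intercept for practice instead of one indicator per practice
mixed soc_tscore c.total_mean##c.total_sd || practice:
```

The trade-off is that both alternatives rely on between-practice variation, which is the only variation a practice-level predictor has, so with 28 practices the estimates will be imprecise however the model is written.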
