  • Regression framework to test for regional differences relative to a global mean

    Dear all,
    I'm trying to find a simple way to test for regional differences relative to a global mean in a weighted least-squares regression. I could run a long series of t-tests manually, but I have 50 different variables and 9 regions to cover, so I'm looking for a more efficient approach.

    I added the global value as a single observation in the dataset and labeled it as another region (reg_10). When I make this the reference group, the coefficients become deviations from the global mean, which is useful for interpretation, but the standard errors are wildly inflated, so this doesn't help with the hypothesis I'm trying to test.

    Code:
     regress cohd reg_1 reg_2 reg_3 reg_4 reg_5 reg_6 reg_7 reg_8 reg_9 [aw = totalpop] 
    (sum of wgt is 7,458,472,316)
    
          Source |       SS           df       MS      Number of obs   =       156
    -------------+----------------------------------   F(9, 146)       =      5.04
           Model |  14.5924218         9   1.6213802   Prob > F        =    0.0000
        Residual |  46.9408614       146   .32151275   R-squared       =    0.2371
    -------------+----------------------------------   Adj R-squared   =    0.1901
           Total |  61.5332832       155  .396988924   Root MSE        =    .56702
    
    ------------------------------------------------------------------------------
            cohd | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           reg_1 |  -.3439758   3920.682    -0.00   1.000    -7748.967    7748.279
           reg_2 |  -.0443622   3920.682    -0.00   1.000    -7748.667    7748.579
           reg_3 |  -.0408434   3920.682    -0.00   1.000    -7748.664    7748.582
           reg_4 |   .1017258   3920.682     0.00   1.000    -7748.521    7748.725
           reg_5 |  -.1633197   3920.682    -0.00   1.000    -7748.786     7748.46
           reg_6 |  -.7005311   3920.682    -0.00   1.000    -7749.324    7747.922
           reg_7 |    .910758   3920.682     0.00   1.000    -7747.712    7749.534
           reg_8 |  -.2052761   3920.682    -0.00   1.000    -7748.828    7748.418
           reg_9 |   .0348995   3920.682     0.00   1.000    -7748.588    7748.658
           _cons |   3.326115   3920.682     0.00   0.999    -7745.297    7751.949
    ------------------------------------------------------------------------------
    If I drop the global value from the dataset, I have to choose one of the 9 regions as the reference group, and then my hypothesis tests are relative to that region, not the global mean. Ideally, I'd like to set this up so that the constant is the global mean, each region's coefficient is its deviation from the global mean, and the test on each coefficient is a test of the difference between the regional mean and the global mean.

    The data have 50 outcome variables (cohd is the one in the example above) and 9 regions, and the aggregation to the global level needs to be weighted (in this example, by population, totalpop).

    Any ideas on how to set this up? Note also that the variance is unequal across regions, due in part to the unequal number of observations per region.

    Thanks so much,
    Kate
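
    One possible setup for the question above, offered as a minimal sketch and not a definitive answer: fit a cell-means model (all nine regions, no constant), then express the global mean as a population-share-weighted sum of the regional coefficients and test each region's deviation from it with lincom. The sketch assumes the added global-value row (reg_10) has been dropped, that every remaining observation belongs to exactly one of reg_1 to reg_9, and that the population shares can be treated as fixed constants. The variable region and the locals total, s1 to s9, gm, and dev are hypothetical names introduced here for illustration.

    ```stata
    * Sketch only: assumes the reg_10 (global value) row has been dropped.
    * Collapse the nine dummies into one categorical variable (hypothetical name).
    generate byte region = .
    forvalues r = 1/9 {
        replace region = `r' if reg_`r' == 1
    }

    * Population share of each region among observations with nonmissing cohd,
    * stored as fixed-format numbers so they can be pasted into lincom.
    summarize totalpop if !missing(cohd), meanonly
    local total = r(sum)
    forvalues r = 1/9 {
        summarize totalpop if region == `r' & !missing(cohd), meanonly
        local s`r' : display %12.10f r(sum)/`total'
    }

    * Cell-means model: with no constant, each coefficient is that region's
    * population-weighted mean. Robust standard errors allow the variance
    * to differ across regions.
    regress cohd ibn.region [aw = totalpop], noconstant vce(robust)

    * Global mean = share-weighted sum of the regional means.
    local gm "`s1'*1.region"
    forvalues r = 2/9 {
        local gm "`gm' + `s`r''*`r'.region"
    }
    lincom `gm'

    * Deviation of each region from the global mean, with its own t test.
    forvalues r = 1/9 {
        local dev "`r'.region"
        forvalues k = 1/9 {
            local dev "`dev' - `s`k''*`k'.region"
        }
        lincom `dev'
    }
    ```

    Because each lincom uses the full coefficient covariance matrix, the test accounts for the fact that a region's own observations also contribute to the global mean. To cover all 50 variables, the share calculation, regression, and lincom loop could be wrapped in a foreach loop over the outcome variables. With 9 tests per variable, some adjustment for multiple comparisons may be worth considering, and the robust standard errors may be unreliable for regions with very few observations.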