Dear all,
I'm trying to find a simple way to test for regional differences relative to a global mean in a weighted least-squares regression. I could run a long series of t-tests manually, but I have 50 different variables and 9 regions to cover, so I'm looking for a more efficient approach.
I have added the global value as a single observation in the dataset and labeled it as another region (reg_10). When I make this the reference group, the coefficients become deviations from the global mean, which is useful for interpretation, but the standard errors are enormous, so this doesn't help with the hypothesis I'm trying to test.
If I drop the global value from the dataset, I have to choose one of the 9 regions as the reference group, and my hypothesis tests are then relative to that region rather than to the global mean. Ideally, I'd like to set this up so that the constant is the global mean, each region's coefficient is its deviation from the global mean, and the test on each coefficient is a test of the difference between that regional mean and the global mean.
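In symbols, what I'm after is roughly the following. This is only a sketch in my own notation, not taken from any reference: $y_{ir}$ is the outcome (e.g. cohd) for observation $i$ in region $r$, $w_{ir}$ is its weight (totalpop), and $W_r$ is the total weight in region $r$. I'm assuming the regional effects are constrained to have a weighted sum of zero, which is what makes the constant the weighted global mean.

```latex
\begin{aligned}
y_{ir} &= \mu + \alpha_r + \varepsilon_{ir}, \qquad r = 1,\dots,9, \\
\sum_{r=1}^{9} W_r\,\alpha_r &= 0, \qquad W_r = \sum_{i \in r} w_{ir}, \\
\mu &= \frac{\sum_{r}\sum_{i \in r} w_{ir}\, y_{ir}}{\sum_{r} W_r}, \qquad
\alpha_r = \bar{y}_r - \mu, \qquad
\bar{y}_r = \frac{\sum_{i \in r} w_{ir}\, y_{ir}}{W_r}, \\
H_0 &: \alpha_r = 0 \quad \text{tested separately for each region } r .
\end{aligned}
```

With this constraint, $\mu$ is the population-weighted mean over all observations, so no artificial "global" observation would be needed.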
Code:
regress cohd reg_1 reg_2 reg_3 reg_4 reg_5 reg_6 reg_7 reg_8 reg_9 [aw = totalpop]
(sum of wgt is 7,458,472,316)

      Source |       SS           df       MS      Number of obs   =       156
-------------+----------------------------------   F(9, 146)       =      5.04
       Model |  14.5924218         9   1.6213802   Prob > F        =    0.0000
    Residual |  46.9408614       146   .32151275   R-squared       =    0.2371
-------------+----------------------------------   Adj R-squared   =    0.1901
       Total |  61.5332832       155  .396988924   Root MSE        =    .56702

------------------------------------------------------------------------------
        cohd | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       reg_1 |  -.3439758   3920.682    -0.00   1.000    -7748.967    7748.279
       reg_2 |  -.0443622   3920.682    -0.00   1.000    -7748.667    7748.579
       reg_3 |  -.0408434   3920.682    -0.00   1.000    -7748.664    7748.582
       reg_4 |   .1017258   3920.682     0.00   1.000    -7748.521    7748.725
       reg_5 |  -.1633197   3920.682    -0.00   1.000    -7748.786     7748.46
       reg_6 |  -.7005311   3920.682    -0.00   1.000    -7749.324    7747.922
       reg_7 |    .910758   3920.682     0.00   1.000    -7747.712    7749.534
       reg_8 |  -.2052761   3920.682    -0.00   1.000    -7748.828    7748.418
       reg_9 |   .0348995   3920.682     0.00   1.000    -7748.588    7748.658
       _cons |   3.326115   3920.682     0.00   0.999    -7745.297    7751.949
------------------------------------------------------------------------------
The data have 50 variables (cohd in the example above), 9 regions, and the aggregation at the global level needs to be weighted (in this example, by the population - totalpop).
Any ideas how to set this up? Also, there is unequal variance across regions, due in part to the unequal number of observations per region.
Thanks so much,
Kate