I am using Stata 14.2. I have survey data and want to test whether the cell proportions in a crosstab of two ordinal variables differ significantly from the corresponding population proportions. My two variables are age category (25-29, 30-34, 35-39, and 40-44) and number of children (0, 1, 2, 3, 4, 5-6, and 7 or more). I also want to incorporate the survey weights into the analysis.
My first search brought me to the csgof command, which allows me to specify my expected values, but it does not do the test on a crosstab of two variables, just proportions of one variable.
I ended up running svy: tab on the age categories and number of kids, which showed me the weighted frequencies within each cell. Then I used the chisqi command to compare the population frequencies with the weighted sample frequencies. However, I had to do this separately for each age category. I know that running a separate test for each age group is less appropriate than a single test of whether the entire distribution differs significantly from the population. Is there a simpler way to do this?
Here is the code that I used:
Code:
svy: tab AGECAT BIOKIDCAT, row obs

*create locals for each age group that shows the frequency of each no. kids category
*population
local NSFBpopfreq2529 4561 2290 1940 850 280 70 10
local NSFBpopfreq3034 2537 2103 2826 1475 511 183 28
local NSFBpopfreq3539 1975 1839 3699 1933 679 261 52
local NSFBpopfreq4044 2291 1898 3864 2078 719 337 56
*sample
local NSFBsamfreq2529 384 263 233 109 21 10 1
local NSFBsamfreq3034 287 266 337 185 60 25 4
local NSFBsamfreq3539 247 234 352 205 89 34 5
local NSFBsamfreq4044 265 215 427 199 81 42 7

*create local for the labels
local chisqlabels Zero One Two Three Four FiveSix SevenPlus

*do a separate chi square test for each age group to compare the distribution of the sample to the population
chisqi `NSFBpopfreq2529' \ `NSFBsamfreq2529', labels(`chisqlabels') nst(25-29)
chisqi `NSFBpopfreq3034' \ `NSFBsamfreq3034', labels(`chisqlabels') nst(30-34)
chisqi `NSFBpopfreq3539' \ `NSFBsamfreq3539', labels(`chisqlabels') nst(35-39)
chisqi `NSFBpopfreq4044' \ `NSFBsamfreq4044', labels(`chisqlabels') nst(40-44)