Recently I ran some simple bivariate tables using tabulate via the tab2 wrapper. I wanted to show the missing data values in the tables but compute the chi-square statistics based only on the valid entries. There doesn't seem to be an easy way to do that. If you specify the missing option along with chi2, Stata computes the statistic based on the whole table, with the missing category counted as an extra row. The example output further down shows the difference.
I figured there had to be a routine that would do this, so I went searching through Stata's rather confusing set of tabulation routines -- table, tabulate, tabdisp, tabstat, even epitab. Each has its own list of options but, so far as I can tell, none will let you do what Stata's main competitors do -- print a table showing missing values while computing statistics only on valid values. I looked through a few of the many user-written routines for tables and I don't see any that do this either.
It was easy enough to work around the problem. I produced the table with the missing option, reran it quietly without the missing option but with chi2, and then used the display command to print the statistic. I was surprised to see that tabulate returns chi-square and the p-value but not the df. Have I missed something here? It hardly seems worth trying to get this on a wishlist for Stata 16, but on the other hand, particularly for a new user, Stata's table routines are, in my opinion, confusing and unnecessarily difficult to navigate.
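For anyone who wants to reproduce it, here is a minimal sketch of that workaround. It assumes the data are Stata's shipped auto dataset (the rep78 and foreign variables suggest this, but the sysuse line is my assumption), and it derives the df from the stored row and column counts r(r) and r(c), since the df itself is not among the stored results.

```stata
* Assumption: the example uses the auto dataset shipped with Stata
sysuse auto, clear

* Step 1: show the table with the missing category, no statistics
tabulate rep78 foreign, missing

* Step 2: rerun quietly on valid values only, asking for chi-square
quietly tabulate rep78 foreign, chi2

* Step 3: tabulate leaves r(chi2), r(p), r(r) and r(c) behind;
* compute the df from the table dimensions of the valid-only run
local df = (r(r) - 1) * (r(c) - 1)

* Step 4: print the statistic in roughly the same layout tabulate uses
display as text "Pearson chi2(" `df' ") = " as result %8.4f r(chi2) ///
    as text "   Pr = " as result %5.3f r(p)
```

Because the quiet rerun drops the missing row, the df here comes from the 5 x 2 valid table, so the printed line should match the chi2(4) result in the first table of the example output rather than the chi2(5) result in the second.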
Code:
. tabulate rep78 foreign, chi2

    Repair |
    Record |       Car type
      1978 |  Domestic    Foreign |     Total
-----------+----------------------+----------
         1 |         2          0 |         2
         2 |         8          0 |         8
         3 |        27          3 |        30
         4 |         9          9 |        18
         5 |         2          9 |        11
-----------+----------------------+----------
     Total |        48         21 |        69

          Pearson chi2(4) =  27.2640   Pr = 0.000

. tabulate rep78 foreign, chi2 miss

    Repair |
    Record |       Car type
      1978 |  Domestic    Foreign |     Total
-----------+----------------------+----------
         1 |         2          0 |         2
         2 |         8          0 |         8
         3 |        27          3 |        30
         4 |         9          9 |        18
         5 |         2          9 |        11
         . |         4          1 |         5
-----------+----------------------+----------
     Total |        52         22 |        74

          Pearson chi2(5) =  27.8735   Pr = 0.000