How to run chi-squared test on imputed data?

Joerg Eulenberger

Join Date: Aug 2015

Posts: 25
#1


10 Aug 2018, 08:57

Dear Statalisters,
I want to run a chi-squared test on two categorical variables in a multiply imputed dataset, along the lines of:
tab var1 var2, chi
Any ideas? Thanks a lot,

Jörg
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

14 Aug 2018, 10:36

You'll need to use mi estimate or, maybe, svy: mi estimate. Unfortunately, tabulate isn't a supported mi estimation command. I can't suggest an alternative because you haven't provided enough information. So run plain tabulate (or svy: tabulate, if you have survey data) and show us the results. Include commands and results between code delimiters: [CODE] and [/CODE]. These are described in FAQ 12, which I suggest you reread before doing anything else.

Last edited by Steve Samuels; 14 Aug 2018, 10:40.

Steve Samuels
Statistical Consulting
[email protected]

Stata 14.2
Joerg Eulenberger

Join Date: Aug 2015

Posts: 25
#3

20 Sep 2018, 01:17

Dear Steve Samuels,
Sorry for the incomplete information. I am using mi estimate in Stata 15.1.

Code:

webuse mhouses1993s30
mi xeq 5: tab nfeatures corner , chi

Now I want to do the same with the complete mi dataset. Maybe

Code:

mi estimate: mlogit nfeatures i.corner

is the solution. But it reports an F statistic rather than a chi-squared statistic.

Thanks a lot,
Jörg

Joerg Eulenberger

Join Date: Aug 2015
Posts: 25

20 Sep 2018, 01:55

The results:

Code:

. webuse mhouses1993s30
(Albuquerque Home Prices Feb15-Apr30, 1993)

. mi xeq 5: tab nfeatures corner , chi

m=5 data:
-> tab nfeatures corner , chi

 Number of |
   certain |    Corner location
  features |         0          1 |     Total
-----------+----------------------+----------
         0 |         2          0 |         2
         1 |         6          1 |         7
         2 |        12          4 |        16
         3 |        22          6 |        28
         4 |        36          7 |        43
         5 |         7          4 |        11
         6 |         8          0 |         8
         7 |         1          0 |         1
         8 |         1          0 |         1
-----------+----------------------+----------
     Total |        95         22 |       117

          Pearson chi2(8) =   5,8024   Pr = 0,669

. mi estimate: mlogit nfeatures  i.corner

Multiple-imputation estimates                   Imputations       =         30
Multinomial logistic regression                 Number of obs     =        117
                                                Average RVI       =     0,0000
                                                Largest FMI       =     0,0000
DF adjustment:   Large sample                   DF:     min       =   6,89e+62
                                                        avg       =   7,51e+63
                                                        max       =          .
Model F test:       Equal FMI                   F(   8,      .)   =       0,30
Within VCE type:          OIM                   Prob > F          =     0,9664

------------------------------------------------------------------------------
   nfeatures |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
0            |
    1.corner |  -14,70108   2497,006    -0,01   0,995    -4908,743    4879,341
       _cons |  -2,890307   ,7264662    -3,98   0,000    -4,314154   -1,466459
-------------+----------------------------------------------------------------
1            |
    1.corner |  -,1540905   1,156501    -0,13   0,894    -2,420791     2,11261
       _cons |  -1,791757   ,4409614    -4,06   0,000    -2,656025   -,9274884
-------------+----------------------------------------------------------------
2            |
    1.corner |   ,5390828   ,7099531     0,76   0,448    -,8523997    1,930565
       _cons |  -1,098605    ,333335    -3,30   0,001     -1,75193   -,4452808
-------------+----------------------------------------------------------------
3            |
    1.corner |   ,3383234   ,6187297     0,55   0,585    -,8743644    1,551011
       _cons |  -,4924763   ,2706167    -1,82   0,069    -1,022875    ,0379227
-------------+----------------------------------------------------------------
4            |  (base outcome)
-------------+----------------------------------------------------------------
5            |
    1.corner |   1,079032   ,7505277     1,44   0,151    -,3919757    2,550039
       _cons |  -1,637531   ,4130695    -3,96   0,000    -2,447132   -,8279298
-------------+----------------------------------------------------------------
6            |
    1.corner |  -14,70108   1248,503    -0,01   0,991    -2461,722     2432,32
       _cons |  -1,504012   ,3908605    -3,85   0,000    -2,270085   -,7379398
-------------+----------------------------------------------------------------
7            |
    1.corner |  -14,70108     3531,3    -0,00   0,997    -6935,922     6906,52
       _cons |  -3,583454   1,013769    -3,53   0,000    -5,570405   -1,596503
-------------+----------------------------------------------------------------
8            |
    1.corner |  -14,70108     3531,3    -0,00   0,997    -6935,922     6906,52
       _cons |  -3,583454   1,013769    -3,53   0,000    -5,570405   -1,596503
------------------------------------------------------------------------------


Steve Samuels

Join Date: Mar 2014

Posts: 1786
#5

20 Sep 2018, 15:43

Thanks for the results, Joerg. Although you got a chi-squared result for a single imputed dataset, that result is correct only if the variability introduced by the random imputation is ignored. Unfortunately, mi works only for estimation problems, and there are no "parameters" for the chi-squared test except the probabilities themselves. Hence the need for mi estimate: mlogit. The reason that mi estimate reports a t test (for single-coefficient problems) or an F test (for multiple-coefficient problems) is explained on page 26 of the documentation. As you can see, the theory for mi estimate with mi logit looks nothing like the theory for the ordinary Pearson chi-squared test.
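
For reference, the pooling behind mi estimate follows Rubin's combination rules, which is where the t and F reference distributions come from. The following is only a sketch for a single coefficient, in notation chosen here rather than taken from the Stata manual: M is the number of imputations, \hat{q}_m is the estimate from imputation m, and U_m is its estimated variance.

```latex
\begin{align*}
\bar{q} &= \frac{1}{M}\sum_{m=1}^{M}\hat{q}_m, &
\bar{U} &= \frac{1}{M}\sum_{m=1}^{M}U_m, &
B &= \frac{1}{M-1}\sum_{m=1}^{M}\left(\hat{q}_m-\bar{q}\right)^2, \\
T &= \bar{U} + \left(1+\frac{1}{M}\right)B, &
r &= \frac{\left(1+\frac{1}{M}\right)B}{\bar{U}}, &
\nu &= (M-1)\left(1+\frac{1}{r}\right)^{2}.
\end{align*}
```

Here T is the total variance, r is the relative variance increase (RVI), and \nu is the large-sample degrees of freedom for the t reference distribution of (\bar{q}-q)/\sqrt{T}. If the analysis variables are not themselves imputed, B is essentially zero, so r tends to 0 and \nu grows without bound. That would be consistent with the Average RVI of 0,0000 and the enormous DF values in the output above.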

The global F test for mi estimate can look weird because the degrees of freedom can get very large. What happens if you run this alternative, which tests the same hypothesis of no association?

Code:

test [0]1.corner= [1]1.corner =[2]1.corner = [3]1.corner = [4]1.corner = [5]1.corner = [6]1.corner = [7]1.corner = [8]1.corner

Steve Samuels

Join Date: Mar 2014

Posts: 1786
#6

20 Sep 2018, 19:15

Correction: for "mi logit", read "mlogit": As you can see, the theory for mi estimate with mlogit looks nothing like the theory for the ordinary Pearson chi-squared test.


Joerg Eulenberger

Join Date: Aug 2015
Posts: 25

21 Sep 2018, 01:14

Thank you Steve,
The test code does not work with mi-data.
This is the output:

Code:

. mi estimate: mlogit nfeatures  i.corner

Multiple-imputation estimates                   Imputations       =         30
Multinomial logistic regression                 Number of obs     =        117
                                                Average RVI       =     0,0000
                                                Largest FMI       =     0,0000
DF adjustment:   Large sample                   DF:     min       =   6,89e+62
                                                        avg       =   7,51e+63
                                                        max       =          .
Model F test:       Equal FMI                   F(   8,      .)   =       0,30
Within VCE type:          OIM                   Prob > F          =     0,9664

------------------------------------------------------------------------------
   nfeatures |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
0            |
    1.corner |  -14,70108   2497,006    -0,01   0,995    -4908,743    4879,341
       _cons |  -2,890307   ,7264662    -3,98   0,000    -4,314154   -1,466459
-------------+----------------------------------------------------------------
1            |
    1.corner |  -,1540905   1,156501    -0,13   0,894    -2,420791     2,11261
       _cons |  -1,791757   ,4409614    -4,06   0,000    -2,656025   -,9274884
-------------+----------------------------------------------------------------
2            |
    1.corner |   ,5390828   ,7099531     0,76   0,448    -,8523997    1,930565
       _cons |  -1,098605    ,333335    -3,30   0,001     -1,75193   -,4452808
-------------+----------------------------------------------------------------
3            |
    1.corner |   ,3383234   ,6187297     0,55   0,585    -,8743644    1,551011
       _cons |  -,4924763   ,2706167    -1,82   0,069    -1,022875    ,0379227
-------------+----------------------------------------------------------------
4            |  (base outcome)
-------------+----------------------------------------------------------------
5            |
    1.corner |   1,079032   ,7505277     1,44   0,151    -,3919757    2,550039
       _cons |  -1,637531   ,4130695    -3,96   0,000    -2,447132   -,8279298
-------------+----------------------------------------------------------------
6            |
    1.corner |  -14,70108   1248,503    -0,01   0,991    -2461,722     2432,32
       _cons |  -1,504012   ,3908605    -3,85   0,000    -2,270085   -,7379398
-------------+----------------------------------------------------------------
7            |
    1.corner |  -14,70108     3531,3    -0,00   0,997    -6935,922     6906,52
       _cons |  -3,583454   1,013769    -3,53   0,000    -5,570405   -1,596503
-------------+----------------------------------------------------------------
8            |
    1.corner |  -14,70108     3531,3    -0,00   0,997    -6935,922     6906,52
       _cons |  -3,583454   1,013769    -3,53   0,000    -5,570405   -1,596503
------------------------------------------------------------------------------

. test [0]1.corner= [1]1.corner =[2]1.corner = [3]1.corner = [4]1.corner = [5]1.corner =  [6]1.corner = [7]1.corner = [8]1.corner
requested action not valid after most recent estimation command
r(321);

. mi test [0]1.corner= [1]1.corner =[2]1.corner = [3]1.corner = [4]1.corner = [5]1.corner =  [6]1.corner = [7]1.corner = [8]1.corner
=exp not allowed
r(101);

Basically, I want to report a measure of association between two nominal variables (mi data) for descriptive purposes. Is there perhaps another good way to do this with mi data?

Best regards,
Jörg


Steve Samuels

Join Date: Mar 2014

Posts: 1786
#8

21 Sep 2018, 09:33

Sorry, the correct syntax was:

Code:

mi test [0]corner [1]corner [2]corner [3]corner [4]corner [5]corner [6]corner [7]corner [8]corner

The result is identical to that of the F test.

However, earlier today I discovered this Stata FAQ by Isabel Cañette and Yulia Marchenko, entitled "Combining results other than coefficients in e(b) with multiply imputed data". It's unnecessary to go through mlogit. Choose one or more of the association measures from tabulate (e.g., Kendall's tau-b, Goodman and Kruskal's gamma). Then follow Isabel and Yulia's directions.
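
As a rough illustration of that idea (not the FAQ's own code), one possible approach is a small e-class wrapper that runs tabulate, posts the association measure as e(b) and its squared asymptotic standard error as e(V), and is then passed to mi estimate. The program name mi_gamma is hypothetical, and treating the squared ASE as the within-imputation variance is an assumption of this sketch. Note also that gamma and tau-b are measures for ordered categories.

```stata
* Hypothetical wrapper: posts Goodman and Kruskal's gamma as e(b)
* and its squared ASE as e(V) so that mi estimate can pool it.
capture program drop mi_gamma
program define mi_gamma, eclass properties(mi)
    version 12
    syntax varlist(min=2 max=2) [if] [in]
    marksample touse

    * tabulate stores r(gamma), r(ase_gam), and r(N)
    quietly tabulate `varlist' if `touse', gamma
    local N = r(N)

    tempname b V
    matrix `b' = (r(gamma))
    matrix `V' = (r(ase_gam)^2)
    matrix colnames `b' = gamma
    matrix colnames `V' = gamma
    matrix rownames `V' = gamma

    ereturn post `b' `V', obs(`N')
    ereturn local cmd "mi_gamma"
    ereturn local title "Goodman-Kruskal gamma"
end

* Pool gamma across the imputations with Rubin's rules
webuse mhouses1993s30, clear
mi estimate, cmdok: mi_gamma nfeatures corner
```

The cmdok option asks mi estimate to accept a command that is not on its list of supported estimation commands. For tau-b, the same pattern should work with the taub option and the r(taub) and r(ase_taub) results.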

Last edited by Steve Samuels; 21 Sep 2018, 09:49.

Joerg Eulenberger

Join Date: Aug 2015

Posts: 25
#9

24 Sep 2018, 09:23

Dear Steve,
many thanks for the tip. I will try it.
Best regards, Jörg