Multiple chi squared on aggregated data

James Webb

Join Date: Sep 2016

Posts: 2
#1

29 Sep 2016, 02:13

I have aggregated yes/no data for the number of individuals with and without a specific characteristic in two groups across a large number of variables.

Variable  Yes_1  No_1  Yes_2  No_2  p
A            25    35     45    78
B            41    19    100    23
C            33    27     81    42

and so on for many more rows. For each variable I want to test homogeneity between the two groups, i.e. whether the proportion of "yes" is the same in group 1 (Yes_1 and No_1) as in group 2 (Yes_2 and No_2), using the chi-squared test. This would effectively mean creating a 2x2 contingency table for each row, performing a chi-squared test on it and storing the resulting p-value in a new column (like the empty p above). I can see that

Code:

tabi 25 35 \ 45 78, chi2

will do the simple maths that I want for variable A, but I can't see how to apply it to every row of aggregated data like this, nor how to save the resulting p-value in column p. Advice appreciated (Stata/MP 13.0).

Last edited by James Webb; 29 Sep 2016, 02:51. Reason: STATA --> Stata
Tags: chi2, chitesti, tabi
Nick Cox

Join Date: Mar 2014

Posts: 35724
#2

29 Sep 2016, 02:38

Cross-posted on Cross Validated http://stats.stackexchange.com/quest...uared-in-stata

Although you were (well) advised to post here (the post on CV is likely to be closed as off-topic), Statalist policy on cross-posting is explicit:

8. May I cross-post to other forums?

People posting on Statalist may also post the same question on other listservers or in web forums. There is absolutely no rule against doing that.

But if you do post elsewhere, we ask that you provide cross-references in URL form to searchable archives. That way, people interested in your question can quickly check what has been said elsewhere and avoid posting similar comments. Being open about cross-posting saves everyone time.

If your question was answered well elsewhere, please post a cross-reference to that answer on Statalist.

Much of the FAQ Advice also applies to your question, including #12 and #18. See http://www.statalist.org/forums/help

That said, this shows some technique:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str1 variable byte(yes_1 no_1 yes_2 no_2)
"A" 25 35  45 78
"B" 41 19 100 23
"C" 33 27  81 42
end

gen p = .

quietly forval i = 1/`=_N' {
    tabi `=yes_1[`i']' `=no_1[`i']' \ `=yes_2[`i']' `=no_2[`i']', chi2
    replace p = r(p) in `i'
}

list

     +---------------------------------------------------+
     | variable   yes_1   no_1   yes_2   no_2          p |
     |---------------------------------------------------|
  1. | A             25     35      45     78   .5067174 |
  2. | B             41     19     100     23    .050197 |
  3. | C             33     27      81     42   .1549756 |
     +---------------------------------------------------+
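The loop above leans on two pieces of Stata machinery: the inline `=exp' expansion, which pastes the value of an expression into the command line, and the stored result r(p) that tabi leaves behind when the chi2 option is given. As a minimal sketch for row A only, with the counts typed in by hand, something like this should show both pieces in isolation:

```stata
* Row A entered by hand; with the chi2 option, -tabi- leaves its results in r()
tabi 25 35 \ 45 78, chi2

* Show everything that was stored, then pick out the p-value
return list
display "p = " r(p)

* `=exp' evaluates the expression first and substitutes the result
display "`=25 + 35'"
```

The value shown for r(p) should agree with the p listed for variable A above. Inside the loop, `=yes_1[`i']' does the same substitution with the count held in observation `i'.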

See help macro and the documentation on tabi.
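As a possible alternative to looping, the Pearson chi-squared statistic for a 2x2 table has the closed form N(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)) with 1 degree of freedom, so the p-values can in principle be computed for all rows at once. This is only a sketch and not part of the code above: it assumes the variables yes_1, no_1, yes_2, no_2 and p from the example are already in memory, and the names n_all, chi2 and p_direct are made up for illustration.

```stata
* Assumes yes_1 no_1 yes_2 no_2 and p from the example above are in memory.
* n_all, chi2 and p_direct are hypothetical names chosen for this sketch.
* The /// line continuation works in a do-file, not at the interactive prompt.
gen double n_all = yes_1 + no_1 + yes_2 + no_2

* Pearson chi-squared for a 2x2 table, no continuity correction
gen double chi2 = n_all * (yes_1*no_2 - no_1*yes_2)^2 / ///
    ((yes_1 + no_1) * (yes_2 + no_2) * (yes_1 + yes_2) * (no_1 + no_2))

* Upper-tail probability of a chi-squared with 1 degree of freedom
gen double p_direct = chi2tail(1, chi2)

list variable p p_direct
```

If everything lines up, p_direct should match the looped p up to storage precision (p is a float, p_direct a double). The loop remains the more general route, since it carries over to anything tabi can compute, whereas this formula is specific to 2x2 tables.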
James Webb

Join Date: Sep 2016

Posts: 2
#3

29 Sep 2016, 03:05

Thank you

Your code does exactly what I wanted.