Chi-square goodness of fit?

Tony K

Join Date: Sep 2014

Posts: 3
#1

18 Sep 2014, 10:51

Greetings,

I am new to Stata and I need help with adding categories to a variable. Here is what I have done so far.

I used ICD9 codes and extracted the observations that have the diagnoses I need. For each diagnosis I generated a new variable (CODE1, CODE2, etc.) that holds the number of observations for that specific condition. Now I would like to compare these categories to see whether there is a statistically significant difference by ethnicity across the diagnoses. I am trying to merge these CODE variables into a single variable that has every CODE listed as a category. I don't know how to proceed, since using gen DXALL max(CODE1, CODE2 etc) just adds them together under one category.

My command for the test is this:
csgof race, expperc (25 15 15 15 15 15)

Please let me know if there are any other details I should provide.
Nick Cox

Join Date: Mar 2014

Posts: 35708
#2

18 Sep 2014, 11:58

I don't understand all of what you did, but it sounds like making a simple problem very hard. A chi-square test comes most often and most easily out of a cross-tabulation. See for example

Code:

. sysuse auto
(1978 Automobile Data)

. tab fore rep78, chi

You really shouldn't need to calculate frequencies yourself. csgof, which you should explain as coming from UCLA, is not for testing association, but for fitting a single distribution.

Please study FAQ Advice at http://www.statalist.org/forums/help, especially Sections 6, 12 and 18.
Tony K

Join Date: Sep 2014

Posts: 3
#3

18 Sep 2014, 14:19

Sorry I wasn't clear in my initial post, and thank you for pointing me to the FAQ. I had to recode the variables because they were coded as strings. This is what generated the value labels for the ICD9 codes that I need. I was using

Code:

tab CODE1 RACE, chi2

But this doesn't provide me with percentages, so I thought I'd use csgof just to get the percentages.
Sarah Edgington

Join Date: Apr 2014

Posts: 284
#4

18 Sep 2014, 14:30

Look at the help for tabulate twoway to find out more information about displaying percentages.
You'll want the row, column, or cell option, depending on which percentages you're interested in.
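As a minimal sketch of those options, here is the chi-square cross-tabulation run on Stata's bundled auto dataset rather than the poster's data. The choice of rep78 and foreign is only for illustration; row, column, and nofreq are standard tabulate twoway options.

```stata
sysuse auto, clear

* Frequencies, row percentages, and the Pearson chi-square test in one table
tabulate rep78 foreign, chi2 row

* Column percentages only: nofreq suppresses the raw counts
tabulate rep78 foreign, chi2 column nofreq
```

The test statistic is the same in both tables, because the percentages are display options and do not enter the calculation.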
Tony K

Join Date: Sep 2014

Posts: 3
#5

18 Sep 2014, 14:36

Thank you for your help, Sarah. Can you please tell me which help section I should look at if I want to merge categories to make a single variable with all categories listed?
Nick Cox

Join Date: Mar 2014

Posts: 35708
#6

18 Sep 2014, 14:47

Being string is no barrier to the process I cited. The only difference is whether missings are included by default. Try

Code:

. sysuse auto
(1978 Automobile Data)

. tostring rep78 foreign, gen(s_rep78 s_foreign)
s_rep78 generated as str1
s_foreign generated as str1

. tab s_*, chi

While percents may be of interest for description, they aren't part of chi-square testing, which here is based on frequencies.
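On the earlier question of collecting the CODE variables into one categorical variable, the following is only a sketch under assumptions the thread does not confirm: that CODE1, CODE2, ... are 0/1 indicators, and that each observation carries at most one of the diagnoses. The toy data, the numeric coding of RACE, and the helper variable nflag are invented for illustration; DXALL is the name used in post #1.

```stata
clear
* Hypothetical toy data: three 0/1 diagnosis indicators and a numeric race code
input byte(CODE1 CODE2 CODE3 RACE)
1 0 0 1
1 0 0 1
1 0 0 2
0 1 0 1
0 1 0 2
0 1 0 2
0 0 1 1
0 0 1 2
0 0 1 2
end

* Check the assumption that no observation has more than one diagnosis flagged
egen byte nflag = rowtotal(CODE1 CODE2 CODE3)
assert nflag <= 1

* Build one categorical variable: DXALL = k when CODEk is flagged
generate byte DXALL = .
forvalues k = 1/3 {
    replace DXALL = `k' if CODE`k' == 1
}

* Cross-tabulate diagnosis by race with the chi-square test and column percentages
tabulate DXALL RACE, chi2 column
```

If an observation can carry several diagnoses, a single categorical variable cannot represent it without further decisions, and this sketch does not apply as written.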