  • Chi-square goodness of fit?

    Greetings,

    I am new to Stata and I need help with adding categories to a variable. Here is what I have done so far.

    I used ICD9 codes and extracted the observations that have the diagnoses that I need. For each of them I generated a new variable (CODE1, CODE2, etc.) that has the number of observations for that specific condition. Now I would like to compare these categories to see if there is a statistical difference by ethnicity across each diagnosis. I am trying to merge these CODE variables into a single variable that has every CODE listed as a category. I don't know how to proceed, since using gen DXALL max(CODE1, CODE2 etc) just adds them together under one category.

    My command for the test is this
    csgof race, expperc (25 15 15 15 15 15)

    Please let me know if there are any other details I should provide.

  • #2
    I don't understand all of what you did, but it sounds like you are making a simple problem very hard. A chi-square test comes most often and most easily out of a cross-tabulation. See for example

    Code:
     
    . sysuse auto
    (1978 Automobile Data)
    
    . tab fore rep78, chi
    You really shouldn't need to calculate frequencies yourself. csgof, which you should explain as coming from UCLA, is not for testing association, but for fitting a single distribution.

    Please study FAQ Advice at http://www.statalist.org/forums/help, especially Sections 6, 12 and 18.

    • #3
      Sorry I wasn't clear in my initial post, and thank you for pointing me to the FAQ. I had to recode the variables because they were coded as strings. This is what generated the value labels for the ICD9 codes that I need. I was using

      Code:
       
       tab CODE1 RACE, chi2
      But this doesn't provide me with percentages, so I thought I'd use csgof just to get percentages.

      • #4
        Look at the help for tabulate twoway to find out more information about displaying percentages.
        You'll want the row, column, or cell option, depending on which percentages you're interested in.
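
        As a minimal sketch of those options, using the auto data from #2 rather than your data, something like this should show percentages alongside the chi-square test:

        ```stata
        sysuse auto, clear

        * frequencies and column percentages, plus the Pearson chi-square test
        tabulate foreign rep78, chi2 column

        * row percentages only: nofreq suppresses the frequency counts
        tabulate foreign rep78, chi2 row nofreq
        ```

        Swap in cell if you want each cell as a percentage of the whole table.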

        • #5
          Thank you for your help, Sarah. Can you please tell me which help section I should look at if I want to merge categories to make a single variable with all categories listed?

          • #6
            Being string is no barrier to the process I cited. The only difference is whether missings are included by default. Try

            Code:
             
            . sysuse auto
            (1978 Automobile Data)
            
            . tostring rep78 foreign, gen(s_rep78 s_foreign)
            s_rep78 generated as str1
            s_foreign generated as str1
            
            . tab s_*, chi
            While percents may be of interest for description, they aren't part of chi-square testing, which here is based on frequencies.
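
            On the question in #5 about a single variable holding all the categories, here is a minimal sketch. It assumes that each CODE variable is a 0/1 indicator and that no observation has more than one of the diagnoses. Both are assumptions, since the thread does not show the data. The names code1, code2, code3 and dxall are hypothetical stand-ins built from the auto data.

            ```stata
            sysuse auto, clear

            * hypothetical 0/1 indicators standing in for CODE1, CODE2, CODE3
            generate byte code1 = rep78 == 3
            generate byte code2 = rep78 == 4
            generate byte code3 = rep78 == 5

            * one categorical variable: missing unless an indicator is 1
            generate byte dxall = .
            replace dxall = 1 if code1 == 1
            replace dxall = 2 if code2 == 1
            replace dxall = 3 if code3 == 1

            * value labels so the table shows names rather than numbers
            label define dxall 1 "code1" 2 "code2" 3 "code3"
            label values dxall dxall

            * cross-tabulate against the grouping variable (foreign stands in for race)
            tabulate dxall foreign, chi2 column
            ```

            If an observation could have two diagnoses, the later replace would overwrite the earlier one, so this approach would not be appropriate without further thought.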
