  • Multiple chi squared on aggregated data

    I have aggregated yes/no data for the number of individuals with and without a specific characteristic in two groups across a large number of variables.
    Variable  Yes_1  No_1  Yes_2  No_2  p
    A            25    35     45    78
    B            41    19    100    23
    C            33    27     81    42
    etc. for multiple rows...

    I wish to test homogeneity between the two groups for each variable, i.e. whether the proportion of Yes is the same in group 1 (Yes_1, No_1) as in group 2 (Yes_2, No_2), and would like to use the chi-squared test. This would effectively mean creating a 2x2 contingency table for each row, performing a chi-squared test on it, and storing the resulting p-value in a new column (like the empty p above). I can see that
    Code:
    tabi 25 35 \ 45 78, chi2
    will do the simple maths that I want for variable A, but I can't see how to apply it to every row of aggregated data like this, nor how to save the calculated p-value in column p. Advice appreciated (Stata/MP 13.0).
    Last edited by James Webb; 29 Sep 2016, 02:51. Reason: STATA --> Stata

  • #2
    Cross-posted on Cross Validated http://stats.stackexchange.com/quest...uared-in-stata

    Although you were (well) advised to post here (the post on CV is likely to be closed as off-topic), Statalist policy on cross-posting is explicit:

    8. May I cross-post to other forums?

    People posting on Statalist may also post the same question on other listservers or in web forums. There is absolutely no rule against doing that.

    But if you do post elsewhere, we ask that you provide cross-references in URL form to searchable archives. That way, people interested in your question can quickly check what has been said elsewhere and avoid posting similar comments. Being open about cross-posting saves everyone time.

    If your question was answered well elsewhere, please post a cross-reference to that answer on Statalist.
    Much of the FAQ Advice also applies to your question, including #12 and #18. See http://www.statalist.org/forums/help

    That said, this shows some technique:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 variable byte(yes_1 no_1 yes_2 no_2)
    "A" 25 35  45 78
    "B" 41 19 100 23
    "C" 33 27  81 42
    end
    
    gen p = .
    
    quietly forval i = 1/`=_N' {
        tabi `=yes_1[`i']' `=no_1[`i']' \ `=yes_2[`i']' `=no_2[`i']', chi2
        replace p = r(p) in `i'
    }
    
    list 
    
         +---------------------------------------------------+
         | variable   yes_1   no_1   yes_2   no_2          p |
         |---------------------------------------------------|
      1. |        A      25     35      45     78   .5067174 |
      2. |        B      41     19     100     23    .050197 |
      3. |        C      33     27      81     42   .1549756 |
         +---------------------------------------------------+
    See help macro and the documentation on tabi.
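
As a cross-check on the loop above: the Pearson chi-squared statistic for a 2x2 table has a closed form, so the same p-values can in principle be computed for all rows at once without calling tabi. What follows is a minimal sketch, not part of the original answer. The variable names n, chi2 and p_check are made up for illustration, and it assumes the uncorrected Pearson statistic with 1 degree of freedom, which is what the chi2 option of tabi reports.

```stata
* Sketch only: closed-form Pearson chi-squared for each row's 2x2 table.
* n, chi2 and p_check are hypothetical names, not from the original post.
clear
input str1 variable byte(yes_1 no_1 yes_2 no_2)
"A" 25 35  45 78
"B" 41 19 100 23
"C" 33 27  81 42
end

* total count in each row's 2x2 table
gen double n = yes_1 + no_1 + yes_2 + no_2

* chi2 = n*(a*d - b*c)^2 / (product of the two row totals and two column totals)
gen double chi2 = n * (yes_1*no_2 - no_1*yes_2)^2 / ///
    ((yes_1 + no_1) * (yes_2 + no_2) * (yes_1 + yes_2) * (no_1 + no_2))

* upper-tail probability of a chi-squared variate with 1 degree of freedom
gen double p_check = chi2tail(1, chi2)

list variable chi2 p_check
```

The values in p_check should agree with the p column from the loop up to float rounding. A row with a zero row or column total gives a zero denominator and therefore missing values here.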




    • #3
      Thank you

      Your code does exactly what I wanted.
