  • Chi square tests on many variables in a panel dataset

    Dear all, I have a panel dataset and I am trying to run simple chi-square tests by sex on about 140 variables. The output I need is the frequency and percentage of respondents who mentioned each of the 140 variables, plus the chi-square statistic. I am using Stata 14. I have tried mrtab, tabchi and tabm, but these don't give me respondent-level results. Perhaps I need to combine them with other commands?

    I am using the following commands but getting errors.

    Code:
    . xtset RESPONDENT_ID
    
    . foreach var of varlist earlymaturity resistantpestdis resistantwind  adaptable {
     2.  xttab `var'
     3.  }
    The loop above works fine. However, when I add sex and chi I get the error below:

    Code:
    . foreach var of varlist earlymaturity resistantpestdis resistantwind  adaptable  {
    2.   xttab `var' sex, chi
    3.    }
    too many variables specified
    r(103);


    This code with tabchi also works; however, the results are not at the respondent level.
    Code:
    . foreach var of varlist  earlymaturity resistantpestdis resistantwind  adaptable    {
        2.    tabchi sex `var'
        3.    }
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int RESPONDENT_ID long sex byte(earlymaturity resistantpestdis resistantwind  adaptable)
    1 3 0 0 0 0
    1 3 0 0 0 0
    1 3 0 1 0 0
    2 2 0 0 0 0
    2 2 0 0 0 0
    3 3 0 0 0 0
    3 3 0 0 0 0
    3 3 0 0 0 0
    3 3 0 0 0 0
    4 3 0 0 0 0
    4 3 0 0 0 0
    4 3 0 1 0 0
    4 3 0 0 0 0
    4 3 0 0 0 0
    4 3 0 1 0 0
    4 3 0 0 0 0
    4 3 0 0 0 0
    5 2 0 0 1 0
    5 2 0 1 0 0
    5 2 0 0 0 0
    6 3 0 0 0 0
    6 3 0 0 0 0
    6 3 0 0 0 0
    6 3 1 0 0 0
    7 2 0 0 0 0
    7 2 0 0 0 0
    7 2 0 0 0 0
    7 2 0 1 0 0
    8 3 0 0 0 0
    8 3 1 0 0 0
    8 3 0 0 0 0
    8 3 0 0 0 0
    8 3 0 0 0 0
    8 3 0 0 0 0
    8 3 1 0 0 0
    8 3 0 0 0 0
    8 3 0 0 0 0
    8 3 0 0 0 0
    8 3 0 0 0 0
    8 3 1 0 0 0
    9 2 0 0 0 0
    9 2 0 0 0 0
    9 2 0 0 0 0
    10 3 0 0 0 0
    10 3 0 0 0 0
    10 3 0 0 0 0
    10 3 0 0 0 0
    12 3 0 0 0 0
    12 3 1 0 0 0
    12 3 0 0 0 0
    13 2 0 0 0 0
    13 2 0 0 0 0
    13 2 0 0 0 0
    13 2 0 0 0 0
    13 2 1 0 0 0
    14 3 0 0 0 0
    14 3 0 0 0 0
    14 3 0 0 0 0
    15 2 0 0 0 0
    16 3 0 0 0 0
    16 3 0 0 0 0
    16 3 0 0 0 0
    16 3 0 0 0 0
    16 3 0 1 0 0
    16 3 0 0 0 0
    17 2 0 0 0 0
    17 2 0 0 0 0
    17 2 0 0 0 0
    17 2 0 0 0 0
    17 2 0 0 0 0
    17 2 0 0 0 0
    18 3 1 0 0 0
    18 3 0 0 0 0
    18 3 0 0 0 0
    18 3 1 0 1 1
    18 3 1 0 0 0
    18 3 1 0 0 0
    18 3 1 1 0 0
    18 3 0 1 0 0
    18 3 0 1 0 0
    18 3 0 0 0 0
    19 2 0 0 0 0
    19 2 0 0 0 0
    19 2 0 0 0 0
    19 2 1 0 0 0
    20 2 0 0 0 0
    20 2 1 0 0 0
    20 2 0 0 0 0
    20 2 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    21 3 0 0 0 0
    22 3 0 0 0 0
    end
    label values sex sex
    label def sex 2 "female", modify
    label def sex 3 "male", modify
    Looking forward to your responses. Thank you.
    Last edited by Pricilla Marimo; 20 Oct 2019, 08:57.

  • #2
    Pricilla:
    welcome to this forum,
    According to the -xttab- help file, neither -freq- nor -chi- is available.
    Hence, the most you can do is:
    Code:
    . foreach var of varlist earlymaturity resistantpestdis resistantwind  adaptable  {
      2. bysort sex: tab `var'
      3. }
    
    --------------------------------------------------------------------------------------------------------
    -> sex = female
    
    earlymaturi |
             ty |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         29       90.63       90.63
              1 |          3        9.38      100.00
    ------------+-----------------------------------
          Total |         32      100.00
    
    --------------------------------------------------------------------------------------------------------
    -> sex = male
    
    earlymaturi |
             ty |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         58       85.29       85.29
              1 |         10       14.71      100.00
    ------------+-----------------------------------
          Total |         68      100.00
    
    
    --------------------------------------------------------------------------------------------------------
    -> sex = female
    
    resistantpe |
          stdis |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         30       93.75       93.75
              1 |          2        6.25      100.00
    ------------+-----------------------------------
          Total |         32      100.00
    
    --------------------------------------------------------------------------------------------------------
    -> sex = male
    
    resistantpe |
          stdis |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         61       89.71       89.71
              1 |          7       10.29      100.00
    ------------+-----------------------------------
          Total |         68      100.00
    
    
    --------------------------------------------------------------------------------------------------------
    -> sex = female
    
    resistantwi |
             nd |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         31       96.88       96.88
              1 |          1        3.13      100.00
    ------------+-----------------------------------
          Total |         32      100.00
    
    --------------------------------------------------------------------------------------------------------
    -> sex = male
    
    resistantwi |
             nd |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         67       98.53       98.53
              1 |          1        1.47      100.00
    ------------+-----------------------------------
          Total |         68      100.00
    
    
    --------------------------------------------------------------------------------------------------------
    -> sex = female
    
      adaptable |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         32      100.00      100.00
    ------------+-----------------------------------
          Total |         32      100.00
    
    --------------------------------------------------------------------------------------------------------
    -> sex = male
    
      adaptable |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         67       98.53       98.53
              1 |          1        1.47      100.00
    ------------+-----------------------------------
          Total |         68      100.00
    
    
    .
    That will not give you what you're after.
    You may want to consider -xttrans-, although inference is probably only feasible by interacting -sex- with the variables you're interested in.
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      Please study the FAQ Advice more carefully. #12 explains that you should tell us where you found the community-contributed commands you're referring to. Thus mrtab is from the Stata Journal, and tabchi and tabm come from tab_chi on SSC.

      The syntax of xttab is clear enough: it allows a single variable and supports no options and I can't tell why you are thinking otherwise. Hoping that there is undocumented syntax that does what you want is almost always doomed to failure.

      That said, I don't know either what respondent level results would be. If you want to run separate chi-square tests for each panel, any panel in which a variable is constant will give you an indeterminate result, e.g.

      Code:
      . tab sex earlymaturity if RESPONDENT_ID == 1 , chi
      
                 | earlymatur
                 |    ity
             sex |         0 |     Total
      -----------+-----------+----------
            male |         3 |         3
      -----------+-----------+----------
           Total |         3 |         3
      
      . ret li
      
      scalars:
                        r(N) =  3
                        r(r) =  1
                        r(c) =  1
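
      A minimal sketch of how a loop might guard against such degenerate tables. The toy data are invented for illustration (they are not the poster's dataset), and the check relies only on r(r) and r(c), which the output above shows are returned even when no chi-square is computed:

      ```stata
      * Hypothetical toy data: adaptable is constant, so its table has one column
      clear
      input byte(sex earlymaturity adaptable)
      2 0 0
      2 1 0
      2 0 0
      3 1 0
      3 0 0
      3 1 0
      end

      foreach v of varlist earlymaturity adaptable {
          quietly tabulate sex `v', chi2
          * With fewer than 2 rows or 2 columns no chi-square is returned
          if r(r) < 2 | r(c) < 2 {
              display as text "`v': table is " r(r) " x " r(c) ", chi-square not defined"
          }
          else {
              display as text "`v': chi2 = " %6.3f r(chi2) ", p = " %6.4f r(p)
          }
      }
      ```

      The same check would apply with an if RESPONDENT_ID == ... qualifier, where constant variables within a panel are common.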
      Your interest in mrtab and tabm suggests to me that you want to regard

      earlymaturity resistantpestdis resistantwind adaptable

      as multiple responses (e.g. favourable attributes of crops), but if so, looping over them can only produce separate results! Also, if you want to collapse each panel to single results, you would be ignoring serial dependence and thus invalidating significance tests. Despite reading your question several times, I still have no clear idea what you want. I don't think Carlo Lazzaro understands either!

      If no one else can follow I fear that you will have to explain much more, say by showing us a simple table for which you want a chi-square test.
      Last edited by Nick Cox; 21 Oct 2019, 02:31.



      • #4
        Nick is right.
        I tried to give tentative advice that already sounded scant and unsatisfactory as I typed it.
        Kind regards,
        Carlo
        (Stata 18.0 SE)



        • #5
          Dear Carlo Lazzaro and Nick Cox

          Thank you for your comments, and apologies that my question was not clear and that I left out the source of the community-contributed commands. Let me try to provide more information. The panel dataset I have has information on crop attributes mentioned by 1319 male and female respondents for the different varieties they are growing. For example, respondent 1 mentioned that they are growing 3 varieties and associated early maturity with one of the varieties; on the other hand, respondent 8 mentioned that they are growing 12 varieties and associated early maturity with 3 of them, etc. FYI, the varieties are all of the same crop.

          In addition (sorry this is not in the original question) I want to find out the average number of times that a particular attribute is associated with a specific variety and if this differs for men and women or other socioeconomic and demographic variables.
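
          One possible reading of this is the number of mentions per respondent, compared by sex. A rough sketch under that assumption follows; the toy data and the variable names n_early, n_varieties and tag are invented for illustration, and no variety identifier is assumed because none appears in the example data:

          ```stata
          * Hypothetical toy data: 4 respondents, one row per variety grown
          clear
          input int RESPONDENT_ID byte(sex earlymaturity)
          1 3 0
          1 3 1
          1 3 1
          2 2 0
          2 2 0
          3 3 1
          3 3 0
          4 2 1
          4 2 1
          4 2 0
          end

          * Number of varieties for which each respondent mentioned the attribute
          egen n_early = total(earlymaturity), by(RESPONDENT_ID)
          * Number of varieties each respondent grows
          bysort RESPONDENT_ID: gen n_varieties = _N
          * Flag one row per respondent so that each person counts once
          egen byte tag = tag(RESPONDENT_ID)

          * Mean mentions and mean varieties per respondent, by sex
          tabstat n_early n_varieties if tag, by(sex) statistics(mean n)
          ```

          Whether a t test, a rank-sum test or a count model is appropriate for comparing these means would depend on the full data.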


          What I want is a table of mentioned attributes showing how many men (and % of men) mentioned the attribute, how many women (and % of women) mentioned the attribute, and some statistical test. Please find below a table with an example of the results I am looking for (these were extracted from the output of the code below the table). What command do I need to run to get the chi-square and p-value? Or do I perhaps need to run other tests that are not necessarily chi-square? I hope it's clear now.

          Table 1: Crop attributes mentioned by men and women ....
          (n = number who mentioned the attribute; % = percentage who mentioned the attribute)

          Crop attribute                   | Men (n) | Men (%) | Women (n) | Women (%) | Chi-square | p-value
          ---------------------------------+---------+---------+-----------+-----------+------------+--------
          Early maturity                   |       4 |   33.33 |         3 |     33.33 |            |
          Resistant to pests and diseases  |       4 |   33.33 |         2 |     22.22 |            |
          Resistant to wind                |       1 |    8.33 |         1 |     11.11 |            |
          Adaptable to different soils     |       1 |    8.33 |         - |         - |            |


          Code:
          ------------------------------------------------------------------------------------------------------------
            
          . xtset RESPONDENT_ID

          . foreach var of varlist earlymaturity resistantpestdis resistantwind adaptable {
            2.    bysort sex: xttab `var'
            3. }
          
          
          -> sex = female
          
                            Overall             Between            Within
          earlyma~y |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      29     90.63         9    100.00          92.22
                  1 |       3      9.38         3     33.33          23.33
          ----------+-----------------------------------------------------
              Total |      32    100.00        12    133.33          75.00
                                           (n = 9)
          
          ------------------------------------------------------------------------------------------------------------
          -> sex = male
          
                            Overall             Between            Within
          earlyma~y |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      58     85.29        12    100.00          88.89
                  1 |      10     14.71         4     33.33          33.33
          ----------+-----------------------------------------------------
              Total |      68    100.00        16    133.33          75.00
                                          (n = 12)
          
          ------------------------------------------------------------------------------------------------------------
          -> sex = female
          
                            Overall             Between            Within
          resista~s |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      30     93.75         9    100.00          93.52
                  1 |       2      6.25         2     22.22          29.17
          ----------+-----------------------------------------------------
              Total |      32    100.00        11    122.22          81.82
                                           (n = 9)
          
          ------------------------------------------------------------------------------------------------------------
          -> sex = male
          
                            Overall             Between            Within
          resista~s |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      61     89.71        12    100.00          91.25
                  1 |       7     10.29         4     33.33          26.25
          ----------+-----------------------------------------------------
              Total |      68    100.00        16    133.33          75.00
                                          (n = 12)
          
          ------------------------------------------------------------------------------------------------------------
          -> sex = female
          
                            Overall             Between            Within
          resista~d |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      31     96.88         9    100.00          96.30
                  1 |       1      3.13         1     11.11          33.33
          ----------+-----------------------------------------------------
              Total |      32    100.00        10    111.11          90.00
                                           (n = 9)
          
          ------------------------------------------------------------------------------------------------------------
          -> sex = male
          
                            Overall             Between            Within
          resista~d |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      67     98.53        12    100.00          99.17
                  1 |       1      1.47         1      8.33          10.00
          ----------+-----------------------------------------------------
              Total |      68    100.00        13    108.33          92.31
                                          (n = 12)
          
          ------------------------------------------------------------------------------------------------------------
          -> sex = female
          
                            Overall             Between            Within
          adaptable |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      32    100.00         9    100.00         100.00
          ----------+-----------------------------------------------------
              Total |      32    100.00         9    100.00         100.00
                                           (n = 9)
          
          ------------------------------------------------------------------------------------------------------------
          -> sex = male
          
                            Overall             Between            Within
          adaptable |    Freq.  Percent      Freq.  Percent        Percent
          ----------+-----------------------------------------------------
                  0 |      67     98.53        12    100.00          99.17
                  1 |       1      1.47         1      8.33          10.00
          ----------+-----------------------------------------------------
              Total |      68    100.00        13    108.33          92.31
                                          (n = 12)
          Last edited by Pricilla Marimo; 21 Oct 2019, 04:06.



          • #6
            That helps a lot. It seems that your panels are people and the replicated observations are different varieties and what you are asking for is given by the results of

            Code:
            foreach v of var early-adapt { 
                tab sex `v', chi row col 
            }
            It's going to be easier for you to compile that table by hand than to work out code that gives it directly. I have to doubt that the analysis makes much sense if you are lumping together over such a variegated dataset and ignoring which person uses which, how many varieties are grown by each person, and so forth. The appropriate model is arguable, but there should be a better model that uses all the information.
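
            If the counts wanted are people rather than observations, one possible sketch is to reduce each respondent to a single "ever mentioned" indicator before tabulating. This is an assumption about what "respondent level" means, not a method stated in the thread; the toy data and the names tag and ever_* are invented for illustration. The exact option is added because cell counts this small make the chi-square approximation doubtful:

            ```stata
            * Hypothetical toy data: 6 respondents, one row per variety grown
            clear
            input int RESPONDENT_ID byte(sex earlymaturity resistantwind)
            1 3 0 0
            1 3 1 0
            2 3 0 0
            2 3 0 1
            3 3 0 0
            4 2 1 0
            4 2 1 0
            5 2 0 0
            6 2 0 0
            6 2 0 0
            end

            * Flag one row per respondent so that each person counts once
            egen byte tag = tag(RESPONDENT_ID)

            foreach v of varlist earlymaturity resistantwind {
                * 1 if the respondent mentioned the attribute for at least one variety
                egen byte ever_`v' = max(`v'), by(RESPONDENT_ID)
                * Pearson chi-square plus Fisher's exact test on one row per respondent
                quietly tabulate sex ever_`v' if tag, chi2 exact
                display as text "`v'" _col(20) "chi2 = " %6.3f r(chi2) ///
                    _col(38) "p = " %6.4f r(p) _col(52) "Fisher p = " %6.4f r(p_exact)
            }
            ```

            This discards how many varieties each person grows and how often an attribute was mentioned, so the reservations above about lumping the data together still apply.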

