  • Need to tab multiple variables

    I wonder if someone can help? I am a new Stata user and have been struggling to find the solution to my problem in the help menus as I don't know the language to search with yet!

    I have a large dataset of individuals, each of whom answered multiple questions. I am looking at the answers to ten questions on different types of injury. I have worked out the total number of 'yes' answers for each injury, but I need to make a table with the number of individuals who had any of the ten injuries. I have a unique identifier, but I cannot work out how to get Stata to give me the 'any injury' number. I have tried by, list, tab, and summarize with an if condition, and am still getting no joy. Help please!

  • #2
    Hi Jocelyn,

    you have dichotomous variables for each injury and you want the number of people who gave at least one positive answer?

    • #3
      I have yes, no, missing, and withdrawn for each injury, coded as 1, 2, . (dot), and 99. And yes, I want the number of people who gave at least one positive answer.

      • #4
        You don't tell us what the varnames are, but here are two ways. Assume the varnames are var1-var10:
        Code:
        gen byte anyyes=0
        forval i=1/10 {
            replace anyyes=1 if var`i'==1
        }

        Then you can just tab anyyes.

        Another way would be to use inlist, as in
        Code:
        gen byte anyyes=inlist(1, var1, var2, var3, var4, var5, var6, var7, var8, var9, var10)

        The latter technique might require two inlists if the list of 10 turns out to be too long.
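
The loop approach in #4 can be tried on a small made-up dataset. This is only a sketch: the variable names, the three-question layout, and the values are assumptions for illustration, with the coding taken from #3 (1 = yes, 2 = no, . = missing, 99 = withdrawn).

```stata
* Toy data (hypothetical): 5 people, 3 injury questions, wide form
* Coding as in #3: 1 = yes, 2 = no, . = missing, 99 = withdrawn
clear
input id var1 var2 var3
1 1 2 2
2 2 2 2
3 1 1 99
4 . 2 2
5 99 . 1
end

* Flag anyone with at least one "yes"
gen byte anyyes = 0
forval i = 1/3 {
    replace anyyes = 1 if var`i' == 1
}

* One row per person, so this tabulates people, not injuries
tab anyyes
```

Person 3 has two "yes" answers but contributes only one row to the table. Note that anyone whose answers are all missing or withdrawn ends up with anyyes = 0 here; whether that is appropriate is an analysis decision.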

        • #5
          tabm from tab_chi (SSC) may be what you seek. In your case, be sure to note the missing option.

          Code:
          ssc inst tab_chi
          
          clear
          set seed 2803
          set obs 100
          
          forval j = 1/10 {
              gen y`j' = runiform() > (`j') / 11
          }
          
          tabm y1-y10
          
          
          . tabm y1-y10
          
                     |        values
            variable |         0          1 |     Total
          -----------+----------------------+----------
                  y1 |         6         94 |       100
                  y2 |         9         91 |       100
                  y3 |        23         77 |       100
                  y4 |        32         68 |       100
                  y5 |        48         52 |       100
                  y6 |        52         48 |       100
                  y7 |        64         36 |       100
                  y8 |        76         24 |       100
                  y9 |        84         16 |       100
                 y10 |        88         12 |       100
          -----------+----------------------+----------
               Total |       482        518 |     1,000
          Last edited by Nick Cox; 30 Jun 2015, 07:56.

          • #6
            note that Nick and I clearly interpreted your question differently - you will have to decide which of us is "right"

            • #7
              Thank you - am trying out both now!

              • #8
                Rich is probably right here: I was reacting instinctively to your title, although your content is quite different. Nevertheless, tabm should still be invaluable!

                With the same wild guesses about variable names,

                Code:
                egen any = anymatch(y1-y10), value(1)
                su any
                is another way to do it. In effect, it's a canned way of implementing what Rich did from first principles.
                Last edited by Nick Cox; 30 Jun 2015, 08:37. Reason: Corrected: anycount to anymatch

                • #9
                  This is wonderful! This [Rich's answer] is exactly what I needed....I have been looking for so long through the help menus, thank you so much for taking the time to explain it to me.

                  • #10
                    Oh hang on.....this doesn't get around my initial problem does it? This is a good way to get Stata to count the total number of injuries, but what I want is the total number of individuals that had an injury....I want to know the count of the unique identifiers with anyyes=1, not the count of anyyes.....how do I do that?

                    • #11
                      The problem is one individual can have suffered 3 different injuries.

                      • #12
                        Quite correct. Your #10 and #11 crossed with my edit as I realised my error.

                        Another answer is to look at help egen to see what is available. The two functions are similar and documented together.
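
The difference between the two egen functions, anymatch() and anycount(), can be seen side by side on made-up data. This is a sketch under assumptions: the variable names y1-y3 and the values are hypothetical, and the data are taken to be in wide form with the coding from #3.

```stata
* Toy data (hypothetical): 4 people, 3 injury questions, wide form
clear
input id y1 y2 y3
1 1 1 1
2 2 2 2
3 1 2 99
4 . 2 2
end

* anymatch(): 1 if any of the variables equals 1, else 0 (one flag per person)
egen byte any = anymatch(y1-y3), values(1)

* anycount(): how many of the variables equal 1 (injuries per person)
egen byte nyes = anycount(y1-y3), values(1)

* Number of people with at least one injury
count if any == 1

* r(sum) after summarize is the total number of injuries reported
su nyes
```

Here person 1 reports three injuries but is counted once by anymatch(), so the count of people and the total of injuries differ.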

                        • #13
                          Now I am confused. From your original post (#1), I had thought the data were in wide form, and that is what I assumed when answering. However, I am now unclear about what your dataset looks like. If you would post a small amount of your data (using CODE block delimiters - see the FAQ), that would greatly help.

                          Note that if your data is in wide form (one row per person), then my original suggestion gives you what you want: the number of people with at least one injury.
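
If the data turn out to be in long form instead (several rows per person), one possible approach is to build a person-level flag and then count one row per person. This is a sketch only: the layout and the names id, question, and answer are assumptions, since the actual dataset was not posted.

```stata
* Hypothetical long layout: one row per person per injury question
* Coding as in #3: 1 = yes, 2 = no, . = missing, 99 = withdrawn
clear
input id question answer
1 1 1
1 2 1
1 3 2
2 1 2
2 2 2
2 3 99
3 1 .
3 2 1
3 3 2
end

* Person-level flag: 1 if any row for that id is a "yes", else 0
egen byte anyyes = max(answer == 1), by(id)

* Tag exactly one row per person so people are counted, not rows
egen byte tagged = tag(id)

count if tagged & anyyes == 1
tab anyyes if tagged
```

Person 1 has two "yes" rows but is counted once, which is the distinction raised in #10 and #11.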
