  • Measuring the combination of individual components of a compound variable?

    My dataset contains information on five disease conditions: Arthritis, COPD, Diabetes, Cancer, HBP.

    Disease status is either Yes (1) or No (0), so the total number of conditions can range from 0 to 5. I'd like to measure the percentage of patients with each combination, e.g. what % of patients have all the conditions, or Arthritis + Diabetes, or Arthritis + COPD + Diabetes, and so on.

    Thanks.

  • #2
    one "easy" way:
    Code:
    egen group=group(Arthritis COPD Diabetes Cancer HBP)
    and then just tabulate the results

    see
    Code:
    help egen
    in case you want some of the offered bells and whistles

    note that I assumed your variable names were the disease conditions; if not, just change the command above to use your actual varnames
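
    If the tabulation shows only group numbers (1, 2, 3, ...), the label option of group() may help: it labels each group number with the underlying values, so each row of the table shows which combination it is. A minimal sketch on made-up data, assuming numeric 0/1 indicators; the variable name combo and the five toy observations are purely illustrative:
    Code:
    clear
    * toy data: one row per patient, 1 = has the condition, 0 = does not
    input Arthritis COPD Diabetes Cancer HBP
    0 0 0 0 0
    1 0 1 0 0
    1 0 1 0 0
    0 0 0 0 1
    1 1 1 0 1
    end
    * label attaches the values of the five variables to each group number
    egen combo = group(Arthritis COPD Diabetes Cancer HBP), label
    * sort lists the most frequent combinations first
    tab combo, sort
    If the indicators carry value labels (e.g. Yes/No), the group labels should show those instead of 0 and 1.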


    • #3
      Hi Goldstein, thanks so much for the input. I don't think I made the question clear enough. The objective is to understand which disease combinations are most prevalent (i.e. whether HBP is more likely to coexist with Diabetes or with CVD, and so on). That is why I need to see the exact combination of diseases behind each group, not only its frequency. Here is a glimpse of the output. Currently it shows only the frequencies.

      group(HBP DIABETES CVD COPD CANCER DEPRESSION ARTHRITIS)    Freq.    Percent    Cum.

      1    3,053    65.97    65.97
      2      114     2.46    68.43
      3       39     0.84    69.27
      4        7     0.15    69.43
      5        5     0.11    69.53

      Please let me know if this is feasible. Thanks.


      • #4
        Rich Goldstein gave the same advice I would have given, and it is difficult to reconcile the results you are getting with that code. Perhaps there is something peculiar about your data. Please use the -dataex- command to post an example of your data. Also show the exact -egen- and -tab- commands that led to the output you got. Then perhaps we can figure out why this code is not doing what you want and find a solution.

        If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        When asking for help with code, always show example data. When showing example data, always use -dataex-.
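
        For illustration, a -dataex- call might look like the line below. The variable names are assumed from the original question, and the choice of the first 20 observations is arbitrary; substitute your own names and range.
        Code:
        * show the five indicators for the first 20 observations in a form others can paste into Stata
        dataex Arthritis COPD Diabetes Cancer HBP in 1/20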


        • #5
          It seems that Sonnen just wants to see the combinations spelled out.
          Code:
          egen Combine = concat(Arthritis COPD Diabetes Cancer HBP)
          tab Combine, sort
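
          concat() joins the values in the order the variables are listed, so each character position corresponds to one condition. If the run of digits is hard to read, a separator may help. A sketch, assuming the same 0/1 variables are in memory; Combine2 is a hypothetical name:
          Code:
          * punct(" ") puts a space between the values, e.g. "1 0 1 0 0"
          egen Combine2 = concat(Arthritis COPD Diabetes Cancer HBP), punct(" ")
          tab Combine2, sort
          Note that the result is a string variable, so it is suited to tabulating and listing rather than arithmetic.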


          • #6
            Thanks a lot Clyde! I'll give it a try.


            • #7
              Originally posted by Romalpa Akzo
              It seems that Sonnen just wants to see the combinations spelled out.
              Code:
              egen Combine = concat(Arthritis COPD Diabetes Cancer HBP)
              tab Combine, sort
              Hi Romalpa, thanks for the idea. I have never used this method, but it seems to be doing the trick. If I'm reading the Combine column correctly, the last row (11010 1 0.02 100.00) says that 0.02% of patients have exactly DIABETES, CVD and CANCER. Is that right?


              . egen Combine = concat( DIABETES CVD COPD CANCER DEPRESSION )
              . tab Combine, sort

              Combine    Freq.    Percent    Cum.

              00000      4,253    91.90      91.90
              01000        123     2.66      94.55
              10000         91     1.97      96.52
              ...
              01010          1     0.02      99.96
              11001          1     0.02      99.98
              11010          1     0.02     100.00

              Last edited by Sonnen Blume; 13 Oct 2018, 18:47.


              • #8
                Yes, it is.
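
                One way to double-check that reading is to count the pattern directly. A sketch, assuming the five variables are numeric 0/1 with the names used in the -egen- call above:
                Code:
                * patients with DIABETES, CVD and CANCER but neither COPD nor DEPRESSION (pattern 11010)
                count if DIABETES == 1 & CVD == 1 & COPD == 0 & CANCER == 1 & DEPRESSION == 0
                * the same count as a percentage of all observations
                display 100 * r(N) / _N
                The count should match the Freq. shown for the 11010 row, and the percentage should match its Percent up to rounding.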


                • #9
                  Originally posted by Romalpa Akzo
                  Yes, it is.
                  Thanks!
