  • count data analysis

    Dear All,
    Can anyone help me do a basic descriptive data analysis and then run a regression model to identify which variables are statistically significant predictors of the disease?
    The table shows the clinical presentations for subcategories of a group of diseases called primary immunodeficiency (PID). The numbers are the frequency of each clinical presentation in each subcategory, out of a total of 467 patients.
    [Attached images: stata1.jpg, stata2.jpg]

  • #2
    Before posting on the Stata Forum you should have read the forum's FAQ -- your post shows that you didn't do that. In your case you should carefully read section 12 "What should I say about the commands and data I use?".

    • #3
      Dear Dirk
      Thanks for the guidance. Based on FAQ 12.2, here is my data:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long PIDcat byte(SCID CID synCID AB dysreg Phagocytic innate complement AIS BMF) int Total
      38 89 66 42 79 65 50 22 13 40 1 467
      27 77 45 23 63  8 14 10  3 14 1 258
      35 39 28 21 50  7 14 10  3  9 1 182
      50 33 12  5  0  5  0  0  0  5 0  60
      34  6  3  0  0  0  0  0  0  0 0   9
      42  3  3  0 12  1  0  0  0  0 0  19
       9  3  1  0  0  0  0  0  0  0 0   4
       4  2  3  0  0  1  0  0  0  0 0   6
      12  2  3  3  4  1  2  1  0  0 0  16
      30  0  3  1 36  0  2  0  4  0 0  46
      19 59 32  8 20 22 25 12  5 29 1 213
      47 26  8  0  0  0  0  5  0  0 0  39
      15 27 10  5 12  4 10  4  3  6 1  82
      18 17 15  3 10 18 19  7  0  7 1  97
      49  1  3  0  1  4  4  0  0  1 0  14
      48  3  2  3 12  2  3  4  1  4 0  34
      20  9  3  2  5 11  6  4  2  4 0  46
      44 17  5  5  3  5  4  3  2  2 0  46
      52  1  2  3  0  5  2  0  0  2 0  15
      51  4  3  3  0  2 10  4  0  2 0  28
      17  7  1  0  0  0  0  0  0  0 0   8
      32  9  0  0  0  0  0  0  0  0 0   9
      26  5  3  3 12 28 12  8  3  8 0  82
      53 11  4  1  2 38  8  6  1  3 0  74
      45 16  2  2  1 45 12  8  1  4 0  91
       1 16  6 14 24 22 10  7 10 29 0 138
      43  8  4  6 11  6  6  3  3  8 0  55
       2  3  1  2  2  5  0  0  2  3 0  18
      39  1  0  4  2  3  0  2  3  1 0  16
      41  1  0  1  2  5  0  0  0 14 0  23
      36  0  0  0  1  1  0  0  2  1 0   5
      11  0  0  0  4  0  0  0  0  0 0   4
       3  0  0  1  0  2  0  0  0  0 0   3
       2  3  1  0  2  0  4  2  0  2 0  14
       7 21 12 13 23 10  8  5  4  0 0  96
      14 12 13 22 12 24  9  2  2  3 1 100
      28  2  0  3  0  0  0  0  0  0 0   5
      21  4  1  0  0 46  1  2  0  2 0  56
      25 89 66 23 65 47  0  1  0  2 1 294
       5 63 46 21 47 49 50 22  5  4 1 308
       6 61 35 15  8 39  0  3  0  2 1 164
      40 60 27  7  6 34 50 13  0  2 1 200
      22 65 27  1  0 22 25  4  0  0 0 144
      24 13  7  6  8 35 10  2  3 10 0  94
      10  2  0  0  3  3  1  0  0  8 0  17
      23  6  2  4  0  1 13  2  0  0 0  28
      13  2  5  1  0  3  1  0  0  0 0  12
      16  2  1  1  2 12  0  0  0  0 0  18
       8 17  3  0  0  0  8  5  0  0 0  33
      29  8  5  1 10  4  5  4  3  3 0  43
      37 13  5  1  9  3  8  5  1  3 1  49
      33  0  0  0  0  0  2  1  0  1 0   4
      31  4  0  2  0  3  9  0  0  3 0  21
      46  2  1  0  1  4  0  0  0  0 0   8
      end
      label values PIDcat PIDcat
      label def PIDcat 1 "AID", modify
      label def PIDcat 2 "AIE", modify
      label def PIDcat 3 "AIH", modify
      label def PIDcat 4 "AP", modify
      label def PIDcat 5 "Ab", modify
      label def PIDcat 6 "Antiviral", modify
      label def PIDcat 7 "Atopy", modify
      label def PIDcat 8 "BCGosis", modify
      label def PIDcat 9 "BO", modify
      label def PIDcat 10 "Biologics", modify
      label def PIDcat 11 "CD", modify
      label def PIDcat 12 "CLD", modify
      label def PIDcat 13 "CMV", modify
      label def PIDcat 14 "CNS", modify
      label def PIDcat 15 "Chrdiarrhea", modify
      label def PIDcat 16 "EBV", modify
      label def PIDcat 17 "FMGvHD", modify
      label def PIDcat 18 "FTT", modify
      label def PIDcat 19 "GI", modify
      label def PIDcat 20 "GIinfections", modify
      label def PIDcat 21 "HLH", modify
      label def PIDcat 22 "HSCT", modify
      label def PIDcat 23 "IFI", modify
      label def PIDcat 24 "IS", modify
      label def PIDcat 25 "IVIG", modify
      label def PIDcat 26 "LN", modify
      label def PIDcat 27 "LRT", modify
      label def PIDcat 28 "Malignancy", modify
      label def PIDcat 29 "Meningitis", modify
      label def PIDcat 30 "OM", modify
      label def PIDcat 31 "Organabscess", modify
      label def PIDcat 32 "Osphenotype", modify
      label def PIDcat 33 "Osteomyelitis", modify
      label def PIDcat 34 "PCjP", modify
      label def PIDcat 35 "Pneumonia", modify
      label def PIDcat 36 "SLE", modify
      label def PIDcat 37 "Septicemia", modify
      label def PIDcat 38 "Totalobs", modify
      label def PIDcat 39 "Vasculitis", modify
      label def PIDcat 40 "antifungal", modify
      label def PIDcat 41 "arthritis", modify
      label def PIDcat 42 "bronchiactasis", modify
      label def PIDcat 43 "cytopenias", modify
      label def PIDcat 44 "dermatitis", modify
      label def PIDcat 45 "hepatomegaly", modify
      label def PIDcat 46 "necFasciitis", modify
      label def PIDcat 47 "oralcand", modify
      label def PIDcat 48 "oralulcers", modify
      label def PIDcat 49 "perianald", modify
      label def PIDcat 50 "pneumonitis", modify
      label def PIDcat 51 "skinabs", modify
      label def PIDcat 52 "skinulcers", modify
      label def PIDcat 53 "splenomegaly", modify
      ------------------ copy up to and including the previous line ------------------

      • #4
        Now the question is: What do you want to describe and what is your prediction model (i.e. what do you want to predict with which variables)?
        "... predict the statistically significant variable ..." sounds strange to me: only a predictor can be "statistically significant", and whether it is statistically significant depends on the other variables in your model.

        • #5
          For example, the most common presentation across all subcategories is LRT (lower respiratory tract), seen in 258 of 467 patients. Is this statistically significant compared with other presentations, such as one seen in only 30 or 50 patients?
          Thanks for your patience!
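          A rough sketch of the descriptive step, assuming each Total is a count of patients out of the 467 given in the Totalobs row. The four rows are copied from the listing in #3; the variable names presentation and pct are made up for illustration.

```stata
* Four rows copied from the -dataex- listing in #3 (labels typed in by hand).
clear
input str12 presentation int Total
"LRT"        258
"GI"         213
"Pneumonia"  182
"Vasculitis"  16
end

* Share of the 467 patients showing each presentation (assumed denominator).
generate pct = 100 * Total / 467
format pct %4.1f

* Most frequent presentation first.
gsort -Total
list presentation Total pct, noobs
```

          This only describes the counts; it says nothing about significance.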

          • #6
            I can't help you here because I have no medical expertise (e.g. I don't know what CID means and can't formulate expectations about frequencies and relationships between diagnoses and symptoms, which is actually your task). But without any hypothesis (for example a uniform distribution of the categories) you can't test anything.
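
            As a minimal sketch of what stating a hypothesis could look like, the exact binomial test below checks the LRT count from #5 (258 of 467) against a null proportion of 0.5. The value 0.5 is purely an illustrative assumption, not something taken from the data or from clinical knowledge; a real test needs a null value justified by the research question.

```stata
* Exact binomial test of 258 successes in 467 trials against p = 0.5.
* The null value 0.5 is an illustrative assumption only.
bitesti 467 258 0.5

* Keep the two-sided p-value for later use.
local p_two = r(p)
display "two-sided p-value: " %6.4f `p_two'
```

            Comparing LRT with another presentation in the same 467 patients would need patient-level data, because the two counts do not come from independent samples.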

            If CID represents the frequencies of the categories of PID (note, however, that two rows share the PIDcat code "AIE", and one row has the PIDcat code "Totalobs"), you can obtain a simple description of PIDcat by using CID as a frequency weight:
            Code:
            . tabulate PIDcat [fw=CID]
            
                    PIDcat |      Freq.     Percent        Cum.
            ---------------+-----------------------------------
                       AID |          6        1.14        1.14
                       AIE |          2        0.38        1.52
                        AP |          3        0.57        2.08
                        Ab |         46        8.71       10.80
                 Antiviral |         35        6.63       17.42
                     Atopy |         12        2.27       19.70
                   BCGosis |          3        0.57       20.27
                        BO |          1        0.19       20.45
                       CLD |          3        0.57       21.02
                       CMV |          5        0.95       21.97
                       CNS |         13        2.46       24.43
               Chrdiarrhea |         10        1.89       26.33
                       EBV |          1        0.19       26.52
                    FMGvHD |          1        0.19       26.70
                       FTT |         15        2.84       29.55
                        GI |         32        6.06       35.61
              GIinfections |          3        0.57       36.17
                       HLH |          1        0.19       36.36
                      HSCT |         27        5.11       41.48
                       IFI |          2        0.38       41.86
                        IS |          7        1.33       43.18
                      IVIG |         66       12.50       55.68
                        LN |          3        0.57       56.25
                       LRT |         45        8.52       64.77
                Meningitis |          5        0.95       65.72
                        OM |          3        0.57       66.29
                      PCjP |          3        0.57       66.86
                 Pneumonia |         28        5.30       72.16
                Septicemia |          5        0.95       73.11
                  Totalobs |         66       12.50       85.61    <-- ?
                antifungal |         27        5.11       90.72
            bronchiactasis |          3        0.57       91.29
                cytopenias |          4        0.76       92.05
                dermatitis |          5        0.95       92.99
              hepatomegaly |          2        0.38       93.37
              necFasciitis |          1        0.19       93.56
                  oralcand |          8        1.52       95.08
                oralulcers |          2        0.38       95.45
                 perianald |          3        0.57       96.02
               pneumonitis |         12        2.27       98.30
                   skinabs |          3        0.57       98.86
                skinulcers |          2        0.38       99.24
              splenomegaly |          4        0.76      100.00
            ---------------+-----------------------------------
                     Total |        528      100.00
            or you can install -fre- from SSC and create a table of weighted frequencies in descending order:
            Code:
            . cap which fre
            . if _rc ssc install fre // install fre if necessary
            
            . fre PIDcat [fw=CID], desc all
            
            PIDcat
            -----------------------------------------------------------------------
                                      |      Freq.    Percent      Valid       Cum.
            --------------------------+--------------------------------------------
            Valid   25 IVIG           |         66      12.50      12.50      12.50
                    38 Totalobs       |         66      12.50      12.50      25.00    <-- ?
                    5  Ab             |         46       8.71       8.71      33.71
                    27 LRT            |         45       8.52       8.52      42.23
                    6  Antiviral      |         35       6.63       6.63      48.86
                    19 GI             |         32       6.06       6.06      54.92
                    35 Pneumonia      |         28       5.30       5.30      60.23
                    22 HSCT           |         27       5.11       5.11      65.34
                    40 antifungal     |         27       5.11       5.11      70.45
                    18 FTT            |         15       2.84       2.84      73.30
                    14 CNS            |         13       2.46       2.46      75.76
                    7  Atopy          |         12       2.27       2.27      78.03
                    50 pneumonitis    |         12       2.27       2.27      80.30
                    15 Chrdiarrhea    |         10       1.89       1.89      82.20
                    47 oralcand       |          8       1.52       1.52      83.71
                    24 IS             |          7       1.33       1.33      85.04
                    1  AID            |          6       1.14       1.14      86.17
                    13 CMV            |          5       0.95       0.95      87.12
                    29 Meningitis     |          5       0.95       0.95      88.07
                    37 Septicemia     |          5       0.95       0.95      89.02
                    44 dermatitis     |          5       0.95       0.95      89.96
                    43 cytopenias     |          4       0.76       0.76      90.72
                    53 splenomegaly   |          4       0.76       0.76      91.48
                    4  AP             |          3       0.57       0.57      92.05
                    8  BCGosis        |          3       0.57       0.57      92.61
                    12 CLD            |          3       0.57       0.57      93.18
                    20 GIinfections   |          3       0.57       0.57      93.75
                    26 LN             |          3       0.57       0.57      94.32
                    30 OM             |          3       0.57       0.57      94.89
                    34 PCjP           |          3       0.57       0.57      95.45
                    42 bronchiactasis |          3       0.57       0.57      96.02
                    49 perianald      |          3       0.57       0.57      96.59
                    51 skinabs        |          3       0.57       0.57      97.16
                    2  AIE            |          2       0.38       0.38      97.54
                    23 IFI            |          2       0.38       0.38      97.92
                    45 hepatomegaly   |          2       0.38       0.38      98.30
                    48 oralulcers     |          2       0.38       0.38      98.67
                    52 skinulcers     |          2       0.38       0.38      99.05
                    9  BO             |          1       0.19       0.19      99.24
                    16 EBV            |          1       0.19       0.19      99.43
                    17 FMGvHD         |          1       0.19       0.19      99.62
                    21 HLH            |          1       0.19       0.19      99.81
                    46 necFasciitis   |          1       0.19       0.19     100.00
                    Total             |        528     100.00     100.00          
            -----------------------------------------------------------------------
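            The two rows marked "<-- ?" and the repeated AIE code could be handled along these lines. This is only a sketch on five rows copied from the listing in #3 (PIDcat, CID and Total only), and it assumes the Totalobs row holds column totals rather than a presentation; the variable name dup is made up for illustration.

```stata
* Five rows copied from the -dataex- listing in #3.
clear
input long PIDcat byte CID int Total
38 66 467
27 45 258
35 28 182
 2  1  18
 2  1  14
end
label define PIDcat 2 "AIE" 27 "LRT" 35 "Pneumonia" 38 "Totalobs"
label values PIDcat PIDcat

* Find codes that occur more than once (here the two AIE rows).
duplicates report PIDcat
duplicates tag PIDcat, generate(dup)
list PIDcat CID Total if dup > 0

* Leave out the Totalobs row (code 38) so it is not counted as a category.
tabulate PIDcat [fw=CID] if PIDcat != 38
```

            Which of the two AIE rows is mislabelled cannot be decided from the listing; that has to be checked against the original table.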
            If you can state your research question more clearly, perhaps others can give you better advice.
            Last edited by Dirk Enzmann; 09 Apr 2023, 17:26.

            • #7
              Dear Dirk,
              Thanks for your help.
              I will follow your advice and test my hypothesis.
              I appreciate it.
