  • Can I use (d)table, or must I go back to collect?

    For a single-group study, my clients and I are at the stage where we want to look at some descriptive statistics. Some of the variables have boundary issues, so the results table needs to show more than one result for certain variables. For example, using the auto dataset (sysuse auto), suppose we want the mean and SD of price under two conditions: (1) all 74 observations; (2) just the 69 observations that do not have a missing value for rep78. I cannot think of a way to do this with dtable (or with table), but I would be glad to be proven wrong. If I have to step back to the underlying collect system, how would I do that?

  • #2

    Here is an idea, based on your description.
    Code:
    sysuse auto
    
    * generate indicator for sample
    gen sample = !missing(rep78)
    * define custom label for sample sets of observations
    label define sample 1 "Complete"
    label values sample sample
    label var sample "Sample"
    
    * compute statistics
    dtable price, by(sample)
    
    * select which samples to show in the table
    collect style autolevels sample 1 .m, clear
    * replay table
    collect layout
    Here is the resulting table.
    Code:
    -------------------------------------------------
                             Sample
                 Complete               Total
    -------------------------------------------------
    N                69 (93.2%)           74 (100.0%)
    Price 6,146.043 (2,912.440) 6,165.257 (2,949.496)
    -------------------------------------------------



    • #3
      Jeff Pitblado (StataCorp) - thank you for this, which does what I asked and is ingenious! However, since more than one variable has this problem and the boundaries differ by variable, I think it might work better with separate rows. For example, say there was a second variable, mpg, where the "sample" for mpg would exclude those with mpg=12. Of course, since I would never have thought of what you did, maybe you have a way to extend that also, but note that there are more than 2 variables with this issue.



      • #4
        You will probably need several dtable calls, one per variable, and then combine the resulting collections. Of course, the sample sizes differ by variable per your description.

        Code:
        sysuse auto
        collect clear
        
        * generate indicator for sample
        gen sample = !missing(rep78)
        * define custom label for sample sets of observations
        label define sample 1 "Complete"
        label values sample sample
        label var sample "Sample"
        
        * compute statistics
        dtable price, by(sample) name(s1)
        
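        * redefine the sample indicator for mpg (exclude mpg == 12)
        replace sample = mpg != 12
        dtable mpg, by(sample) name(s2)
        
        * combine the two collections into a new collection named all
        collect combine all = s1 s2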
        
        * select which samples to show in the table
        collect style autolevels sample 1 .m, clear
        
        * replay table
        collect layout (var) (sample#result)
        Result:

        Code:
        . collect layout (var) (sample#result)
        
        Collection: all
              Rows: var
           Columns: sample#result
           Table 1: 2 x 2
        
        ---------------------------------------------------------
                                         Sample                  
                             Complete               Total        
        ---------------------------------------------------------
        Mileage (mpg)        21.556 (5.649)        21.297 (5.786)
        Price         6,146.043 (2,912.440) 6,165.257 (2,949.496)
        ---------------------------------------------------------
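
        Since there are more than two variables with this issue, the same pattern could probably be wrapped in a loop. The following is only a sketch, not something from the posts above: the local macros vars, cond1, cond2, names, and i are hypothetical names chosen for illustration, and each condN is assumed to hold the inclusion rule for the N-th variable in vars. It has not been run against real data, so treat it as a starting point.

        ```stata
        sysuse auto, clear
        collect clear

        * hypothetical inputs: one inclusion condition per variable, in order
        local vars  price mpg
        local cond1 !missing(rep78)
        local cond2 mpg != 12

        * sample indicator, relabeled once and refilled for each variable
        generate byte sample = .
        label define sample 1 "Complete"
        label values sample sample
        label variable sample "Sample"

        local i = 0
        local names
        foreach v of local vars {
            local ++i
            * apply the i-th condition to define this variable's sample
            replace sample = (`cond`i'')
            * one collection per variable: s1, s2, ...
            dtable `v', by(sample) name(s`i')
            local names `names' s`i'
        }

        * combine all per-variable collections and show Complete and Total only
        collect combine all = `names'
        collect style autolevels sample 1 .m, clear
        collect layout (var) (sample#result)
        ```

        Adding a variable would then mean appending it to vars and defining the matching condN macro; the rest of the code should not need to change.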



        • #5
          Andrew Musau - thanks, once I have the data clean I will try this
