How can I display the number of non-missing observations for factor variables using dtable and collect?

Suzanna Vidmar

29 Jun 2023, 23:04

I would like to be able to display the number of non-missing values for each combination of factor variable and group allocation in the body of a descriptive table. I'm most of the way to creating the table I'm after and need help with the last steps. Alternatively, there may be a better, maybe even obvious, approach that I haven't seen.

First, I create a dataset that contains some missing data.

Code:

*Reading in the dataset.
use https://www.stata-press.com/data/r18/exercise.dta,clear
keep if inlist(day, 0, 4, 6, 8)

*Replacing the variable program with group_allocation, where 1=intervention and 2=control.
gen group_allocation=program
lab var group_allocation "Group allocation"
lab def grplbl 1 Intervention 2 Control
lab val group_allocation grplbl
drop program

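*Creating the variable blinding: 2 by default, 3 for every third observation,
*and 1 for every fourth observation within each day other than day 0.
gen blinding=2
replace blinding=3 if mod(_n,3)==0
bysort day: replace blinding=1 if mod(_n,4)==0 & day!=0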

lab def blindlbl 1 "Change or lack of change from baseline" ///
                 2  "Participant informed assessor" ///
                 3  "Guess"
lab val blinding blindlbl

*Setting a few values of the variable blinding to missing.
replace blinding=. if inlist(id,1,2,18)

The tables below show that, at day 0, the overall n is 16 in the Intervention group and 21 in the Control group. Counting only the observations where blinding is not missing, n is 14 and 20, respectively.
Note that I'm restricting to day==0 for this example, but ultimately I will be doing this for multiple time points (and also multiple variables).

Code:

. tabulate group_allocation if day==0

       Group |
  allocation |      Freq.     Percent        Cum.
-------------+-----------------------------------
Intervention |         16       43.24       43.24
     Control |         21       56.76      100.00
-------------+-----------------------------------
       Total |         37      100.00

. tabulate blinding group_allocation if day==0

                      |   Group allocation
             blinding | Intervent    Control |     Total
----------------------+----------------------+----------
Participant informed  |         9         14 |        23 
                Guess |         5          6 |        11 
----------------------+----------------------+----------
                Total |        14         20 |        34
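
As a cross-check on the non-missing counts, the same numbers can be requested directly. This is a minimal sketch that uses only the variables created above; it should reproduce the 14 and 20 shown in the Total row of the cross-tabulation.

```stata
* Number of nonmissing values of blinding by group at day 0
* (the n statistic ignores missing values).
tabstat blinding if day==0, by(group_allocation) statistics(n)

* The same counts, one group at a time.
count if day==0 & group_allocation==1 & !missing(blinding)
count if day==0 & group_allocation==2 & !missing(blinding)
```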

There doesn't seem to be an easy way to count the number of non-missing observations for factor variables, but I can use the count statistic for continuous variables. Here I'm generating a copy of my variable "blinding" that I will treat as continuous.

Code:

*Generating a copy of the variable blinding, which will be treated as continuous
*in order to count the number of nonmissing values.
gen blinding_c=blinding

Here's my first attempt and the resulting table. The table is close to what I want, but I would like to indent the categories of my factor variable.

Code:

*Clearing any collections that may be in Stata's memory.
collect clear

*Creating a separate descriptive table for each level of time.
*For day=0, including the sample frequency in the column header.    
dtable blinding_c i.blinding if day==0, nolistwise ///
    by(group_allocation,nototal missing) ///
    sample(, statistic(frequency) place(seplabels)) ///
    sformat("n=%s" frequency) ///
    continuous(, statistic(count)) ///
    title(Table 5: Blinding of assessor) ///
    titlestyles(font(Calibri, size(12) bold)) ///
    name(Day0)
    
collect style cell result[count], sformat((n=%s))
collect style header blinding, title(hide)
collect label levels var blinding_c "Blinding",modify
collect style cell result, halign(center)
collect preview

Table 5: Blinding of assessor
-----------------------------------------------------
                                  Group allocation   
                              Intervention   Control 
                                  n=16        n=21   
-----------------------------------------------------
Blinding                         (n=14)      (n=20)  
Participant informed assessor   9 (64.3%)  14 (70.0%)
Guess                           5 (35.7%)   6 (30.0%)
-----------------------------------------------------
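
For the remaining time points, the same call could probably be wrapped in a loop. The following is only a sketch under assumptions: the collection names Day4, Day6 and Day8 are hypothetical and simply extend the Day0 naming, the column-header sample frequency is left out, and the collect style commands above would still need to be applied to each new collection.

```stata
* One descriptive table per remaining time point, each stored in its own
* collection (Day4, Day6, Day8 are assumed names following Day0).
foreach d of numlist 4 6 8 {
    dtable blinding_c i.blinding if day==`d', nolistwise ///
        by(group_allocation, nototal missing) ///
        continuous(, statistic(count)) ///
        title(Table 5: Blinding of assessor) ///
        name(Day`d')
}

* Listing the collections in memory, then making Day0 current again.
collect dir
collect set Day0
```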

Here's my second attempt. The categories are now indented, but there is an extra row that I don't want, and the (n=#) cells are on the wrong row: I want them on the vargrp header row.

Code:

collect addtags vargrp[Blinding], fortags(var[blinding_c blinding])
collect layout (vargrp#var) (group_allocation#result)

Table 5: Blinding of assessor
-------------------------------------------------------
                                    Group allocation   
                                Intervention   Control 
-------------------------------------------------------
Blinding                                               
  Blinding                         (n=14)      (n=20)  
  Participant informed assessor   9 (64.3%)  14 (70.0%)
  Guess                           5 (35.7%)   6 (30.0%)
-------------------------------------------------------

How can I create a table with this structure?

Code:

Table 5: Blinding of assessor
-------------------------------------------------------
                                    Group allocation   
                                Intervention   Control 
-------------------------------------------------------
Blinding                           (n=14)      (n=20)  
  Participant informed assessor   9 (64.3%)  14 (70.0%)
  Guess                           5 (35.7%)   6 (30.0%)
-------------------------------------------------------

Kind regards,
Suzanna
