dtable raw frequency + weighted proportion (95% CI) by wave

Markus Stratmann

Join Date: May 2025

Posts: 8
#1

dtable raw frequency + weighted proportion (95% CI) by wave

20 May 2025, 07:15

Hi!
I'm new to Stata and working with longitudinal data for the first time - so maybe I am missing something.
I have data from participants from up to 6 waves on their depression status (dep). With this data I classified them further as new depression cases (dep_inc) or persistent depression cases (dep_per) in the susequent waves (if data on two subsequent waves was available).

My primary goal is to get one table that gives me the raw frequency, and weighted proportion/prevalence including CI by waves.
My secondary goal is to get this for all variables of interest in the same table. But they need different weighting. For dep I have cross-sectional weights and for dep_inc and dep_per I use longitudinal weights. Further, data from the last two variables is only available for waves 2 to 6 as in the first wave no incidence or persistence could be assessed.

I am unable to provide you with my data or parts of my data. Therefore I'm providing the closest data I could find online.

Code:

// available dataset use https://www.stata-press.com/data/r19/nlswork.dta // add weighting variable (imagine cross-sectiona) set seed 18 gen double sampw = runiformint(10,50) + runiform() // add second weighting variable (imagine longitudinal) gen double sampw2 = runiformint(10,50) + runiform() svyset [pw=sampw2] // table for raw frequency and weighted proportion dtable i.c_city [pweight=sampw], by (year) // table for weighted proportion including confidence interval proportion i.c_city [pweight=sampw], over(year) level(95) // equivalent tables using the second weight dtable i.nev_mar i.msp [pweight=sampw2], by (year) proportion i.nev_mar i.msp [pweight=sampw2], over(year) level(95)

Is there a way to include all tables in the same table?
I'm using Stata 19.

Best,
Markus
Tags: categorical, confidence, panel data, table, weights

Andrew Musau

Join Date: Oct 2014
Posts: 10168

20 May 2025, 10:54

Thanks for the reproducible example. You can loop using collect get to retrieve the results from proportion, and then add them to the dtable collections. The results can be combined using collect combine.

Code:

// available dataset
use https://www.stata-press.com/data/r19/nlswork.dta, clear 
collect clear
keep if year<=70
// add weighting variable (imagine cross-sectiona)
set seed 18
gen double sampw = runiformint(10,50) + runiform()
// add second weighting variable (imagine longitudinal)
gen double sampw2 = runiformint(10,50) + runiform()
svyset [pw=sampw2]

// table for raw frequency and weighted proportion
dtable i.c_city [pweight=sampw], by(year) name(DTable1)

qui levelsof c_city, local(levs) 
qui levelsof year, local(years)
local i 0
foreach l of local levs{
    foreach y of local years{
        local ++i
        qui proportion i.c_city [pweight=sampw], over(year) level(95)
        collect get _r_lb= r(table)["ll", `i'], tags(year[`y'] var[`l'.c_city])
        collect get _r_ub= r(table)["ul", `i'], tags(year[`y'] var[`l'.c_city])
    }
}
collect composite define _r_ci = _r_lb _r_ub, trim
collect style cell result[_r_ci], sformat("[%s]")
collect style autolevels result fvfrequency fvpercent _r_ci, clear
collect layout

dtable i.nev_mar [pweight=sampw2], by(year) name(DTable2)
qui levelsof nev_mar, local(levs) 
qui levelsof year, local(years)
local i 0
foreach l of local levs{
    foreach y of local years{
        local ++i
        qui proportion i.nev_mar[pweight=sampw2], over(year) level(95)
        collect get _r_lb= r(table)["ll", `i'], tags(year[`y'] var[`l'.nev_mar])
        collect get _r_ub= r(table)["ul", `i'], tags(year[`y'] var[`l'.nev_mar])
    }
}
collect composite define _r_ci = _r_lb _r_ub, trim
collect style cell result[_r_ci], sformat("[%s]")
collect style autolevels result fvfrequency fvpercent _r_ci, clear
collect layout

collect combine all= DTable1 DTable2
collect layout (var) (year#result)

Res.:

Code:

. collect layout

Collection: DTable1
      Rows: var
   Columns: year#result
   Table 1: 3 x 11

-----------------------------------------------------------------------------------------------------------------------
                                                              Interview year                                           
                               68                           69                           70                   Total    
-----------------------------------------------------------------------------------------------------------------------
1 if central city                                                                                                      
  0               24,478 (57.6%) [0.548 0.604] 22,196 (57.6%) [0.546 0.605] 29,569 (58.2%) [0.557 0.607] 76,243 (57.8%)
  1               18,018 (42.4%) [0.396 0.452] 16,327 (42.4%) [0.395 0.454] 21,224 (41.8%) [0.393 0.443] 55,569 (42.2%)
-----------------------------------------------------------------------------------------------------------------------


. collect layout

Collection: DTable2
      Rows: var
   Columns: year#result
   Table 1: 3 x 11

------------------------------------------------------------------------------------------------------------------------
                                                               Interview year                                           
                                68                           69                           70                   Total    
------------------------------------------------------------------------------------------------------------------------
1 if never married                                                                                                      
  0                22,088 (54.2%) [0.513 0.570] 21,535 (56.3%) [0.533 0.592] 31,384 (61.2%) [0.586 0.636] 75,008 (57.5%)
  1                18,699 (45.8%) [0.430 0.487] 16,749 (43.7%) [0.408 0.467] 19,938 (38.8%) [0.364 0.414] 55,386 (42.5%)
------------------------------------------------------------------------------------------------------------------------

. 
. collect combine all= DTable1 DTable2
(current collection is all)

. collect layout (var) (year#result)

Collection: all
      Rows: var
   Columns: year#result
   Table 1: 6 x 11

------------------------------------------------------------------------------------------------------------------------
                                                               Interview year                                           
                                68                           69                           70                   Total    
------------------------------------------------------------------------------------------------------------------------
1 if central city                                                                                                       
  0                24,478 (57.6%) [0.548 0.604] 22,196 (57.6%) [0.546 0.605] 29,569 (58.2%) [0.557 0.607] 76,243 (57.8%)
  1                18,018 (42.4%) [0.396 0.452] 16,327 (42.4%) [0.395 0.454] 21,224 (41.8%) [0.393 0.443] 55,569 (42.2%)
1 if never married                                                                                                      
  0                22,088 (54.2%) [0.513 0.570] 21,535 (56.3%) [0.533 0.592] 31,384 (61.2%) [0.586 0.636] 75,008 (57.5%)
  1                18,699 (45.8%) [0.430 0.487] 16,749 (43.7%) [0.408 0.467] 19,938 (38.8%) [0.364 0.414] 55,386 (42.5%)
------------------------------------------------------------------------------------------------------------------------

Comment

Markus Stratmann

Join Date: May 2025

Posts: 8
#3

21 May 2025, 02:09

Thank you so much, Andrew! This is helping me a lot!

I just have one more smaller issue: for varialbes that are only available for later waves I receive this error: conformability error r(503);
My guess is that this is because I only have data for waves 2 to 6. The dtable command is giving me information on the first wave nonetheless, proportion is ommiting wave 1. Therefore the tables do not have the same number of collumns.
Is there a more elegant way of informing 'proportions' to give me data on the first wave than just creating a variable that also contains zeros for wave 1 instead of missing values?
missing option in proportion command - Statalist
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10168

21 May 2025, 03:46

My looping strategy does not apply listwise deletion of observations with missing values. In such cases, directly referencing the column names of the r(table) matrix combined with the capture command can help overcome this limitation.

Code:

// available dataset
use https://www.stata-press.com/data/r19/nlswork.dta, clear
collect clear
keep if year<=70
replace nev_mar=. if year==68
// add weighting variable (imagine cross-sectiona)
set seed 18
gen double sampw = runiformint(10,50) + runiform()
// add second weighting variable (imagine longitudinal)
gen double sampw2 = runiformint(10,50) + runiform()
svyset [pw=sampw2]

local vars c_city nev_mar msp
foreach var of local vars{
    // table for raw frequency and weighted proportion
    local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
    dtable i.`var' [pweight=`wvar'], by(year, nototal) name(`var')

    qui levelsof `var', local(levs)
    qui levelsof year, local(years)
    foreach l of local levs{
        foreach y of local years{
            qui proportion i.`var' [pweight=`wvar'], over(year) percent level(95)
            cap collect get _r_lb= r(table)["ll", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var'])
            cap collect get _r_ub= r(table)["ul", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var'])
        }
    }

    collect style cell result[fvpercent _r_lb _r_ub], nformat(%3.1f)
    collect composite define _r_ci = _r_lb _r_ub, trim
    collect style cell result[_r_lb _r_ub], sformat("%s%%")
    collect style cell result[_r_ci], sformat("[%s]")
    collect style autolevels result fvfrequency fvpercent _r_ci, clear
    collect layout
}
collect combine all= `vars'
collect layout (var) (year#result)

Res.:

Code:

Collection: c_city
      Rows: var
   Columns: year#result
   Table 1: 3 x 9

--------------------------------------------------------------------------------------------------------
                                                      Interview year                                    
                               68                           69                           70            
--------------------------------------------------------------------------------------------------------
1 if central city                                                                                      
  0               24,478 (57.6%) [54.8% 60.4%] 22,196 (57.6%) [54.6% 60.5%] 29,569 (58.2%) [55.7% 60.7%]
  1               18,018 (42.4%) [39.6% 45.2%] 16,327 (42.4%) [39.5% 45.4%] 21,224 (41.8%) [39.3% 44.3%]
--------------------------------------------------------------------------------------------------------


Collection: nev_mar
      Rows: var
   Columns: year#result
   Table 1: 3 x 8

-----------------------------------------------------------------------------------
                                            Interview year                        
                     68                69                           70            
-----------------------------------------------------------------------------------
1 if never married                                                                
  0                0 (.%) 21,535 (56.3%) [53.3% 59.2%] 31,384 (61.2%) [58.6% 63.6%]
  1                0 (.%) 16,749 (43.7%) [40.8% 46.7%] 19,938 (38.8%) [36.4% 41.4%]
-----------------------------------------------------------------------------------


Collection: msp
      Rows: var
   Columns: year#result
   Table 1: 3 x 9

-------------------------------------------------------------------------------------------------------------------
                                                                 Interview year                                    
                                          68                           69                           70            
-------------------------------------------------------------------------------------------------------------------
1 if married, spouse present                                                                                      
  0                          22,015 (54.0%) [51.1% 56.8%] 20,506 (53.6%) [50.6% 56.5%] 24,835 (48.4%) [45.8% 51.0%]
  1                          18,772 (46.0%) [43.2% 48.9%] 17,778 (46.4%) [43.5% 49.4%] 26,487 (51.6%) [49.0% 54.2%]
-------------------------------------------------------------------------------------------------------------------

. collect combine all= `vars'
(current collection is all)

. collect layout (var) (year#result)

Collection: all
      Rows: var
   Columns: year#result
   Table 1: 9 x 9

-------------------------------------------------------------------------------------------------------------------
                                                                 Interview year                                    
                                          68                           69                           70            
-------------------------------------------------------------------------------------------------------------------
1 if central city                                                                                                  
  0                          24,478 (57.6%) [54.8% 60.4%] 22,196 (57.6%) [54.6% 60.5%] 29,569 (58.2%) [55.7% 60.7%]
  1                          18,018 (42.4%) [39.6% 45.2%] 16,327 (42.4%) [39.5% 45.4%] 21,224 (41.8%) [39.3% 44.3%]
1 if married, spouse present                                                                                      
  0                          22,015 (54.0%) [51.1% 56.8%] 20,506 (53.6%) [50.6% 56.5%] 24,835 (48.4%) [45.8% 51.0%]
  1                          18,772 (46.0%) [43.2% 48.9%] 17,778 (46.4%) [43.5% 49.4%] 26,487 (51.6%) [49.0% 54.2%]
1 if never married                                                                                                
  0                               0    (.%)               21,535 (56.3%) [53.3% 59.2%] 31,384 (61.2%) [58.6% 63.6%]
  1                               0    (.%)               16,749 (43.7%) [40.8% 46.7%] 19,938 (38.8%) [36.4% 41.4%]

Last edited by Andrew Musau; 21 May 2025, 04:30.

Comment

Andrew Musau

Join Date: Oct 2014

Posts: 10168
#5

21 May 2025, 03:49

Of course, with multiple weight variables, you need to add extra conditions. See https://journals.sagepub.com/doi/pdf...867X0500500310 for more on the cond() function if you're not familiar with it.
2 likes
Comment

Markus Stratmann

Join Date: May 2025
Posts: 8

22 May 2025, 08:55

Thank you again!
I have noticed a discrepancy in the Total N per wave (weighted vs. unweighted). Currently, dtable automatically outputs the weighted frequency not the raw one frequency together with the weighted proportion,
My solution would be to calculate dtable unweighted for the raw frequency and pull the weighted proportion (just like the confidence intervals) from the proportions command.
Unfortunately, I am not getting along very well with the help pages to draw the correct values. I tried several commands but couldn't get it running (see code below).
Do you know what I need to adjust for pulling the weighted proportion?

For my final table I would like to have all values in one cell as: weighted percent [l_b-u_b%] (N=rough frequency): e.g., 46.0% [43.2-48.9%] (N=18,772)
Is this possible to set this up using the collect commands?

Code:

clear all

// available dataset
use https://www.stata-press.com/data/r19/nlswork.dta, clear
collect clear
keep if year<=70
replace nev_mar=. if year==68
// add weighting variable (imagine cross-sectiona)
set seed 18
gen double sampw = runiformint(10,50) + runiform()
// add second weighting variable (imagine longitudinal)
gen double sampw2 = runiformint(10,50) + runiform()
svyset [pw=sampw2]

local vars c_city nev_mar msp
foreach var of local vars{
    // table for raw frequency and weighted proportion
    local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
    dtable i.`var', by(year, nototal) name(`var')

    qui levelsof `var', local(levs)
    qui levelsof year, local(years)
    foreach l of local levs{
        foreach y of local years{
            qui proportion i.`var' [pweight=`wvar'], over(year) percent level(95) 
            //cap collect get propa=e(cmd)["_r_b", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var']) 
            cap collect get _r_lb= r(table)["ll", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var'])
            cap collect get _r_ub= r(table)["ul", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var'])
        }
    }

    collect style cell result[_r_lb _r_ub], nformat(%3.1f)
    collect composite define _r_ci = _r_lb _r_ub, delimiter("-") trim
    collect style cell result[_r_lb _r_ub], sformat("%s%%")
    collect style cell result[_r_ci], sformat("[%s]")
    collect style autolevels result propa _r_ci fvfrequency, clear
    collect layout
}
collect combine all= `vars'
collect layout (var) (year#result)

Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10168

22 May 2025, 15:39

The SEs are saved in the r(table) matrix as well, and can be extracted using:

Code:

cap collect get _r_se= r(table)["se", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var'])

Down the line:

Code:

collect style cell result[_r_lb _r_ub _r_se], sformat("%s%%")
collect style cell result[_r_se], sformat("(%s)")
collect style autolevels result _r_se _r_ci fvfrequency, clear

Comment

Markus Stratmann

Join Date: May 2025
Posts: 8

23 May 2025, 07:52

Thank you once again!
I was actually looking for the weighted proportion, not SEs, but I've found it saved as "b" - and it's running now smoothly! Thank you, it's saving a lot of time!

I was wondering about two more smaller things:

1. Is it possible to explicitly tell the collect command in which order i would like to arrange my rows. At the moment it is ordering them alphabetically by variable name. In the example dataset this is 'msp' then 'nev_mar'. Just putting the order upside down wouldn't work for my more complex dataset with more variables. Is there a way to explicitly specify the order?
I've read in this forum one possible solution but it seems to not work for me.

Code:

collect layout (var[nev_mar msp]) (race#result)

2. In the upper examples the collumn for Total was supressed. I would like to include it but noticed that the proportion command does not deliver total estimates when using 'over()'. My idea would be to use proportion once with 'over()' once without to extract the probability and confidence intervals for total as well as race. Do you have a suggestion of how I could include this into the loop to receive values for all variables?

Code:

clear all
// available dataset
use https://www.stata-press.com/data/r19/nlswork.dta, clear
collect clear
keep if year<=70
replace nev_mar=. if year==68
// add weighting variable (imagine cross-sectiona)
set seed 18
gen double sampw = runiformint(10,50) + runiform()
// add second weighting variable (imagine longitudinal)
gen double sampw2 = runiformint(10,50) + runiform()
svyset [pw=sampw2]

local vars nev_mar msp
foreach var of local vars{
    // table for raw frequency and weighted proportion
    local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
    dtable i.`var' if year == 69, by(race) name(`var')

    qui levelsof `var', local(levs)
    qui levelsof race, local(races)
    foreach l of local levs{
        foreach y of local races{
            qui proportion i.`var' [pweight=`wvar'] if year == 69, over(race) percent level(95) 
            cap collect get _r_b= r(table)["b", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var']) 
            cap collect get _r_lb= r(table)["ll", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var'])
            cap collect get _r_ub= r(table)["ul", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var'])
        }
    }

    collect style cell result[_r_b _r_lb _r_ub], nformat(%3.1f)
    collect composite define _r_ci = _r_lb _r_ub, delimiter("-") trim
    collect style cell result[_r_lb _r_ub], sformat("%s%%")
    collect style cell result[_r_b], sformat("%s%%")
    collect style cell result[_r_ci], sformat("[%s]")
    collect composite define _r_prop = _r_b _r_ci, delimiter(" ") trim
    collect style autolevels result fvfrequency _r_prop, clear
    collect layout
}
collect combine all= `vars'
collect layout (var) (race#result)

proportion i.nev_mar [pweight=sampw] if year == 69, percent level(95)

Comment

Markus Stratmann

Join Date: May 2025

Posts: 8
#9

23 May 2025, 07:59

For the first question about the ordering I found one solution of not using the []. Then the table unfortunately supresses the variables name. Is there a way to have the right order and the name?

Code:

collect layout (nev_mar msp) (race#result)
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10168
#10

23 May 2025, 09:34

For ordering the levels of the categorical variables, you use collect style autolevels. Something like:

Code:

collect combine all= `vars' collect style autolevels var 0.msp 1.msp 0.c_city 1.c_city 0.nev_mar 1.nev_mar, clear collect layout (var) (race#result)

I don't really understand the second question. The observation counts differ across the categorical variables. Maybe use an example to manually show what you want.
1 like
Comment

Markus Stratmann

Join Date: May 2025
Posts: 8

#11

26 May 2025, 02:13

Thank you! Yes, of course.

At the moment I'm getting a table containing the fequency and proportion of nev_mar and msp by race.
I would like to have this table but with a 4th collum containing the total frequency, proportion accordingly. For now, I have the total frequency from dtable already.
To get the total proportion and confidence intervals I added the proportion command in the last line.
I am now wondering how to include the last line in the loop and how to inform the collect command to use these information in the last collum "Total".

Code:

clear all
// available dataset

use https://www.stata-press.com/data/r19/nlswork.dta, clear
collect clear
keep if year<=70
replace nev_mar=. if year==68
// add weighting variable (imagine cross-sectiona)
set seed 18
gen double sampw = runiformint(10,50) + runiform()
// add second weighting variable (imagine longitudinal)
gen double sampw2 = runiformint(10,50) + runiform()
svyset [pw=sampw2]

local vars nev_mar msp
foreach var of local vars{
    // table for raw frequency and weighted proportion
    local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
    dtable i.`var' if year == 69, by(race) name(`var')

    qui levelsof `var', local(levs)
    qui levelsof race, local(races)
    foreach l of local levs{
        foreach y of local races{
            qui proportion i.`var' [pweight=`wvar'] if year == 69, over(race) percent level(95) 
            cap collect get _r_b= r(table)["b", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var']) 
            cap collect get _r_lb= r(table)["ll", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var'])
            cap collect get _r_ub= r(table)["ul", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var'])
        }
    }

    collect style cell result[_r_b _r_lb _r_ub], nformat(%3.1f)
    collect composite define _r_ci = _r_lb _r_ub, delimiter("-") trim
    collect style cell result[_r_lb _r_ub], sformat("%s%%")
    collect style cell result[_r_b], sformat("%s%%")
    collect style cell result[_r_ci], sformat("[%s]")
    collect composite define _r_prop = _r_b _r_ci, delimiter(" ") trim
    collect style autolevels result fvfrequency _r_prop, clear
    collect layout
}
collect combine all= `vars'
collect style autolevels var 0.nev_mar 1.nev_mar 0.msp 1.msp
collect layout (var) (race#result)


proportion i.nev_mar [pweight=sampw] if year == 69, percent level(95)

Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10168

#12

26 May 2025, 15:09

"Total" has the level ".m" for the dimension race. See

Code:

collect levelsof race

Res.:

Code:

. collect levelsof race

Collection: all
 Dimension: race
    Levels:  1 2 3 .m

So you want to add:

Code:

foreach var in `vars'{
    qui levelsof `var', local(levs)
    foreach l of local levs{
        local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
        proportion i.`var' [pweight=`wvar'], percent level(95)
        cap collect get _r_b= r(table)["b", "`l'.`var'"], tags(race[.m] var[`l'.`var']) 
        cap collect get _r_lb= r(table)["ll","`l'.`var'"], tags(race[.m] var[`l'.`var'])
        cap collect get _r_ub= r(table)["ul", "`l'.`var'"], tags(race[.m] var[`l'.`var'])
    }
}

What is highlighted in blue becomes apparent after examining the column names of the r(table) matrix following the proportion command, e.g.,

Code:

proportion i.nev_mar, percent level(95)
mat l r(table)

Res.:

Code:

. mat l r(table)

r(table)[9,2]
                0.         1.
          nev_mar    nev_mar
     b  59.869774  40.130226
    se  .90739623  .90739623
     t  65.979747  44.225692
pvalue          0          0
    ll  58.078297  38.364735
    ul  61.635265  41.921703
    df       2917       2917
  crit  1.9607776  1.9607776
 eform          0          0

Comment

Markus Stratmann

Join Date: May 2025
Posts: 8

#13

27 May 2025, 02:05

Thank you! I now understand better how to inform the command to pull the correct information.

Code:

clear all
use https://www.stata-press.com/data/r19/nlswork.dta, clear
collect clear
keep if year<=70
replace nev_mar=. if year==68
// add weighting variable (imagine cross-sectiona)
set seed 18
gen double sampw = runiformint(10,50) + runiform()
// add second weighting variable (imagine longitudinal)
gen double sampw2 = runiformint(10,50) + runiform()
svyset [pw=sampw2]

local vars nev_mar msp
foreach var of local vars{
    // table for raw frequency and weighted proportion
    local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
    dtable i.`var' if year == 69, by(race) name(`var')

    qui levelsof `var', local(levs)
    qui levelsof race, local(races)
    foreach l of local levs{
        foreach y of local races{
            qui proportion i.`var' [pweight=`wvar'] if year == 69, over(race) percent level(95) 
            cap collect get _r_b= r(table)["b", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var']) 
            cap collect get _r_lb= r(table)["ll", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var'])
            cap collect get _r_ub= r(table)["ul", "`l'.`var'@`y'.race"], tags(race[`y'] var[`l'.`var'])
        }
    }
foreach var in `vars'{
    qui levelsof `var', local(levs)
    foreach l of local levs{
        local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
        proportion i.`var' [pweight=`wvar'], percent level(95)
        cap collect get _r_b= r(table)["b", "`l'.`var'"], tags(race[.m] var[`l'.`var']) 
        cap collect get _r_lb= r(table)["ll","`l'.`var'"], tags(race[.m] var[`l'.`var'])
        cap collect get _r_ub= r(table)["ul", "`l'.`var'"], tags(race[.m] var[`l'.`var'])
    }
}

    collect style cell result[_r_b _r_lb _r_ub], nformat(%3.1f)
    collect composite define _r_ci = _r_lb _r_ub, delimiter("-") trim
    collect style cell result[_r_lb _r_ub], sformat("%s%%")
    collect style cell result[_r_b], sformat("%s%%")
    collect style cell result[_r_ci], sformat("[%s]")
    collect composite define _r_prop = _r_b _r_ci, delimiter(" ") trim
    collect style autolevels result fvfrequency _r_prop, clear
    collect layout
}
collect combine all= `vars'
collect style autolevels var 0.nev_mar 1.nev_mar 0.msp 1.msp
collect layout (var) (race#result)

In my final table, the newly drawn proportions for Total do not appear. Am I missing something?

Code:

. collect combine all= `vars'
(current collection is all)

. collect style autolevels var 0.nev_mar 1.nev_mar 0.msp 1.msp

. collect layout (var) (race#result)

Collection: all
      Rows: var
   Columns: race#result
   Table 1: 6 x 7

--------------------------------------------------------------------------------------------------------
                                                                 Race                                   
                                      White                   Black                  Other         Total
--------------------------------------------------------------------------------------------------------
1 if never married                                                                                      
  0                          520 58.9% [55.3%-62.3%] 175 50.0% [44.3%-55.7%] 6 42.1% [17.7%-71.0%]   701
  1                          357 41.1% [37.7%-44.7%] 168 50.0% [44.3%-55.7%] 6 57.9% [29.0%-82.3%]   531
1 if married, spouse present                                                                            
  0                          431 49.3% [45.8%-52.9%] 220 64.3% [58.7%-69.5%] 6 57.9% [29.0%-82.3%]   657
  1                          446 50.7% [47.1%-54.2%] 123 35.7% [30.5%-41.3%] 6 42.1% [17.7%-71.0%]   575
--------------------------------------------------------------------------------------------------------

. 
end of do-file

I've noticed I can acess complete collections for msp and nev_mar seperatly but when combining something goes off. I also compared the values I pulled with the values I get when I use the commands individually and found discrepancies. How is this possible?

Code:

Collection: nev_mar
      Rows: var
   Columns: race#result
   Table 1: 6 x 8

--------------------------------------------------------------------------------------------------------------------------
                                                                          Race                                            
                                      White                   Black                  Other                  Total         
--------------------------------------------------------------------------------------------------------------------------
1 if never married                                                                                                        
  0                          520 58.9% [55.3%-62.3%] 175 50.0% [44.3%-55.7%] 6 42.1% [17.7%-71.0%] 701 59.1% [57.1%-61.0%]
  1                          357 41.1% [37.7%-44.7%] 168 50.0% [44.3%-55.7%] 6 57.9% [29.0%-82.3%] 531 40.9% [39.0%-42.9%]
1 if married, spouse present                                                                                              
  0                                                                                                    51.7% [50.0%-53.3%]
  1                                                                                                    48.3% [46.7%-50.0%]
--------------------------------------------------------------------------------------------------------------------------

Collection: msp
      Rows: var
   Columns: race#result
   Table 1: 6 x 8

--------------------------------------------------------------------------------------------------------------------------
                                                                          Race                                            
                                      White                   Black                  Other                  Total         
--------------------------------------------------------------------------------------------------------------------------
1 if married, spouse present                                                                                              
  0                          431 49.3% [45.8%-52.9%] 220 64.3% [58.7%-69.5%] 6 57.9% [29.0%-82.3%] 657 51.7% [50.0%-53.3%]
  1                          446 50.7% [47.1%-54.2%] 123 35.7% [30.5%-41.3%] 6 42.1% [17.7%-71.0%] 575 48.3% [46.7%-50.0%]
1 if never married                                                                                                        
  0                                                                                                    59.1% [57.1%-61.0%]
  1                                                                                                    40.9% [39.0%-42.9%]
--------------------------------------------------------------------------------------------------------------------------



. proportion i.nev_mar [pweight=sampw2] if year == 69, percent level(95)

Percent estimation                       Number of obs = 1,232

--------------------------------------------------------------
             |                                   Logit
             |    Percent   Std. err.     [95% conf. interval]
-------------+------------------------------------------------
     nev_mar |
          0  |      56.25       1.52         53.25       59.20
          1  |      43.75       1.52         40.80       46.75
--------------------------------------------------------------

. proportion i.msp [pweight=sampw] if year == 69, percent level(95)

Percent estimation                       Number of obs = 1,232

--------------------------------------------------------------
             |                                   Logit
             |    Percent   Std. err.     [95% conf. interval]
-------------+------------------------------------------------
         msp |
          0  |      52.69       1.52         49.70       55.66
          1  |      47.31       1.52         44.34       50.30
--------------------------------------------------------------

Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10168

#14

27 May 2025, 06:02

I do not know why you are specifying -if- conditions within the dtable to restrict computations by year when we are looping over years. So, instead of trying to figure out your code, I will revert to my example in #4. To add the CIs from the proportion command in the "Total" column, I would just add the following (highlighted) to the loop extracting statistics by year:

Code:

// available dataset
use https://www.stata-press.com/data/r19/nlswork.dta, clear
collect clear
keep if year<=70
replace nev_mar=. if year==68
// add weighting variable (imagine cross-sectiona)
set seed 18
gen double sampw = runiformint(10,50) + runiform()
// add second weighting variable (imagine longitudinal)
gen double sampw2 = runiformint(10,50) + runiform()
svyset [pw=sampw2]

local vars c_city nev_mar msp
foreach var of local vars{
    // table for raw frequency and weighted proportion
    local wvar= cond(inlist("`var'", "nev_mar", "msp"), "sampw2", "sampw")
    dtable i.`var' [pweight=`wvar'], by(year) name(`var')

    qui levelsof `var', local(levs)
    qui levelsof year, local(years)
    foreach l of local levs{
        foreach y of local years{
            qui proportion i.`var' [pweight=`wvar'], over(year) percent level(95)
            cap collect get _r_lb= r(table)["ll", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var'])
            cap collect get _r_ub= r(table)["ul", "`l'.`var'@`y'.year"], tags(year[`y'] var[`l'.`var'])
        }
        qui proportion i.`var' [pweight=`wvar'], percent level(95)
        cap collect get _r_lb= r(table)["ll", "`l'.`var'"], tags(year[.m] var[`l'.`var'])
        cap collect get _r_ub= r(table)["ul", "`l'.`var'"], tags(year[.m] var[`l'.`var'])
    }
    collect style cell result[fvpercent _r_lb _r_ub], nformat(%3.1f)
    collect composite define _r_ci = _r_lb _r_ub, trim
    collect style cell result[_r_lb _r_ub], sformat("%s%%")
    collect style cell result[_r_ci], sformat("[%s]")
    collect style autolevels result fvfrequency fvpercent _r_ci, clear
    collect layout
}
collect combine all= `vars'
collect layout (var) (year#result)

Res.:

Code:

. collect layout (var) (year#result)

Collection: all
      Rows: var
   Columns: year#result
   Table 1: 9 x 12

------------------------------------------------------------------------------------------------------------------------------------------------
                                                                                Interview year                                                  
                                          68                           69                           70                          Total           
------------------------------------------------------------------------------------------------------------------------------------------------
1 if central city                                                                                                                               
  0                          24,478 (57.6%) [54.8% 60.4%] 22,196 (57.6%) [54.6% 60.5%] 29,569 (58.2%) [55.7% 60.7%] 76,243 (57.8%) [56.2% 59.4%]
  1                          18,018 (42.4%) [39.6% 45.2%] 16,327 (42.4%) [39.5% 45.4%] 21,224 (41.8%) [39.3% 44.3%] 55,569 (42.2%) [40.6% 43.8%]
1 if married, spouse present                                                                                                                    
  0                          22,015 (54.0%) [51.1% 56.8%] 20,506 (53.6%) [50.6% 56.5%] 24,835 (48.4%) [45.8% 51.0%] 67,356 (51.7%) [50.0% 53.3%]
  1                          18,772 (46.0%) [43.2% 48.9%] 17,778 (46.4%) [43.5% 49.4%] 26,487 (51.6%) [49.0% 54.2%] 63,037 (48.3%) [46.7% 50.0%]
1 if never married                                                                                                                              
  0                               0    (.%)               21,535 (56.3%) [53.3% 59.2%] 31,384 (61.2%) [58.6% 63.6%] 52,919 (59.1%) [57.1% 61.0%]
  1                               0    (.%)               16,749 (43.7%) [40.8% 46.7%] 19,938 (38.8%) [36.4% 41.4%] 36,687 (40.9%) [39.0% 42.9%]
------------------------------------------------------------------------------------------------------------------------------------------------

Comment

Markus Stratmann

Join Date: May 2025

Posts: 8
#15

27 May 2025, 07:03

Thank you so much!
Comment

Announcement