Generate dataset w/ averages, upper, and lower variation

Dakota McAvoy

Join Date: Aug 2019

Posts: 38
#1

11 Aug 2022, 15:24

Say I want to estimate the prevalence of something across multiple weeks, and then also create "upper" and "lower" bounds by adding the SE to, and subtracting it from, that average. I have the start of two solutions, but neither one is complete.

Option 1: collapse

Code:

collapse (mean) mean_score = score, by(week)

This gets me the mean test score, but not an SE.

Option 2: margins

Code:

eststo margin: margins, over(week) post

And then somehow extract the -Margin- and SE without outputting them via -estout- (or your package of choice). I'd want to keep them in Stata, add and subtract the SE to create new upper and lower bound variables, and then export the margin, lower, and upper variables.

Or perhaps there's some entirely easier way to do all this.

Does this make any sense? Any thoughts on the most efficient way to do this?
Daniel Schaefer

Join Date: Mar 2020

Posts: 811
#2

11 Aug 2022, 15:28

You might take advantage of data frames for this. Check out example 4 here: https://www.stata.com/features/overv...ets-in-memory/
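As a rough sketch of the frames idea (not the linked example itself): copy the data into a second frame and collapse there, so the original observations stay in memory. This assumes Stata 16 or later and that the working frame is named default; the frame name weekly and the toy data are made up for illustration.

```stata
* Toy data for illustration only: 4 weeks, 50 scores each
clear
set seed 2022
set obs 200
generate week = ceil(_n/50)
generate score = rnormal(70, 10)

* Copy the data to a new frame (hypothetical name: weekly) and summarize there
frame copy default weekly, replace
frame weekly {
    collapse (mean) mean_score = score (semean) se_score = score, by(week)
    generate lower = mean_score - se_score
    generate upper = mean_score + se_score
    list
}
* The original 200 observations are still in the default frame
```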
Nick Cox

Join Date: Mar 2014

Posts: 35466
#3

11 Aug 2022, 15:29

collapse calculates various standard errors (choose which suits your case). So collapse mean and se at once, and then calculate mean + se or whatever else.

A related approach is to use statsby and ci to generate a dataset of confidence intervals.
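A minimal sketch of the statsby and ci route, assuming variables named score and week as in #1 (the toy data and the new variable names are invented for illustration). Note that r(lb) and r(ub) are confidence limits at the default 95% level, not mean plus or minus one SE.

```stata
* Toy data for illustration only: 4 weeks, 50 scores each
clear
set seed 2022
set obs 200
generate week = ceil(_n/50)
generate score = rnormal(70, 10)

* One row per week with mean, SE, and confidence limits
* (ci means is the Stata 14.1+ syntax; older versions use plain ci)
statsby mean_score=r(mean) se_score=r(se) lb=r(lb) ub=r(ub), by(week) clear: ci means score
list
```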
Dakota McAvoy

Join Date: Aug 2019

Posts: 38
#4

11 Aug 2022, 15:34

Thanks both. For anyone checking this in the future, the appropriate code is just

Code:

collapse (mean) mean_score = score (semean) se_score = score, by(week)
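To finish the job described in #1, the bounds and the export could be added along these lines. This is only a sketch: the toy data, the variable names lower and upper, and the file name weekly_scores.csv are placeholders.

```stata
* Toy data for illustration only: 4 weeks, 50 scores each
clear
set seed 2022
set obs 200
generate week = ceil(_n/50)
generate score = rnormal(70, 10)

* Mean and standard error of the mean, one row per week
collapse (mean) mean_score = score (semean) se_score = score, by(week)

* Bounds at mean minus/plus one SE
generate lower = mean_score - se_score
generate upper = mean_score + se_score

* Export only the variables of interest (hypothetical file name)
export delimited week mean_score lower upper using "weekly_scores.csv", replace
```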

Last edited by Dakota McAvoy; 11 Aug 2022, 15:36.
Dakota McAvoy

Join Date: Aug 2019

Posts: 38
#5

11 Aug 2022, 15:40

Though I'll add, if anyone has a lead on how to essentially extract the exact output that -margins- provides and create a new dataframe with that, I'd be very curious to learn.
Clyde Schechter

Join Date: Apr 2014

Posts: 29966
#6

11 Aug 2022, 16:07

If you run -margins- with the -post- option, it will leave behind a matrix, -r(table)-, in r(). You can pick that matrix up as a real matrix and then use -svmat- to turn that into a Stata data set.

The main difficulty is that if your regression uses any factor variable notation (and if you are using -margins- it usually will), the column names of r(table) are not legal variable names. So the variable names you get from -svmat- are not very informative: M1, M2, etc. You will then need to think about how you want to encode the information of the matrix column names in legal Stata variable names and write some code to do the needed renaming. Sometimes that gets ugly.
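One possible way around the renaming problem, offered as a sketch rather than a general solution: transpose r(table) first, so that the statistics (b, se, ll, ul, and so on) become the columns and the margins become the rows. The statistic names are legal variable names, and the margin labels can be kept in a string variable. The regression, the toy data, and the variable name margin are assumptions for illustration; note that -clear- replaces the data in memory.

```stata
* Toy data for illustration only: 4 weeks, 50 scores each
clear
set seed 2022
set obs 200
generate week = ceil(_n/50)
generate score = rnormal(70, 10)

regress score i.week
margins week

* Transpose so each margin is a row and each statistic is a column
matrix M = r(table)'
local labels : rownames M      // margin labels, something like 1.week 2.week ...

* Replace the data in memory with the matrix contents
clear
svmat double M, names(col)     // variables named after the statistics: b, se, ...

* Keep the margin labels alongside the numbers
generate str32 margin = ""
forvalues i = 1/`=_N' {
    replace margin = word("`labels'", `i') in `i'
}

generate lower = b - se
generate upper = b + se
list margin b se lower upper
```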
Dakota McAvoy

Join Date: Aug 2019

Posts: 38
#7

11 Aug 2022, 16:27

Thanks Clyde.