Hi all,
I have a large census data set (12 million observations) that I want to generate summary statistics for. I have a range of socioeconomic characteristics that I want to have generated but I can't seem to get for loops to work in order to run through all observations. Essentially, I want to have a table that generates 4 columns (but the code below only shows for three races, White,Black and Chinese) that has all the characteristics I want as new rows.
Below is an excerpt of my code with just a few of the socioeconomic characteristics I want to generate (age, emp_dummy(percentage of a specific race that is employed), foodstmp_dummy(percentage on food stamps), hsorless (those with a high school degree or less).
Race=1 for white, Race==4 for Chinese, Race=2 for Black
Code I have tried so far that hasn't worked (no error, but no .tex file is generated):
If I try to run using the "brute force" method, where I run each variable separately for each race category the code works and looks like:
One .tex file is generated at the end of running all of this, but I can't seem to get one generated if I try to use loops instead of brute force method.
Thanks for any help you might be able to give!
I have a large census data set (12 million observations) that I want to generate summary statistics for. I have a range of socioeconomic characteristics that I want to have generated but I can't seem to get for loops to work in order to run through all observations. Essentially, I want to have a table that generates 4 columns (but the code below only shows for three races, White,Black and Chinese) that has all the characteristics I want as new rows.
Below is an excerpt of my code with just a few of the socioeconomic characteristics I want to generate (age, emp_dummy(percentage of a specific race that is employed), foodstmp_dummy(percentage on food stamps), hsorless (those with a high school degree or less).
Race=1 for white, Race==4 for Chinese, Race=2 for Black
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input byte(region race) int age float(emp_dummy foodstmp_dummy hsorless) 32 1 49 1 1 1 32 1 59 0 1 0 32 1 37 0 1 0 32 1 15 . 1 . 32 1 14 . 1 . 32 1 34 0 1 1 32 1 51 1 0 1 32 1 53 1 0 1 32 1 51 1 0 1 32 1 48 1 0 0 32 1 49 1 1 1 32 1 38 1 1 1 32 1 21 . 1 . 32 1 18 . 1 . 32 1 4 . 1 . 32 1 3 . 1 . 32 1 2 . 1 . 32 1 0 . 1 . 32 1 40 1 0 0 32 4 35 0 0 0 32 1 10 . 0 . 32 4 2 . 0 . 32 1 66 . 0 . 32 1 28 0 0 0 32 2 80 . 0 . 32 2 48 1 1 1 32 2 15 . 1 . 32 2 19 . 1 . 32 2 19 . 0 . 32 1 67 . 0 . 32 1 67 . 0 . 32 1 43 1 0 1 32 1 41 1 0 0 32 1 77 . 0 . 32 1 55 1 0 0 32 1 45 0 0 0 32 2 56 0 0 1 32 1 73 . 0 . 32 1 78 . 0 . 32 1 28 1 0 1 32 1 24 . 0 . 32 1 0 . 0 . 32 1 5 . 0 . 32 1 32 1 0 0 32 1 31 1 0 0 32 1 0 . 0 . 32 1 0 . 0 . 32 1 28 1 0 0 32 1 26 1 0 0 32 1 28 1 0 1 32 1 0 . 0 . 32 1 29 1 0 0 32 1 30 1 0 0 32 1 34 1 0 1 32 1 77 . 0 . 32 1 46 1 0 1 32 1 51 1 0 1 32 1 18 . 0 . 32 1 61 0 0 1 32 1 60 0 0 1 32 1 28 0 0 0 32 1 28 0 0 0 32 1 3 . 0 . 32 1 1 . 0 . 32 2 50 0 0 1 32 1 87 . 0 . 32 1 25 1 0 1 32 1 27 0 0 0 32 1 45 1 0 0 32 1 44 0 0 0 32 1 13 . 0 . 32 1 11 . 0 . 32 1 76 . 0 . 32 1 84 . 0 . 32 1 76 . 0 . 32 1 40 1 0 0 32 1 44 1 0 0 32 1 14 . 0 . 32 1 56 1 0 0 32 1 52 0 0 0 32 1 25 1 0 0 32 1 22 . 0 . 32 1 34 0 0 1 32 1 9 . 0 . 32 1 7 . 0 . 32 1 31 1 0 1 32 1 66 . 0 . 32 1 21 . 0 . 32 1 78 . 0 . 32 1 73 . 0 . 32 1 75 . 0 . 32 1 58 1 0 1 32 1 64 1 0 1 32 1 64 0 0 1 32 1 57 0 1 1 32 1 31 1 0 1 32 2 20 . 0 . 32 1 45 1 0 0 32 1 46 1 0 0 32 1 18 . 0 . end
Code:
global output "C:\Users\User\Documents\Stata Code" local sumstats age emp_dummy foodstmp_dummy hsorless local race white black chinese #delimit ; foreach r of local race{; foreach s of local sumstats{; eststo white: estpost tabstat s, stat(mean sd) columns(stat) listwise; esttab using "$output/summarystats_overallavg_test.tex", replace label title ("Table 1: ACS Summary Statistics Across Races,") addnotes("The means are reported with the standard deviations in parenthesis. Significance stars: * 0.05 ** 0.01 *** 0.001") cells(mean(pattern(1 1 1 1) fmt(3)) sd(par pattern(1 1 1 1)fmt(2)) b(star pattern(0 0 1) fmt(2))) mtitles( "White" "Black" "Chinese") varlabel( emp_dummy "Percentage employed (Ages 25-65)" hsorless "Percentage with only a high school degree or less" age "Mean age" foodstmp_dummy "Percentage on Food Stamps") ; }; #delimit cr
Code:
eststo white: estpost tabstat age `sumstats' if race ==1, stat(mean sd) columns(stat) listwise; esttab using "$output/summarystats_overallavg_test2.tex", replace label title ("Table 1: ACS Summary Statistics Across Races, South Region") addnotes("The means are reported with the standard deviations in parenthesis. Significance stars: * 0.05 ** 0.01 *** 0.001") cells(mean(pattern(1 1 1 1) fmt(3)) sd(par pattern(1 1 1 1)fmt(2)) b(star pattern(0 0 1) fmt(2))) mtitles( "White" "Black" "Chinese") ; eststo black: estpost tabstat age `sumstats' if race ==2, stat(mean sd) columns(stat) listwise; esttab using "$output/summarystats_overallavg_test2.tex", append label ........ ........
Thanks for any help you might be able to give!
Comment