Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Export Formatted Summary Statistic Table, by 2 types of groups

    I'm trying to export/create a formatted table of summary statistitics for my data. I'll try to explain what I'd like in terms of the standard auto dataset:
    I want to group the data in 2 different ways. First, I subset by a categorical variable I've created, price2 (One category for each of 3 price intervals, and then one with all combined)
    Then, I'd like to group by foreign, with a category all observations combined as well.
    Then, I'd like to report the mean of a variety of variables, one of which would be a factor variable. (so if the factor variable is length_cat, I could have % short, % medium, % long) I'd also like to have the total number of observations N for each subcategory.

    So, as I imagine it, I'd have 4 tables stacked on each other:
    • Table 1 would be only for Cheap cars, with 3+ columns (I have more in my real dataset) -- one for foreign cars, one for domestic, and one for all combined
      • Each row would then have a variable, with the mean value of that variable within the subgroup
    • Table 2 would only be for Medium cars, with 3+ columns... (similarly organized)
    • Table 3 would only be for Expensive cars
    • Table 4 would be for all cars
    Here's what I've tried so far:

    Idea 1: This gets at the idea of what I want, but I have no idea how to export this in a nicely formatted way, similar to an esttab table.

    Code:
    sysuse auto, clear
    gen price2=.
    replace price2=1 if price <=4000
    replace price2=2 if price >4000 & price <=5000
    replace price2=3 if price > 5000
    label define price2_label 1 "Cheap" 2 "Medium" 3 "Expensive"
    label values price2 price2_label
    
    bysort price2 foreign: sum price mpg rep78 headroom trunk

    Also, this gives the observations for each variable in a column, repeated for each row.
    In my data, the number of observations for each variable is the same within a subgroup, so I'd like a single row for N at the bottom of each subgroup.

    When I try to use the meanonly option to achieve this:
    Code:
    bysort price2 foreign : sum price mpg rep78 headroom trunk , meanonly
    I get no statistics at all.

    I'll be doing this type of table a few different times, so I have preference for a method that automates the formatting of the table as well, but I suppose I'll take anything at this point.



    Idea 2: Replicate a table from the Tabout tutorial
    http://www.ianwatson.com.au/stata/tabout_tutorial.pdf

    On page 34 of the pdf, there is a three-way table with a setup similar to what I would like, with the code provided to replicate it.
    (The code is given on pages 36-37 of the pdf)

    I imagine there is a way to tweak the code to have it report the means of multiple variables instead of the count of just 1 factor variable, but I can't even get the provided code in the tutorial to compile in LaTeX correctly, let alone adjust it to my data. (I've made sure I have the top.tex and bot.tex files in the correct directory, as described in the tutorial)

    I'm not even sure if this method is feasible for what I'd like to report, but the structure of it looks closest to what I'd like to achieve.

    For what it's worth, below is the code for the relevant table:

    Code:
    * setup macros for loops
    levelsof race, local(levels)
    local numberlevels : word count `levels'
    local racelabels : value label race
    local counter = 0
    
    * begin looping through the values of the by category
    foreach l of local levels {
        local counter = `counter' + 1
        if `counter' == 1 {
        local vlabel : label `racelabels' `l'
        tabout industry union if race == `l' using "table.tex", replace ///
    f(0c) style(tex) font(bold) bt lines(none) h1(nil) ///
    cl1(2-4) h3("\midrule \textbf{Race: `vlabel'} \\") ///
    topf(top.tex) topstr(14cm)
    }
    else if `counter' == `numberlevels' {
    local vlabel : label `racelabels' `l'
    tabout industry union if race == `l' using "table15.tex", append ///
    f(0c) style(tex) font(bold) bt lines(none) h1(nil) h2(nil) ///
    h3("\midrule \textbf{Race: `vlabel'} \\") ///
    botf(bot.tex) botstr(nlsw88.dta)
    }
    else {
    local vlabel : label `racelabels' `l'
    tabout industry union if race == `l' using "table15.tex", append ///
    f(0c) style(tex) font(bold) bt lines(none) h1(nil) h2(nil) ///
    h3("\midrule \textbf{Race: `vlabel'} \\") ///
    botf(bot.tex) botstr(nlsw88.dta)
    }
    }
    This code creates 2 .tex files in my WD.
    When I run a Tex file with the following code, I get an error.

    My TeX code:
    Code:
    \documentclass{report}
    \usepackage{booktabs}
    \usepackage{tabularx}
    
    
    \begin{document}
    
    
    \appendix
    
    \input{table}
    \input{table15}
    
    \end{document}
    The error given is:
    File ended while scanning use of \TX@get@body. \input{table}
    Misplaced \noalign. \midrule


    So, 2 questions for this part:
    1) I can probably update this code to apply to my toy example above as well as my real data, but how would I tweak the code to get it to output the table I desire?
    2) Any idea why this LaTeX code won't compile? I've tried just using \input{table}, or just \input{table15}, with no luck to either.



    Idea 3:
    https://www.statalist.org/forums/for...ysort-putexcel

    tabstat doesn't seem to support factor variables or give summaries, and I'm not sure if putexcel can give me a formatted table like I'd prefer.


    Any help with this would be greatly appreciated! I've tried to do my research before posting, but if I've formatted the question poorly in any way, my apologies.
    If there are any clarifying questions about what I want/have tried, I'll be glad to answer.
    https://www.statalist.org/forums/forum/general-stata-discussion/general/1430503-exporting-summary-stats-generated-with-bysort-putexcel
    Last edited by TJ Hedin; 15 Mar 2019, 13:06.
Working...
X