Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to Create a Descriptive Statistics Table by Category Using Dummies

    Sure! Here's a concise and clear version of your question in English, suitable for posting in a Stata forum:
    Hi everyone,

    I'm trying to create a descriptive statistics table for the variable unit_cost, broken down by five mutually exclusive categories defined by dummy variables:
    • cat_constru = 1 → New Construction
    • cat_rehab = 1 → Rehabilitation
    • cat_up = 1 → Upgrading
    • cat_maint = 1 → Maintenance
    • cat_na = 1 → N/A
    I want the table to display the following statistics for each category:
    • Mean
    • Standard deviation
    • Minimum
    • Maximum
    • 25th percentile
    • Median (p50)
    • 75th percentile
    • Number of observations
    The goal is to replicate a clean, publication-style table similar to those seen in academic papers, with categories in the rows and statistics in the columns. I'd like to generate it using tabstat and esttab if possible.

    Also, I’d like to include a "Total" row at the bottom showing the overall statistics for unit_cost.

    Any help on how to do this properly would be greatly appreciated!

    Thanks in advance.

  • #2
    Code:
    . // create some example data
    . clear all
    
    . set obs 1000
    Number of observations (_N) was 0, now 1,000.
    
    .
    . gen u = runiform()
    
    . gen byte cat_constru:yesno_lb = u < 0.2
    
    . gen byte cat_rehab:yesno_lb   = u >= 0.2 & u < 0.4   
    
    . gen byte cat_up:yesno_lb      = u >= 0.4 & u < 0.6      
    
    . gen byte cat_maint:yesno_lb   = u >= 0.6 & u < 0.8   
    
    . gen byte cat_na:yesno_lb      = u >= 0.8      
    
    . drop u
    
    .
    . label var cat_constru "New Construction"
    
    . label var cat_rehab   "Rehabilitation"
    
    . label var cat_up      "Upgrading"
    
    . label var cat_maint   "Maintenance"
    
    . label var cat_na      "N/A"
    
    .
    . label define yesno_lb 0 "no" 1 "yes"
    
    .
    . gen unit_cost = rnormal()
    
    .
    . // -------------------------------------------------------
    . // now the real work begins
    .
    . // get the indicators in one categorical variable
    . gen byte cat:cat_lb = 1*cat_constru + ///
    >                       2*cat_rehab   + ///
    >                       3*cat_up      + ///
    >                       4*cat_maint   + ///
    >                       5*cat_na      
    
    .                       
    . label define cat_lb 1 "New Construction" ///
    >                     2 "Rehabilitation" ///
    >                     3 "Upgrading" ///
    >                     4 "Maintenance" ///
    >                     5 "N/A"
    
    . label variable cat "type of construction"
    
    .
    . // now we create the table
    . table (cat) if !missing(unit_cost) ///
    >            , stat(mean unit_cost)  ///
    >              stat(sd unit_cost)    ///
    >              stat(min unit_cost)   ///
    >              stat(p25 unit_cost)   ///
    >              stat(p50 unit_cost)   ///
    >              stat(p75 unit_cost)   ///
    >              stat(max unit_cost)   ///
    >              stat(freq)            ///
    >              nformat(%9.2f)        ///
    >              nformat(%9.0f frequency)
    
    ----------------------------------------------------------------------------------------------------------------------------------------------------
                         |   Mean   Standard deviation   Minimum value   25th percentile   50th percentile   75th percentile   Maximum value   Frequency
    ---------------------+------------------------------------------------------------------------------------------------------------------------------
    type of construction |                                                                                                                              
      New Construction   |   0.01                 0.97           -2.60             -0.59             -0.03              0.64            3.24         200
      Rehabilitation     |  -0.02                 1.08           -2.49             -0.82             -0.05              0.76            2.62         202
      Upgrading          |   0.04                 1.08           -2.88             -0.66             -0.03              0.78            3.20         195
      Maintenance        |   0.00                 1.03           -2.64             -0.76              0.04              0.69            3.48         192
      N/A                |   0.01                 1.12           -2.98             -0.73             -0.04              0.71            2.91         211
      Total              |   0.01                 1.06           -2.98             -0.72             -0.01              0.71            3.48        1000
    ----------------------------------------------------------------------------------------------------------------------------------------------------
    
    .
    . // the column labels are a bit long             
    . collect label levels result sd        "sd"         ///
    >                             frequency "N"          ///
    >                             max       "Max"        ///
    >                             min       "Min"        ///
    >                             p25       "25th perc." ///
    >                             p50       "median"     ///
    >                             p75       "75th perc." , modify
    
    .
    . // admire the result                                                    
    . collect preview         
    
    ---------------------------------------------------------------------------------------------
                         |   Mean     sd     Min   25th perc.   median   75th perc.    Max      N
    ---------------------+-----------------------------------------------------------------------
    type of construction |                                                                       
      New Construction   |   0.01   0.97   -2.60        -0.59    -0.03         0.64   3.24    200
      Rehabilitation     |  -0.02   1.08   -2.49        -0.82    -0.05         0.76   2.62    202
      Upgrading          |   0.04   1.08   -2.88        -0.66    -0.03         0.78   3.20    195
      Maintenance        |   0.00   1.03   -2.64        -0.76     0.04         0.69   3.48    192
      N/A                |   0.01   1.12   -2.98        -0.73    -0.04         0.71   2.91    211
      Total              |   0.01   1.06   -2.98        -0.72    -0.01         0.71   3.48   1000
    ---------------------------------------------------------------------------------------------
    
    .
    . // we can now use -collect export- to export the table
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    Comment

    Working...
    X