  • Summary statistics according to quartiles

    Dear users,

    I need to get summary statistics, such as the mean, of a variable according to quartiles of another variable.
    For instance, I have the GDP at municipality level (continuous variable), and I would like to have the mean of another variable (say investments) according to quartiles of GDP.
    I cannot find a suitable command for this.
    Thanks for your help.
    Adam

  • #2
    Adam:
    you may be searching for something along the following lines:
    Code:
    . use "C:\Program Files (x86)\Stata15\ado\base\a\auto.dta"
    
    . xtile quart = price, nq(4)
    
    . bysort quart: tabstat mpg, stat(count mean sd p50 min max)
    
    -----------------------------------------------------------------------------------------------------------------------
    -> quart = 1
    
        variable |         N      mean        sd       p50       min       max
    -------------+------------------------------------------------------------
             mpg |        19  23.84211  5.025083        22        18        35
    --------------------------------------------------------------------------
    
    -----------------------------------------------------------------------------------------------------------------------
    -> quart = 2
    
        variable |         N      mean        sd       p50       min       max
    -------------+------------------------------------------------------------
             mpg |        18  23.33333  5.573044      21.5        17        35
    --------------------------------------------------------------------------
    
    -----------------------------------------------------------------------------------------------------------------------
    -> quart = 3
    
        variable |         N      mean        sd       p50       min       max
    -------------+------------------------------------------------------------
             mpg |        19        20  6.055301        18        14        41
    --------------------------------------------------------------------------
    
    -----------------------------------------------------------------------------------------------------------------------
    -> quart = 4
    
        variable |         N      mean        sd       p50       min       max
    -------------+------------------------------------------------------------
             mpg |        18  17.94444  4.658606      16.5        12        26
    --------------------------------------------------------------------------
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      As a tweak on Carlo's helpful answer, I note that the by() option of tabstat is an alternative to the by: prefix. It puts all groups in one table and adds a Total row.

      Code:
      . sysuse auto, clear 
      (1978 Automobile Data)
      
      . xtile quart = price, nq(4)
      
      . tabstat mpg, stat(count mean sd p50 min max) by(quart) 
      
      Summary for variables: mpg
           by categories of: quart (4 quantiles of price)
      
         quart |         N      mean        sd       p50       min       max
      ---------+------------------------------------------------------------
             1 |        19  23.84211  5.025083        22        18        35
             2 |        18  23.33333  5.573044      21.5        17        35
             3 |        19        20  6.055301        18        14        41
             4 |        18  17.94444  4.658606      16.5        12        26
      ---------+------------------------------------------------------------
         Total |        74   21.2973  5.785503        20        12        41
      ----------------------------------------------------------------------
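
      Mapped back to the original question, the same two steps might look like the sketch below. The variable names gdp, investments and gdp_q are hypothetical, and the simulated data are placeholders included only so the example runs on its own; they are not from Adam's dataset.

      ```stata
      * Placeholder data (assumption): replace this block with your own dataset.
      clear
      set seed 2718
      set obs 200
      * Hypothetical GDP and investment variables, simulated for illustration only.
      generate gdp = exp(rnormal())
      generate investments = 0.3*gdp + runiform()

      * Assign each observation to a quartile of GDP (1 = lowest, 4 = highest).
      xtile gdp_q = gdp, nq(4)

      * Count and mean of investments within each GDP quartile.
      tabstat investments, stat(count mean) by(gdp_q)
      ```

      With real data, only the last two commands are needed. If GDP has many tied values, the quartile groups may be of unequal size, so it is worth checking the counts in the N column.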



      • #4
        Nick's solution is far more efficient than mine.
        Kind regards,
        Carlo
        (Stata 19.0)
