No announcement yet.
  • Filter
  • Time
  • Show
Clear All
new posts

  • Add frequency column to proportion command results?

    Hi there. This is a (seemingly) simple task but is giving me some grief! Hoping someone can help.

    I have a number of categorical variables I want to simply summarize into frequencies(proportions) in a table. Ideally I could get this all within the same output/results so that they are easy to copy/align to excel in one go (and then subsequently link to a clean table). I need to extract the proportions AND frequencies from the output - but seem to be able only do one OR the other using separate commands (i.e. via using either proportion or tab - see below). Problem with doing both separately is it means different formats of result to copy/extract (so they don't align well when pasting into excel).

    proportion familyhistory_3cat educ_cat3 ethn3 occup_3 marital_cat2 alcohol_cat smokstatus_2 bmigrp_12 ///
    if subpopulation==1

    foreach var of varlist familyhistory_3cat educ_cat3 ethn3 occup_3 marital_cat2 alcohol_cat smokstatus_2 bmigrp_12 {
    tab `var' if subpopulation==1

    I would like to look at these in a subpopulation (as above), but then break down further by gender (male/female), if possible.

    The easiest solution would be if I could somehow run the first proportion command above (which allows me to add multiple variables at once and they come out in a cleanly aligned table), but then append another column to this output that includes frequencies as well - is this possible? Of course put excel is another option - but surely this can be done more simply?

    Hope that was clear. Many thanks!

    Last edited by patrick handcock; 15 Sep 2018, 11:35.

  • #2
    Alternatively, if there was some way of using the second tab command above to display all the variables in one large matrix - that would also work (as it has frequency and percent). The annoying thing is that it wants to put out individual tables for each variable - would be wonderful to be able to output these all in one long matrix that can then be copied or worked with in excel.


    • #3
      I am not quit sure that I understand what you actually is looking for, but could the -estpost tabulate- command solve the problem:

      sysuse auto
      estpost tabulate rep78
      matrix freq = e(b)
      matrix pct = e(pct)
      matrix matrix_rep78 = freq \ pct
      matrix matrix_rep78_v = matrix_rep78'
      matrix list matrix_rep78_v
      You will still need the foreach-loop and some naming, but you should get the information from the tab1 that you need?


      • #4
        Estpost is from ssc.


        • #5
          Hi there @Dennis Lund Hansen - thanks for your reply and suggested code.

          I could have been clearer, so apologies. Tab1 alone does provide frequencies for the subsequent computation of %, but if you try to run multiple variables you get lots of different sized tabs in Stata output - so can't just copy the whole section of tabs into excel and then apply formulas to calculate %'s using the frequencies. Thus, I was trying to find away to get the frequencies AND %'s for EACH variable all at once (so could copy it all into excel) - ideally in one long table.

          I guess one could potentially also use the proportion function - and get frequencies from it?

          Basically the end goal is something like this (but quickly and with multiple variables):

          Parental history of diabetes, n (%) 968 (28.3)
          Education, n (%)
          High school or less 1,082 (31.9)
          Technical/vocational 1,488 (43.1)
          Bachelor’s degree or higher 859 (25.0)
          Ethnicity, n (%)
          Australia/New Zealand 2,725 (79.4)
          Other English speaking 411 (12.0)
          Other 293 (8.6)
          Occupation, n (%)
          Professional/managerial 1,180 (34.2)
          Blue collar 408 (11.9)
          White collar/administrative 693 (20.2)
          Not currently working 1,148 (33.8)

          ...using this:

          foreach var of varlist var1 var2 var3 var4 var5 {
          tab `var' if sample==1

          I wasn't quite sure how to implement my loop with your suggested code using estpost - could you advise how this might look?

          Many thanks again! Realise that putexcel function might be a better long-term option for this - but keen to find a quick method in the first instance.


          Last edited by patrick handcock; 19 Sep 2018, 08:36.


          • #6
            It could be done in something like this way:

            sysuse auto
            foreach v of varlist         ///
                            price         ///
                            headroom     ///
                            length      ///
                            gear_ratio    ///
                            rep78 {
                estpost tabulate `v' if foreign==1
                matrix freq = e(b)
                matrix pct = e(pct)
                matrix mtx_`v' = freq \ pct
                matrix mtx_`v'_long = mtx_`v''
            matrix dir
            You can probably speed up the export process to excel if you can incorporate the " putexcel ul_cell = matrix(name) " in the loop. Or you could use the -svmat- and -postfile- commands and create a dataset from the matrices before exporting it.


            • #7
              Thanks so much Dennis - I will have a play!