  • descriptive stats on categorical variables in STATA

    I am creating a descriptive statistics table and need guidance on handling categorical variables. When I summarize a categorical variable with factor-variable notation, such as "summarize i.educ" (educ is a categorical education variable), I only get results for some categories, with the comparison (base) category suppressed. For instance, if the categories of educ are less than hs, hs, some college, college, and post grad, then "summarize i.educ" reports only hs, some college, college, and post grad; "less than hs" is not reported. If I run "summarize educ" instead of "summarize i.educ", I get statistics computed on the numeric category codes, which ignores the categorical nature of the variable.
    How do I get to see the summary stats on all categories?

  • #2
    I think you're using the wrong tool for this job. With categorical variables, summarizing the individual indicator variables is not very informative: you already know that the min and the max are 0 and 1, respectively. You already know that the mean is the proportion of 1 values for that level, and the standard deviation is approximately sqrt(mean*(1-mean)). A more informative way to describe categorical variables is to -tab- them (without factor-variable notation). That will give you the N and % in each level of the categorical variable, which is a complete description of its distribution. And no reference categories will be left out. So just -tab educ- is your best bet here.
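
    To make the point above concrete, here is a minimal sketch. It assumes the nlsw88 example dataset that ships with Stata and uses its race variable as a stand-in for educ; the race_ stub for the generated indicators is an arbitrary name chosen for illustration.

    Code:
    sysuse nlsw88, clear
    * N and % for every level, with no category left out
    tabulate race
    * Build one 0/1 indicator per level (race_1, race_2, ...)
    tabulate race, generate(race_)
    * Each indicator's mean is the proportion in that level
    summarize race_*
    * Compare the reported SD with sqrt(mean*(1-mean)) for one indicator
    summarize race_1
    display r(sd)
    display sqrt(r(mean)*(1 - r(mean)))

    The two displayed values differ only slightly, because -summarize- divides by N-1 rather than N when computing the variance.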


    • #3
      Clyde, thanks for your response. Is this still relevant when there are more than two categories, such as five or more?


      • #4
        Absolutely. Why not try it out?


        • #5
          As a sidelight, I am a big fan of Ben Jann's -fre- command, available from SSC. The output looks a little nicer than what you get from -tab-, especially if you have value labels for the variable.

          If you have some compelling reason to use sum (and I don't know what it would be), you could say

          Code:
          sum ibn.educ
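
          For readers unfamiliar with the notation: the i. prefix omits the base level from the output, while ibn. declares that there is no base level, so every category is reported. A minimal sketch, assuming the nlsw88 example dataset shipped with Stata and using its race variable as a stand-in for educ (both are choices made for illustration):

          Code:
          sysuse nlsw88, clear
          * i. leaves out the base level
          summarize i.race
          * ibn. ("no base") reports every level
          summarize ibn.race
          * fre is a community-contributed command; install it once from SSC
          ssc install fre, replace
          fre race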
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology
          Stata Version: 17.0 MP (2 processor)

          EMAIL: [email protected]
          WWW: https://www3.nd.edu/~rwilliam


          • #6
            Despite being an old thread, this was really helpful. Richard, the ibn.varname was exactly what I was looking for. Thanks.


            • #7
              Originally posted by Richard Williams View Post
              As a sidelight, I am a big fan of Ben Jann's -fre- command, available from SSC. The output looks a little nicer than what you get from -tab-, especially if you have value labels for the variable.

              If you have some compelling reason to use sum (and I don't know what it would be), you could say

              Code:
              sum ibn.educ

              Thank you! I was looking for this exact solution.
              However, I am having trouble exporting the summary table. I am using the asdoc function and specifying ibn.var, but I get an error stating "factor-variable operators not allowed".
              What could be wrong?
              Regards,
              Orif
              Using Stata 16/MP


              • #8
                Originally posted by Orifjon Kurbanov View Post

                Thank you! I was looking for this exact solution.
                However, I am having trouble exporting the summary table. I am using the asdoc function and specifying ibn.var, but I get an error stating "factor-variable operators not allowed".
                What could be wrong?

                I am getting the same error. Can someone please help? Thanks.


                • #9
                  #7 is specifically about the asdoc command (not function) from SSC. Note that no-one answered it -- I guess because it is buried under a rather general thread title.

                  #8, being the same question, is no more likely to get an answer.

                  But this kind of error often arises just because you didn't put a comma before your options.

                  If that's not the answer, I would start a new thread with a title like "factor variable operators not allowed in asdoc", but do give the exact command you typed. See also https://www.statalist.org/forums/help#stata for advice on what goes in a good question.
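
                  As a sketch of what "a comma before your options" means in practice: Stata reads everything before the comma as the variable list (and any qualifiers) and everything after it as options. The example assumes asdoc from SSC is installed and uses variables from the nlsw88 example dataset, chosen for illustration rather than taken from #7.

                  Code:
                  sysuse nlsw88, clear
                  * The comma separates the variable list from the options
                  asdoc sum wage hours, replace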


                  • #10
                    Nick Cox is right: this needed a separate post. I found the mention of asdoc only through a search. I have updated asdoc to support factor variables with the sum command. At the moment, asdoc supports factor variables only in the simple summary statistics. Detailed summary statistics with factor variables are on the to-do list. The new version of asdoc can be installed from my site. Copy and paste the following line into Stata and press Enter.
                    Code:
                    net install asdoc, from(http://fintechprofessor.com) replace
                    Please note that the above line has to be copied in full. After installing the new version, restart Stata.

                    Here is an example. The abb(.) option is used to avoid abbreviating lengthy names or labels.

                    Code:
                    sysuse nlsw88, clear
                    asdoc sum ibn.industry i.race wage, replace abb(.)


                    asdocx is now available
                    A more powerful and flexible version of asdoc, called asdocx, is now available. You can find the details here:

                    https://fintechprofessor.com/asdocx


                    Please do remember to cite asdoc. To cite:

                    In-text citation
                    Tables were created using asdoc, a Stata program written by Shah (2018).

                    Bibliography
                    Shah, A. (2018). ASDOC: Stata module to create high-quality tables in MS Word from Stata output. Statistical Software Components S458466, Boston College Department of Economics.
                    Regards
                    --------------------------------------------------
                    Attaullah Shah, PhD.
                    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
                    FinTechProfessor.com
                    https://asdocx.com
                    Check out my asdoc program, which sends outputs to MS Word.
                    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.
