  • Losing precision in Y axis (large number)

    Good morning,

    I am generating a very simple histogram of the number of days patients stay in the clinic. However, since my data set is pretty big, the Y axis numbers are not fully displayed: for example, 2.0e+04 instead of a whole number. I would like the whole number, and perhaps the labels could be oriented horizontally.


    histogram N_days, width(30) frequency

    Thank you,
    Marvin

  • #2
    Marvin: "Good morning" makes assumptions inconsistent with the sphericity of the earth and the habits and habitats of members here.

    2.0e+04 is an integer. It's 20000. So there's no loss of precision.

    But we know what you mean.

    Code:
     
    yla(, format("%1.0f") ang(h))
    as an extra option should suffice. This is documented. You just need to click your way to

    Code:
     
    help axis_label_options
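    For the histogram in the first post, the full command might then read as follows. This is only a sketch that assumes the variable name and bin width from that post; format("%1.0f") shows whole numbers and ang(h) turns the labels horizontal.

    ```stata
    * y-axis labels as whole numbers, printed horizontally
    histogram N_days, width(30) frequency yla(, format("%1.0f") ang(h))
    ```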


    • #3
      Hi Nick,

      I won't use good morning, afternoon, etc. in my future posts, I promise! Thank you very much!

      Marvin


      • #4
        Hi Nick,

        One more thing. Now I am doing my histogram by another variable with two groups of people. However, one group (men) accounts for 90% of all the observations, so when I select the frequency option, my histogram for women can barely be seen. I would like to have different Y axis values for each histogram. Is this possible, or would I have to create two different histograms and then combine them (gr combine)?

        2. Also, is there a way to include the total number of observations for each group automatically, or do I have to write the number in the title, for example?
        3. Lastly, how can I include the mean and perhaps the standard deviation? Can these be added automatically, or do I have to add them manually?


        histogram N_months , discrete width(1) frequency binrescale by(N_UOF_CAT) ///
        yla(, format("%1.0f") ang(h))


        the majority of my


        • #5
          The -by- option in histogram takes the options of a twoway graph (see help twoway_options). Therefore, you can rescale the y (or x) axis for each graph:

          Code:
          histogram N_months , discrete width(1) frequency binrescale by(N_UOF_CAT, yrescale) yla(, format("%1.0f") ang(h))
          I don't see an option to automatically add the statistics you want, but you can certainly run sum prior to graphing and use the returned values in r(N), r(mean), and r(sd) as macros in a graph caption, note, or title.
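          For a single histogram without by(), a minimal sketch of that idea could look like this. The local macro names n, mn, and sd are made up for illustration, and the number formats are assumptions to adjust.

          ```stata
          * summarize leaves N, the mean, and the SD behind in r()
          summarize N_months
          local n  = r(N)
          local mn = trim(string(r(mean), "%9.2f"))
          local sd = trim(string(r(sd), "%9.2f"))
          * the macros are expanded before histogram draws the note
          histogram N_months, discrete width(1) frequency yla(, format("%1.0f") ang(h)) ///
              note("N = `n', mean = `mn', SD = `sd'")
          ```

          Because r() is overwritten by the next r-class command, the values are copied into local macros straight after summarize.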
          Stata/MP 14.1 (64-bit x86-64)
          Revision 19 May 2016
          Win 8.1


          • #6
            Hi Carole,

            Thank you so much!

            Can you help me do that? That is:

            "I don't see an option to automatically add the statistics you want, but you can certainly run sum prior to graphing and use the returned values in r(N), r(mean), and r(sd) as macros in a graph caption, note, or title."

            I would like the mean and total N on each subgroup chart.

            Also, is there a way to have both the frequency and the percentage in a histogram?

            Is there a way to show the X value for each bar? For example, in this case, the number of days?

            Finally, not every X axis is showing in my histogram, I guess because I have too many sub-charts. Is there a way to force Stata to include an X axis in all charts?

            Thank you so much! I would appreciate any help. Even if you can answer only one or two of my requests, I would greatly appreciate it.

            Best,
            Marvin


            • #7
              For the first, this loop summarizes N_months for each value of your by variable, collects the stats, stores them in `group', then uses that macro as the graph caption. Adjust the number format (e.g. %10.1f) as necessary for your situation. Note that the caption for the overall graph is specified inside the by option. If you specify the caption outside, the same caption will be repeated for each graph.

              Code:
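              * clear any caption text left over from an earlier run
              local group
              * collect the distinct values of the by variable
              levelsof N_UOF_CAT, local(lev)
              foreach i of local lev {
                  sum N_months if N_UOF_CAT==`i'
                  * format the returned results and trim the padding
                  local mn=trim("`: display %10.2f r(mean)'")
                  local n=trim("`: display %10.0f r(N)'")
                  local sd=trim("`: display %10.2f r(sd)'")
                  * append one quoted caption line per group
                  local group   `"`group'  "'`"Stats if varname==`i': N=`n', Mean=`mn', Std=`sd' "'
                  }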
              
              histogram N_months , discrete width(1) frequency binrescale by(N_UOF_CAT, caption(`"`group'"') yrescale) yla(, format("%1.0f") ang(h))

              For the second: I don't understand what you mean by having both frequency and percentage in a histogram. I would think not. If you want overlapping bars, you'll have to move to either graph bar or some twoway graph.

              For the third (show the x bar value): do you mean on the x axis or on/above the bar? If on the x-axis, see below.

              Finally, to make adjustments to the x-axis, you should read up on help axis_label_options (remember, histogram takes twoway options).

              If your N_months ranges in whole numbers from 0-100, you can show all of those values (and only those values) by specifying a numlist in the xlabel option. If the variable is labeled in some way, you can include the suboption valuelabel.
              Code:
              histogram N_months , discrete width(1) frequency binrescale by(N_UOF_CAT, caption(`"`group'"') yrescale) yla(, format("%1.0f") ang(h)) xlabel(0(1)10, valuelabel)
              Last edited by Carole J. Wilson; 22 Apr 2016, 10:54.

              • #8
                I think I misunderstood one of your questions. When using the by option in a graph, by default Stata will display the x-axes only on the graphs in the bottom row and the y-axes only on those in the first column. To change that, use the suboption(s) ixaxes and/or iyaxes within the by option (see help by_option):


                Code:
                histogram N_months , discrete width(1) frequency binrescale ///
                by(N_UOF_CAT, caption(`"`group'"') yrescale ixaxes) ///
                yla(, format("%1.0f") ang(h)) xlabel(0(1)10, valuelabel)

                • #9
                  Carole!! you are the best!

                  One last thing, I promise!

                  I have an age variable and a dummy variable with yes and no categories for whether the patient was involved in a violent incident. I would like to have a stacked bar chart, so that the X axis is age, with one bar for each age showing the percentages of Yes and No (involved in incidents). Is there a simple way to do this? I have done some reading and it seems that it is not that simple.


                  • #10
                    I personally don't like working with the graph bar command; I always get confused by it. Nick Cox has posted some nice alternatives to stacked bar graphs, and other ways to create them, on the forum in the last few months, so you may want to try those if you need any special options not available in the graph bar command. Here's one way to create what you want. It takes advantage of the fact that the mean of a variable coded 0 or 100 equals the percentage of observations coded 100, that is, the percentage of 1's in the original dummy variable.

                    Code:
                    gen involved_yes=0
                    replace involved_yes = 100 if varname==1   // where varname==1 is whatever your involved variable is for YES
                    gen involved_no=0  
                    replace involved_no = 100 if varname==0   // where varname==0 is whatever your involved variable is for NO
                    
                    gr bar involved_yes involved_no, over(age) stack legend( label( 1 "Yes" ) label( 2 "No" )) percent
                    *actually, if you use the -percent- option, you can code involved_yes & involved_no as both 0/1 since the option changes the y-axis for you.
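                    A sketch of the 0/1 variant mentioned in that last comment, assuming varname is your yes/no indicator. The names yes01 and no01 are hypothetical, chosen so they do not clash with the variables created above.

                    ```stata
                    * 0/1 indicators; observations with missing varname stay missing
                    gen byte yes01 = (varname == 1) if !missing(varname)
                    gen byte no01  = (varname == 0) if !missing(varname)

                    * percent rescales the stacked means so each bar totals 100
                    gr bar yes01 no01, over(age) stack percent legend(label(1 "Yes") label(2 "No"))
                    ```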
                    Last edited by Carole J. Wilson; 22 Apr 2016, 12:29.

                    • #11
                      Percents of yes and no are no more informative than percents of yes. One bar will just be the complement of the other.

                      Age is (likely to be a proxy for) a continuous variable.

                      Consider some parallel to

                      Code:
                      sysuse auto, clear
                      egen mean = mean(100 * foreign), by(rep78)
                      twoway connected mean rep78, sort yla(0(25)100, ang(h)) ytitle(% foreign)
                      What could be more direct?


                      • #12
                        Carole: Thank you so much! There should be a built-in way to do stacked charts easily, but I guess there is not.
                        Nick: I agree with you, but sometimes it is easier for a non-technical audience to understand stacked charts than just showing the "yes". I like your chart, though.


                        • #13
                          Carole J. Wilson I am sorry to go back to this post. See the graph attached. I do not want to see the legend at the bottom of my graph. I used the legend(off) option but the legend still appears in my graph. This is the syntax I used to generate the graph:

                          histogram preUOF, width(1) frequency by(group,col(1)) legend(off) addlabopts(mlabsize(small))

                          Any ideas?

                          • #14
                            If you look at the help file for histogram, you see that the by, title, and legend options are the same as in twoway. Whenever something doesn't seem to work right, and you are using a by() option, then the problem usually lies in some incorrectly specified by() option. Indeed, this is the case here. If you look at help by_option, you see that you control the legend from within the by() option. So, the following should work as you expect:

                            Code:
                            histogram preUOF, width(1) frequency by(group,col(1) legend(off) ) addlabopts(mlabsize(small))

                            • #15
                              See also tabplot (SSC). http://www.statalist.org/forums/foru...updated-on-ssc

                              A tabplot might look something like this:

                              [Image: marvin.png]
