  • hbar with log scale

    I currently have the following hbar code:

    graph hbar asr, over(var1) over(var2) stack exclude0

    However, I would like the variable 'asr' (shown on the horizontal axis) to be on a log scale. When I specify yscale(log) it does not work: all bars overlap and the graph is impossible to read. When I change the axis manually (Graph Editor >> click on x axis >> Scale >> Use log scale) I do not get the scale I want, and many bars look the same. I've attached photos of the figure before and after applying the log scale. Could you please let me know how I can get the log scale to work?

    Thank you in advance for your help,
    Miranda
    [Attached images: the graph before the log scale, and after changing to a log scale manually through the Graph Editor]

  • #2
    Please note the longstanding request at http://www.statalist.org/forums/help#stata to avoid photo attachments, which are more difficult to read.

    Note that for graph hbar, the y axis is the horizontal axis.

    My reading is that the code has a built-in assumption that bars will start at zero and that you will want to show that base. Despite your attempts to override the defaults, you have hit a wall. I have to sympathise with the code.

    I'd use graph dot instead.

    So much flak is likely from reviewers, readers, etc. for bars on a logarithmic scale and without a zero base that giving up the idea at this stage is recommended. A log scale does seem like a good idea for your data and it's totally compatible with a dot chart.
    Last edited by Nick Cox; 11 Mar 2016, 09:33.

    • #3
      Thank you for your response and apologies for the photos - I thought it was the easiest way to view my issue but will refrain from using them in the future.

      I have the same issue when I use dot in place of hbar in the same code. Also, when I add yscale(log) to the code like so:

      graph dot asr, over(var1) over(var2) stack exclude0 yscale(log)

      the figure turns out differently: var1 and var2 overlap so that you cannot read the bars, or the labels when I add them [similar to the problem reported here: http://www.stata.com/statalist/archi.../msg00153.html ]. Is this still a bug, or something I'm unable to override?

      • #4
        Can you post the data for experiment? 12 x 3 it seems. See FAQ Advice #12 on using dataex and continue to obscure the categorical controls if that is prudent.

        • #5
          As requested. Thank you again for your help, Miranda

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float order byte level float asr_f
          4 1  1.354412
          4 2  4.914071
          4 3  6.848401
          4 4 15.459574
          1 1 25.963284
          1 2 26.759825
          1 3  43.95725
          1 4  67.82971
          3 1 29.777046
          3 2 19.227333
          3 3 17.086096
          3 4  8.427315
          end
          label values order order
          label def order 1 "Ca1", modify
          label def order 3 "Ca2", modify
          label def order 4 "Ca3", modify
          label values level level
          label def level 1 "Low", modify
          label def level 2 "Medium", modify
          label def level 3 "High", modify
          label def level 4 "Very High", modify

          • #6
            Thanks for the sandbox.

            I think you've triggered a bug in graph, but one that's easy to work around: just take logarithms first but fix the labels. There is a helper command mylabels on SSC, which must be installed for this to work.

            I have added some of my own prejudices. The default dotted grid lines don't in my experience always port well to other software. I go for very thin light {grey|gray} lines as a visible grid that is not too obtrusive.


            Code:
            gen log_asr_f = log(asr_f)
            * ssc inst mylabels is prerequisite
            mylabels 1 2 5 10 20 50, myscale(log(@)) local(labels)
            graph dot (asis) log_asr_f, over(level) over(order) yla(`labels') ///
            linetype(line) lines(lw(vvthin) lc(gs12)) marker(1, ms(Oh) msize(*1.5))
            [Attached graph: miranda.png]
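
            For readers without mylabels, here is a minimal sketch of the same idea, using made-up toy values rather than the data above. It plots the logged variable and builds the label list by hand, so that each label sits at log(value) but displays the value itself. The macro name labels and the chosen label values are illustrative assumptions.

            ```stata
            * Toy data, made up for illustration only
            clear
            input byte level float asr_f
            1 0.4
            2 2.5
            3 12
            4 45
            end
            gen log_asr_f = log(asr_f)

            * Build the axis label list by hand:
            * position = log(value), displayed text = value
            local labels
            foreach v in 0.5 1 2 5 10 20 50 {
                local pos = log(`v')
                local labels `labels' `pos' "`v'"
            }

            * Plot the logged values, but label the axis in original units
            graph dot (asis) log_asr_f, over(level) yla(`labels')
            ```

            The label 0.5 lands at a negative position on the logged scale, but the reader only ever sees "0.5".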


            • #7
              Although this works for most of my data, I do have some asr_f values that are less than 1 so when I do the log as you propose it in fact becomes negative. Do you have a solution for this to keep it positive? Furthermore, I would like to have asr_m on my graph on the negative side (using -asr_m or -log_asr_m). Is there a way to use mylabels for negatives as well? I've had a play with it but no success yet.

              Thank you for your help,
              Miranda

              • #8
                Logarithms below 0 for values below 1 are in no sense a problem: Keeping them all positive is an alarming suggestion and perhaps implies that you should revise logarithms!

                mylabels will work fine with negative labels. You don't give examples of what you tried that didn't work. However, you can't show e.g. -1 on a logarithmic scale, which could be your difficulty.

                In short, negative logarithms are defined for values below 1 and above 0, but logarithms of negative values, although defined as complex numbers, are essentially undefined for statistical purposes.
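
                These statements can be checked directly in Stata. This is a small sketch, relying on the fact that log() returns missing for arguments outside its domain:

                ```stata
                * Values between 0 and 1 have negative logarithms
                assert log(0.5) < 0
                * log(1) is exactly 0
                assert log(1) == 0
                * log of 0 and of negative values is returned as missing
                assert missing(log(0))
                assert missing(log(-1))
                * Back-transforming recovers the original positive value
                assert reldif(exp(log(0.5)), 0.5) < 1e-12
                ```

                If all assertions pass, Stata prints nothing.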

                In any case, a dual logarithmic scale for males and females (if that is what you mean; you've not explained your data) would be highly problematic as the logarithm of 0 is undefined and such scales can't be placed back-to-back because neither has a back. There are scales on which that idea would work but a much, much better solution is just to use different marker symbols and plot males and females in the same space.
                Last edited by Nick Cox; 14 Mar 2016, 09:30.

                • #9
                  Right now my data (asr_m & asr_f) show gradients across low, medium, high, and very high levels for 30 cancers. The majority of cancers have very small asr_m and asr_f values (<5), whilst a few have asr_m/asr_f values greater than 60. I would like a log scale so that it is easier to view the gradients for the cancers with smaller values (on a normal, equally spaced scale the gradient is not easy to see). However, using the actual log values for asr_m and asr_f makes the gradient confusing to view and interpret, as values can go negative. Is there any way to use my actual values for asr_m and asr_f whilst altering the spacing on the scale so that it is similar to a log scale (i.e. more space for values <5 and less for greater values, similar to your image 3 posts up)?

                  • #10
                    Sorry, but I don't understand what new thing you are asking: what does alternating spacing mean?

                    Using mylabels you can decide exactly which (positive) values are to be shown. Nothing negative will ever be shown.
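
                    To see this, one can inspect what mylabels stores. This is a sketch that assumes mylabels is installed from SSC; the label values here are arbitrary examples:

                    ```stata
                    * Requires: ssc install mylabels
                    * Ask for labels at 0.1, 0.5, 1 and 10 on a logged axis
                    mylabels 0.1 0.5 1 10, myscale(log(@)) local(labels)

                    * Show the contents of the local macro: each pair is
                    * a position (the log) followed by the text to display
                    display `"`labels'"'
                    ```

                    The positions for 0.1 and 0.5 are negative numbers, but the text shown on the axis is "0.1" and "0.5".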

                    Usual forum guidelines apply, please: Show example data. Show code you tried. Show graphs and say why they are/are not what you want.

                    • #11
                      Sample data:

                      Code:
                      * Example generated by -dataex-. To install: ssc install dataex
                      clear
                      input float cancer byte level float(asr_m asr_f)
                       1 4          0   67.82971
                       1 2          0  26.759825
                       1 1          0  25.963284
                       1 3          0   43.95725
                       2 1 -19.016777          0
                       2 4 -73.918816          0
                       2 2  -15.76385          0
                       2 3 -27.404636          0
                       8 2          0  4.4332685
                       8 3          0   6.046227
                       8 1          0   4.207021
                       8 4          0   7.952416
                      21 2  -.8454071   .8163553
                      21 3 -1.1452693   1.408773
                      21 1  -.2142478   .2325141
                      21 4 -1.6599944  1.6487027
                      24 4 -2.0529184  1.7334282
                      24 1  -.6108794   .4142609
                      24 3 -1.5452455  1.0627611
                      24 2  -.6651382   .3421712
                      25 2  -.4800311          0
                      25 1  -.3559546          0
                      25 4  -5.013893          0
                      25 3 -1.6643234          0
                      27 1  -.6605381   .3345237
                      27 3 -.13699527 .026802106
                      27 4 -.13703632  .01266802
                      27 2 -.21155193 .070402704
                      end
                      label values level level
                      label def level 1 "Low", modify
                      label def level 2 "Medium", modify
                      label def level 3 "High", modify
                      label def level 4 "Very High", modify
                      Graph I have made:
                      graph hbar asr_m asr_f, over(level, gap(0) label(labsize(tiny))) over(cancer, gap(*0.8) label(labsize(vsmall))) stack ///
                      blabel(bar, position(outside) gap(0) format(%9.1f) size(tiny) color(black)) graphregion(color(white)) legend(off) plotregion(margin(zero)) ///
                      ysize(10) ylab(-100 "100" -75 "75" -50 "50" -25 "25" 0 25 50 75 100)

                      Using this code it is not easy to see/read the gradients for the cancers with small asr_m/asr_f values. For this reason I would like to have the scale on a log where I show 0 2 5 10 20 40 80 with the distance from 0-10 on the scale being the same length as 10-80. Using the mylabels as you proposed works except for the negative log values where the gradient is confusing to interpret as the bar goes left (negative). For my research I want to show the dose response and thus having this negative bar will not work.

                      I hope this helps and is clearer.

                      • #12
                        I think I may have gotten around it by multiplying asr_m/asr_f by 100 before doing the log... like that all log values will be positive and the rate will just be per 10,000,000 rather than 100,000. That said, I have now used the code:

                        Code:
                        mylabels 0 2 5 10 25 100 1000 10000, myscale(log(@)) local(labels)
                        graph hbar (asis) log_m log_f, over(level, gap(0) label(labsize(tiny))) over(cancer, gap(*1) label(labsize(small))) stack yla(`labels') legend(off) scale(0.6) ysize(7)
                        log_m and log_f were calculated in the same way, except that as a last step I made the values negative for males to put them on the same plot. Would it be reasonable to then add the labels on the negative side through the Graph Editor, mirroring the positions used for females? For example, mylabels put the label 2 at .693147. Would it be reasonable to add the label 2 for males at -.693147? Or is it possible that the axes would not be equal on each side of 0?

                        • #13
                          Miranda:

                          Thanks for the extended data example.

                          As before, you can't show 0 on a log scale, as log 0 is not defined. I already pointed this out in #8 but still you refer to showing 0 in #11 and #12. This isn't a matter of opinion, but a basic mathematical fact.

                          Logs being negative is not a problem in itself. It is if you combine logarithms with an inappropriate graph design.

                          In my view any back-to-back design is fundamentally flawed:

                          1. You use twice the space to show a given range, more or less, so there is less space in which to compare like with like (m with m, f with f).

                          2. Comparing unlike with unlike (m with f) is too difficult to do well. Without immense effort, you can't even distinguish between three qualitatively different situations m ~ f, m slightly more than f and vice versa.

                          In your case, the base of the bars is an arbitrary non-zero value. Bad idea: this will puzzle your smarter readers and not help the others. Bars are not compatible with logarithmic scales at all, in my view.

                          So, I won't offer advice on how to do bars differently.

                          I'm assuming that any 0 on any asr means no incidence, so we do not need to plot that explicitly. I flipped your male rates back to positive and plot in the same space. Note that O and + work well together even when values are very similar. You made the graph larger but the labels much smaller; that just makes the graph too difficult to read.

                          Code:
                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input float cancer byte level float(asr_m asr_f)
                           1 4          0   67.82971
                           1 2          0  26.759825
                           1 1          0  25.963284
                           1 3          0   43.95725
                           2 1 -19.016777          0
                           2 4 -73.918816          0
                           2 2  -15.76385          0
                           2 3 -27.404636          0
                           8 2          0  4.4332685
                           8 3          0   6.046227
                           8 1          0   4.207021
                           8 4          0   7.952416
                          21 2  -.8454071   .8163553
                          21 3 -1.1452693   1.408773
                          21 1  -.2142478   .2325141
                          21 4 -1.6599944  1.6487027
                          24 4 -2.0529184  1.7334282
                          24 1  -.6108794   .4142609
                          24 3 -1.5452455  1.0627611
                          24 2  -.6651382   .3421712
                          25 2  -.4800311          0
                          25 1  -.3559546          0
                          25 4  -5.013893          0
                          25 3 -1.6643234          0
                          27 1  -.6605381   .3345237
                          27 3 -.13699527 .026802106
                          27 4 -.13703632  .01266802
                          27 2 -.21155193 .070402704
                          end
                          label values level level
                          label def level 1 "Low", modify
                          label def level 2 "Medium", modify
                          label def level 3 "High", modify
                          label def level 4 "Very High", modify
                          
                          replace asr_m = -asr_m
                          mvdecode asr*, mv(0)
                          gen log_asr_m = log(asr_m)
                          gen log_asr_f = log(asr_f)
                          
                          mylabels .01 .02 0.05 0.1 0.2 0.5 1 2 5 10 20 50, myscale(log(@)) local(labels)
                          
                          graph dot (asis) log_asr_m log_asr_f, over(level, gap(0)) over(cancer, gap(*0.8))  ///
                          graphregion(color(white)) legend(off) plotregion(margin(zero)) yla(`labels', labsize(*0.8)) ysc(r(. `=log(90)')) ///
                          ysize(10) linetype(line) lines(lc(gs12) lw(vvthin)) marker(1, ms(+) msize(*1.2) mc(blue)) marker(2, ms(Oh) msize(*1.2) mc(pink))

                          [Attached graph: miranda2.png]
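
                          A small sketch of a sanity check on the data preparation steps, using made-up toy values rather than the full data: after flipping the male rates and recoding zeros to missing, every remaining value should be strictly positive, and the logs should be missing exactly where the rates are.

                          ```stata
                          * Toy data, made up for illustration only
                          clear
                          input float(asr_m asr_f)
                          -19.0  0
                          -0.85  0.82
                           0     4.43
                          end

                          * Flip male rates back to positive; treat 0 as "not plotted"
                          replace asr_m = -asr_m
                          mvdecode asr_m asr_f, mv(0)

                          * Everything left must be strictly positive before logging
                          assert asr_m > 0 if !missing(asr_m)
                          assert asr_f > 0 if !missing(asr_f)

                          gen log_asr_m = log(asr_m)
                          gen log_asr_f = log(asr_f)

                          * Logs are missing only where the rates were missing
                          assert missing(log_asr_m) == missing(asr_m)
                          assert missing(log_asr_f) == missing(asr_f)
                          ```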
