  • Formatting Bar graph

    Hi everyone,
    Is there any way to make this bar graph more readable? 'A', 'B', and 'C' have small percentages while D has a very high percentage, making A, B, and C too small to read.
    Is there any way to drop bar 'D' and expand the scale on the y-axis, since I am primarily interested in the trend of A, B, and C? Or what other approach would you advise?
    Thank you for the anticipated response.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(state1 TrendData America year)
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 2 2
    . 1 3 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 . 2
    . 1 . 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 . 2
    . 1 4 2
    . 1 1 2
    . 1 4 2
    . 1 4 2
    . 1 1 2
    . 1 3 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    1 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    1 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    1 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 3 2
    . 1 4 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    . 1 4 2
    . 1 4 2
    . 1 1 2
    . 1 4 2
    . 1 3 2
    . 1 4 2
    . 1 . 2
    . 1 4 2
    end
    label values America DevelopmentalCategories
    label def DevelopmentalCategories 1 "A", modify
    label def DevelopmentalCategories 2 "B", modify
    label def DevelopmentalCategories 3 "C", modify
    label def DevelopmentalCategories 4 "D", modify

    Last edited by Olubunmi Adebiyi; 09 May 2023, 10:30.

  • #2
    This is the bar graph

    [Attached image: State1.png]



    • #3
      To exclude a category, use the -if- qualifier, e.g.,

      Code:
      gr bar (sum) TrendData if America!=4, over(America)

      From #1: "Is there any way to make this bar graph more readable? 'A', 'B', and 'C' have small percentages while D has a very high percentage, making A, B, and C too small to read."

      In such cases, you may consider a transformation that accommodates both 0 and positive values and then relabel the y-axis. Cube roots are one example. See https://journals.sagepub.com/doi/pdf...867X0800800113 for some discussion on plotting on transformed scales.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float America double TrendData
      1  3
      2  2
      3  4
      4 69
      end
      label values America DevelopmentalCategories
      label def DevelopmentalCategories 1 "A", modify
      label def DevelopmentalCategories 2 "B", modify
      label def DevelopmentalCategories 3 "C", modify
      label def DevelopmentalCategories 4 "D", modify
      
      gen T3= TrendData^(1/3)
      set scheme s1mono
      gr bar TrendData, over(America)  ytitle("") saving(gr1, replace)
      gr bar T3, over(America) ytitle("") saving(gr2, replace)  ///
      ylab(0 "0"  `=1^(1/3)' "1" `=2^(1/3)' "2" `=5^(1/3)' "5" `=10^(1/3)' "10" ///
      `=20^(1/3)' "20" `=40^(1/3)' "40" `=60^(1/3)' "60" `=80^(1/3)' "80")
      
      gr combine gr1.gph gr2.gph
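
      The hand-typed ylab() list above could also be built in a loop, which makes it easier to change the tick values later. This is only a sketch: it assumes T3 and America exist as created in the code above, and the local macro names labels and pos are introduced here purely for illustration.

      ```stata
      * Build the ylabel() argument piece by piece:
      * each tick is placed at the cube root of a value but labelled with the value itself
      local labels
      foreach v of numlist 0 1 2 5 10 20 40 60 80 {
          local pos = `v'^(1/3)
          local labels `labels' `pos' "`v'"
      }

      * Same bar chart on the cube-root scale, using the generated labels
      gr bar T3, over(America) ytitle("") ylab(`labels')
      ```

      Changing the numlist is then enough to change the axis labels; the tick positions follow automatically.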
      [Attached image: Graph.png]

      Last edited by Andrew Musau; 09 May 2023, 13:10.



      • #4
        #1 and #2 don't show the code for the bar chart, but by eye the values appear to sum to about 100 across all categories A, B, C, D and all years. It seems more likely that percentages are better calculated separately for each year. True or not, the heights of the bars are taken here to be proportional to the amounts to be shown; if that's wrong, I don't think any rescaling of the values would imply a different graph from the one I am going to suggest.

        Here I used tabplot from the Stata Journal

        SJ-22-2 gr0066_3 . . . . . . . . . . . . . . . . Software update for tabplot
        (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
        Q2/22 SJ 22(2):467
        bug fixed; help file updated to include further references

        SJ-20-3 gr0066_2 . . . . . . . . . . . . . . . . Software update for tabplot
        (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
        Q3/20 SJ 20(3):757--758
        added new options frame() and frameopts() allowing framing
        of bars and so-called thermometer plots or charts

        SJ-17-3 gr0066_1 . . . . . . . . . . . . . . . . Software update for tabplot
        (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
        Q3/17 SJ 17(3):779
        added options for reversing axis scales; improved handling of
        axis labels containing quotation marks

        SJ-16-2 gr0066 . . . . . . Speaking Stata: Multiple bar charts in table form
        (help tabplot if installed) . . . . . . . . . . . . . . . . N. J. Cox
        Q2/16 SJ 16(2):491--510
        provides multiple bar charts in table form representing
        contingency tables for one, two, or three categorical variables


        and suggest that although the amounts for D are clearly much larger than other categories, they don't dominate the graph unduly. Naturally I agree with Andrew Musau that you have a choice to omit D from the graph. On this occasion cube roots are likely to seem puzzling to most potential readers, useful though they can be.

        A detail with tabplot is that you can easily show the values themselves, producing a kind of hybrid table and plot, hence the name.

        Naturally you can specify different colours, and so forth.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte year str1 which float outcome
        1 "A" 1.5
        2 "A"   3
        3 "A" 1.5
        4 "A"   0
        1 "B"  .5
        2 "B" 1.5
        3 "B" 1.5
        4 "B"  .5
        1 "C" 1.5
        2 "C"   3
        3 "C"   4
        4 "C" 1.5
        1 "D"  22
        2 "D"  28
        3 "D"  15
        4 "D"  15
        end
        
        tabplot which year [iw=outcome] , showval xla(, nogrid)
        [Attached image: ABCDtabplot.png]



        • #5
          Thank you so much, Professors Nick Cox and Andrew Musau. While both tabplot and a transformed scale are great ways to present these data, I think omitting 'D' will make the graph much easier for my readers to read. My only concern is whether omitting 'D' affects the individual percentages of A, B, and C.

          I am sorry I didn't include the code for the graph initially:

          graph bar (percent) TrendData, over(America) asyvars over(year) bar(1, color(red)) bar(2, color(blue)) bar(3, color(black)) legend(rows(1))



          • #6
            As guessed, with your code, percents are calculated from the total of all values across both over() arguments.

            If some other calculation makes more sense, you need different syntax.
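
            One possible reading of "different syntax", offered as a sketch rather than a confirmed solution: count the observations in each category and year, compute percents within each year while D is still in the denominator, and only then leave D out of the plot. The variable names TrendData, America and year come from #1 and #5; n, yeartotal and pct are hypothetical names introduced here. Note that adding if America != 4 directly to graph bar (percent) would recompute the percents over the remaining observations, so the values for A, B, and C would change.

            ```stata
            preserve

            * Keep the observations that graph bar (percent) would count
            keep if !missing(TrendData, America, year)

            * One row per category-year cell, with its frequency in n
            contract America year, freq(n)

            * Percent within each year, with D still included in the total
            bysort year: egen yeartotal = total(n)
            generate pct = 100 * n / yeartotal

            * Plot A, B, and C only; percents were fixed before D was excluded
            graph bar (asis) pct if America != 4, over(America) asyvars over(year) ///
                bar(1, color(red)) bar(2, color(blue)) bar(3, color(black))       ///
                legend(rows(1)) ytitle("Percent within year")

            restore
            ```

            If percents of the grand total are wanted instead, as in the code in #5, replacing the bysort line with egen yeartotal = total(n) should give that denominator.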
