  • Bar or dot charts with three-way structure

    This is a fork of the thread https://www.statalist.org/forums/for...g-aspect-ratio started by paulvonhippel.

    I don't want to detract or distract if the theme is grc1leg -- written by Vince Wiggins --

    Code:
    net describe grc1leg, from(http://www.stata.com/users/vwiggins)
    or

    grc1leg2 -- written by Mead Over --

    Code:
    net describe grc1leg2, from(http://digital.cgdev.org/doc/stata/MO/Misc)

    Rather, Paul's dataset is a nice sandbox to explore a graphical challenge. Paul has 4 variables, 7 regions and 2 time periods -- let's call that a three-way structure and recognise that 1 variable and 3 categorical controls would pose a similar problem.

    The underlying message here is to use your data as they stand or, if need be, to restructure them -- if you can -- so that one dimension is handled by a by() option. That's a way to avoid graph combine and its side-effects and is written up already at https://www.stata-journal.com/articl...article=gr0085 -- but a direct example rarely hurts.
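
    As a minimal sketch of that contrast, here is the same small display built both ways. It uses Stata's bundled auto data, which is an assumption made purely for illustration and is not Paul's dataset; the graph names g0 and g1 are likewise arbitrary.

    Code:
    sysuse auto, clear

    * graph combine route: one graph per group, then assemble them
    graph dot (mean) mpg if foreign == 0, over(rep78) title("Domestic") name(g0, replace)
    graph dot (mean) mpg if foreign == 1, over(rep78) title("Foreign") name(g1, replace)
    graph combine g0 g1

    * by() route: one command, with panel titles and layout handled for you
    graph dot (mean) mpg, over(rep78) by(foreign, note(""))

    With graph combine each panel is drawn independently and then shrunk to fit, so axes, text sizes and legends have to be reconciled by hand. With by() the panels come from a single graph command.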

    I don't recognise the variables from their TLAs (three letter acronyms), so let's just mention that variable labels or value labels would help in a published version.

    In graphics there are always choices -- to match the data, the intended emphasis, the readership and personal taste. Here are some choices and there are others.

    Code:
    clear
    
    input byte r_id str26 region float(period year) double(gfd shr kme klf)
    1 "East Asia & Pacific"        1 1  8325.703482570614 221713507.51180354 312545535391.22266  9.884943733931932
    1 "East Asia & Pacific"        2 2  9962.102881752555  291994199.0371205  374296140523.1998 10.281044165113197
    2 "Europe & Central Asia"      1 1   24102.8204969368   538919571.786615 538401194443.77637 19.022858084454136
    2 "Europe & Central Asia"      2 2  25799.98590965781  654494471.3396385  581745338410.1671 23.309416601936224
    3 "Latin America & Caribbean"  1 1  9455.280580949593  80081604.26769644  76151826017.24394  6.093767684310448
    3 "Latin America & Caribbean"  2 2   9545.11053660585 106220095.91957214  98404982310.05264  7.355575546132802
    4 "Middle East & North Africa" 1 1  7373.541711250463   89946551.4761219  81922279949.52919 16.905612650025958
    4 "Middle East & North Africa" 2 2 7777.9616878091565  99544773.86252138 106317425321.04172 19.484252374025836
    5 "North America"              1 1  49559.27810369779           83961914       2.209272e+11  7.247645902684366
    5 "North America"              2 2  53454.28768786681           98022165 270455750000.00006  7.239042277091585
    6 "South Asia"                 1 1 1368.5156808676866  12190005.91249332        2.40586e+10  11.19949571592668
    6 "South Asia"                 2 2 1780.9751084779448 23091459.198763825        3.49224e+10 10.671254320423737
    7 "Sub-Saharan Africa"         1 1  1645.103030610854 38776587.938939884 29995171741.764538 10.925456728424516
    7 "Sub-Saharan Africa"         2 2 1682.6993478548713  45598841.52158082  32372205773.37823  10.55520839362721
    end
    label values year fiveyr
    label def fiveyr 1 "2010-2014", modify
    label def fiveyr 2 "2015-2019", modify
    
    replace shr = shr/1e9
    
    replace kme = kme/1e9
    
    rename (gfd shr kme klf) (whatever=)
    
    reshape long whatever, i(region year) j(which) string
    
    label def WHICH 1 gfd 2 shr 3 kme 4 klf
    encode which, gen(WHICH)
    
    set scheme s1color
    
    * myaxis is a community-contributed command; install it first with: ssc install myaxis
    * https://www.statalist.org/forums/forum/general-stata-discussion/general/1598767-myaxis-available-from-ssc-reorder-categorical-variables-especially-for-later-table-or-graph-use
    myaxis region2 = region, sort(mean whatever) subset(year==2 & which=="gfd") descending
    
    graph dot whatever, over(year) over(region2) ///
        by(WHICH, xrescale note("")) scheme(s1color) ///
        asyvars ytitle("") xsize(9) ///
        marker(1, ms(Oh)) marker(2, ms(+)) ///
        linetype(line) lines(lc(gs12) lw(thin))

    [Figure: vonhippel.png -- the dot chart produced by the code above]

    The key points as I see them include:

    1. As above, a by() option can yield a multipanel display and take care of much of the housekeeping implied.

    2. Dot charts as (re-)introduced by W.S. Cleveland in 1984 can sometimes work as well as or better than bar charts.

    3. Alphabetical order is good in dictionaries and directories, but usually a weak choice for graphs or even tables. myaxis from SSC is a tool to re-order categories.

    4. The xrescale suboption is helpful (needed!) to cope with different scales, but even so re-scaling (here, dividing two variables by a billion) can help too.

    5. Some might regard the region labels in the right-hand column as redundant. Oddly, the noiylabel suboption of by() is ignored here but you can chop them out using the Graph Editor.

    Whether a logarithmic scale would help here I leave to one side.
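
    For anyone who wants to try it, a minimal sketch follows. It assumes the reshaped data and the region2 variable from the code above are still in memory, and it only adds yscale(log) to the earlier command. Whether the result reads better for these data is not checked here.

    Code:
    * same dot chart as before, with the numeric axis on a logarithmic scale
    graph dot whatever, over(year) over(region2) ///
        by(WHICH, xrescale note("")) scheme(s1color) ///
        asyvars ytitle("") xsize(9) ///
        marker(1, ms(Oh)) marker(2, ms(+)) ///
        linetype(line) lines(lc(gs12) lw(thin)) ///
        yscale(log)

    All values in this dataset are positive, so a log scale is at least admissible. The axis labels may then need setting by hand with ylabel().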