  • Diverging stacked bar chart

    Hi everyone,

    I want to display the frequencies of one categorical variable (attitude) over the categories of another (country) in a stacked bar chart. The attitude variable has 5 (Likert scale) categories. I have two specific aims, which I do not know how to accomplish using Stata:
    1. The middle or neutral category of the attitude variable should be at the center of the graph over all the countries.
    2. The stacked bars should be sorted according to the frequency of one (or more) categories of the attitude variable.
    For an example of what I have in mind, see:
    Using Stata 12.1, I tried the following, which did not exactly yield the desired results:

    Code:
    slideplot hbar attitude , by(country) percent neg(1 2 3) pos(4 5)
    But slideplot cannot sort the bars. Also, you have to choose whether the middle category 3 goes to the left or to the right, or else leave it out completely.

    Code:
    tab attitude, gen(attitudeCat)
    graph hbar attitudeCat1 attitudeCat2 attitudeCat3 attitudeCat4 attitudeCat5 ///
    , percent stack over(country, sort(5) descending)
    With graph hbar I can at least sort the stacked bars according to one of the categories, but I still cannot center them around their middle category.

    Does anyone have a suggestion of how to do this? Any help is much appreciated!

    Kind regards,
    Uwe

    I produced a tiny example dataset in case you need it:

    Code:
    clear
    input float attitude long country
     4 2
     4 2
     5 2
     4 2
     5 2
     2 2
     5 2
     4 2
     5 2
     4 2
     5 2
     1 2
     1 2
     1 2
    .c 2
     5 3
     3 3
     2 3
     4 3
     4 3
     5 3
    .c 3
     5 3
     5 3
     5 3
     4 3
     1 3
     1 3
     1 3
     1 3
    .c 4
     3 4
     4 4
     5 4
     3 4
     5 4
     4 4
     4 4
     5 4
     4 4
    .c 4
     5 5
     5 5
     3 5
     5 5
     4 5
     3 5
     4 5
     2 5
     4 5
     5 5
     4 5
     5 5
     2 5
     4 5
     4 5
     5 5
     4 5
     5 5
     3 5
     5 5
     4 5
    .c 5
     1 6
     4 6
     5 6
     4 6
     3 6
     4 6
     4 6
     5 6
     4 6
     2 6
     4 6
     4 6
     2 6
     1 6
     5 6
     2 6
     2 6
     5 6
     4 6
     4 6
     4 6
     4 7
     4 7
     5 7
     1 7
     4 7
     3 7
     4 7
     5 7
     1 7
     1 7
     5 7
     5 7
     5 7
     5 7
     5 7
     5 7
     5 7
     5 7
     5 7
     4 7
     5 7
     2 7
     4 7
    end
    label values country country
    label def country 2 "A", modify
    label def country 3 "B", modify
    label def country 4 "C", modify
    label def country 5 "D", modify
    label def country 6 "E", modify
    label def country 7 "F", modify

  • #2
    I think this isn't the best, but it is in the right direction.


    • #3
      slideplot is from SSC, as you are asked to explain (FAQ Advice #12).

      This is just a flag that writing a similar command but based on twoway bar not graph bar is on my agenda, but my time scale is at best months.
      Wanting a middle category to straddle the axis is a strong reason for a rewrite.

      Exactly when I do it is more than usually subject to caprice, as it will depend on my own work stumbling on an example where that kind of graph seems right.
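
      Until such a command exists, one way to get the middle category to straddle zero by hand is to compute the bar limits yourself and draw them with twoway rbar. What follows is a rough sketch, not a finished program. It assumes that the data example from #1 is in memory, that attitude takes the values 1 to 5, and that country is coded 2 to 7. The variable names n, total, pct, lower, upper, mid3 and shift are made up for illustration.

```stata
* A sketch only; assumes the data example from #1 (attitude, country) is in memory.
* This replaces the data in memory: wrap it in preserve ... restore if needed.
drop if missing(attitude)                  // .c and any other missings are excluded
contract country attitude, zero freq(n)    // one row per country x category, zeros included
egen total = total(n), by(country)
gen pct = 100 * n / total
bysort country (attitude): gen upper = sum(pct)   // cumulative percent within country
gen lower = upper - pct
* shift each country so that the midpoint of category 3 sits at zero
gen mid3 = (lower + upper) / 2 if attitude == 3
egen shift = max(mid3), by(country)
replace lower = lower - shift
replace upper = upper - shift
twoway (rbar lower upper country if attitude == 1, horizontal barwidth(0.8)) ///
       (rbar lower upper country if attitude == 2, horizontal barwidth(0.8)) ///
       (rbar lower upper country if attitude == 3, horizontal barwidth(0.8)) ///
       (rbar lower upper country if attitude == 4, horizontal barwidth(0.8)) ///
       (rbar lower upper country if attitude == 5, horizontal barwidth(0.8)) ///
       , ylabel(2/7, valuelabel angle(0)) ytitle("") xtitle("percent") xline(0) ///
       legend(order(1 "1" 2 "2" 3 "3" 4 "4" 5 "5") rows(1))
```

      Because the bars are positioned on a numeric axis, they could be sorted by plotting against an ordering variable rather than country itself. The default colors are not a diverging palette, so a color() option on each rbar call would probably be wanted.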

      The variables mentioned in your code are not those in your sample data. No matter, as the structure in the latter is easier to work with.

      The remainder of this is irrelevant to you if you really want that specific graph, but may still be of interest to others.

      With your data example, I tried tabplot (SJ)

      Stata Journal subscribers can see http://www.stata-journal.com/article...article=gr0066

      Others can pay USD 11.75 (no royalties go to me!) or see a fairly detailed write-up at http://www.statalist.org/forums/foru...updated-on-ssc but they can and should install the program from the Stata Journal files if interested.


      Code:
       tabplot attitude country, showval percent(country) yasis subtitle(country distributions)
      [Image: notaslideplot.png]

      Let's say that we want to sort on the proportion of 4s and 5s. This is easy with a trick or two:


      Code:
      gen high = inlist(attitude, 4, 5) if attitude < . 
      egen phigh = mean(high), by(country)
      egen order = group(phigh country)
      * search labmask to find files and then install
      labmask order, values(country) decode
      tabplot attitude order, showval percent(country) yasis subtitle(country distributions) xtitle("")
      We first create an indicator for being 4 or 5. Then we get the proportion of such values by averaging. Then we order the countries on that measure. Just in case there are ties, we need to break those ties.

      The one devious trick is to get the values (in fact the value labels!) of country to be the value labels of the new ordering variable. For that labmask (SJ) is a lazy work-around.
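
      If installing labmask is not an option, much the same effect can probably be had with official commands only, by copying each country's value label to the matching value of order. This is a sketch under the assumption that order was created as above, so that each value of order corresponds to exactly one country. The label name orderlbl is made up for illustration.

```stata
* assumes order and country exist as created above
levelsof order, local(levels)
foreach l of local levels {
    * order is one-to-one with country, so min and max of country coincide here
    summarize country if order == `l', meanonly
    local c = r(min)
    * look up the value label that country attaches to this code
    local lbl : label (country) `c'
    label define orderlbl `l' `"`lbl'"', modify
}
label values order orderlbl
```

      After this, tabplot attitude order should show the country names on the axis in the new order, just as with labmask.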
      [Image: notaslideplot2.png]

      Last edited by Nick Cox; 05 Sep 2016, 08:31.


      • #4
        "The variables mentioned in your code are not those in your sample data."
        Incorrect. Sorry about that.


        • #5
          Thanks for the swift answer! I am looking forward to your new command. In the meantime I'll stick to "conventional" stacked bar graphs.


          • #6
            On the "small world" principle I note that Naomi and Richard in the paper you cite in #1 acknowledge my sending them references to earlier uses of the graph you want. There is a much longer paper https://www.jstatsoft.org/article/vi...i05/v57i05.pdf

            I have a 1939 reference as the earliest I know to the graph you want and a 1933 reference as the earliest I know to the graph in #3. So which is "conventional" is up for grabs.

            I will just note that rare categories are hard to spot on any stacked design, especially those with zero frequency.
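
            A quick check of which cells are empty can help here. This is a sketch, assuming the data example from #1 is in memory; contract with its zero option fills in the combinations that never occur, and _freq is the frequency variable it creates by default.

```stata
* assumes the data example from #1 (attitude, country) is in memory
preserve
* one row per country x attitude combination, including those with zero frequency
contract country attitude if !missing(attitude), zero
list country attitude if _freq == 0, noobs sepby(country)
restore
```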
