
  • Graph combine with different legends

    Hi,

    I'd like to produce five graphs - one for each state - while excluding bars with zero frequency. The code below achieves this:

    Code:
    levelsof state, local(state)
    local vlname: value label state /* Local list of value labels */
    local i=1
    foreach s of local state {
        local vl: label `vlname' `s' /* Local value label for state=s */
        preserve
        use taxes_paid, clear
        keep if state==`s'
            
        bysort taxes: egen tot=total(taxes_paid)
        *drop if tot==0
        
        graph bar taxes_paid if tot!=0, over(taxes, sort(1) descending) ///
            ysc(r(0 1)) yla(0(0.2)1) ///
            blabel(bar, format(%4.0f) size(small))  asyvars ///
            legend(rows(3)  size(small) position(6) ) scheme(modern) ///
            ytitle("") title("`vl'", span size(medium)) ///
            graphregion(color(white) lwidth(vsmall)) ylabel(, angle(h)) name(f`i', replace)
            
        local ++i
        restore
    }
    I'd also like to combine the graphs using a single legend, despite them having different legends. Any suggestions?
    Thank you

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(key state taxes taxes_paid)
     1 1 1 0
     1 1 2 1
     1 1 3 0
     1 1 4 0
     1 1 5 0
     2 2 1 1
     2 2 2 1
     2 2 3 0
     2 2 4 0
     2 2 5 0
     3 3 1 0
     3 3 2 0
     3 3 3 1
     3 3 4 0
     3 3 5 0
     4 4 1 1
     4 4 2 1
     4 4 3 1
     4 4 4 0
     4 4 5 0
     5 5 1 1
     5 5 2 1
     5 5 3 0
     5 5 4 0
     5 5 5 0
     6 5 1 1
     6 5 2 1
     6 5 3 0
     6 5 4 0
     6 5 5 0
     7 4 1 1
     7 4 2 1
     7 4 3 1
     7 4 4 0
     7 4 5 0
     8 5 1 1
     8 5 2 1
     8 5 3 1
     8 5 4 0
     8 5 5 0
     9 1 1 0
     9 1 2 0
     9 1 3 0
     9 1 4 0
     9 1 5 0
    10 3 1 1
    10 3 2 0
    10 3 3 0
    10 3 4 0
    10 3 5 0
    11 5 1 1
    11 5 2 0
    11 5 3 1
    11 5 4 1
    11 5 5 0
    12 5 1 1
    12 5 2 1
    12 5 3 0
    12 5 4 0
    12 5 5 0
    13 5 1 1
    13 5 2 1
    13 5 3 1
    13 5 4 0
    13 5 5 0
    14 5 1 1
    14 5 2 1
    14 5 3 0
    14 5 4 0
    14 5 5 0
    15 5 1 1
    15 5 2 0
    15 5 3 1
    15 5 4 1
    15 5 5 0
    16 5 1 1
    16 5 2 0
    16 5 3 0
    16 5 4 0
    16 5 5 0
    17 2 1 1
    17 2 2 1
    17 2 3 1
    17 2 4 0
    17 2 5 0
    18 5 1 1
    18 5 2 0
    18 5 3 1
    18 5 4 1
    18 5 5 1
    19 5 1 1
    19 5 2 0
    19 5 3 1
    19 5 4 0
    19 5 5 0
    20 5 1 1
    20 5 2 1
    20 5 3 1
    20 5 4 0
    20 5 5 0
    end
    forvalues i=1/5 {
    la def state `i' "State `i'", modify
    la def taxes `i' "Tax `i'", modify
    }
    la val state state
    la val taxes taxes
    
    replace taxes_paid=100*taxes_paid
    save taxes_paid, replace



  • #2
    Vince Wiggins of StataCorp wrote a program, -grc1leg-, that does exactly this. Although he is a StataCorp employee, this is not an official Stata command. But it will serve your purposes. You can install it from http://www.stata.com/users/vwiggins.
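
    A minimal sketch of how -grc1leg- is typically installed and called. It uses the auto dataset shipped with Stata rather than the data in this thread, and the graph names g1 and g2 are made up for illustration. Note that the shared legend is copied from the single graph named in legendfrom(), so it shows only the keys present in that graph.

    ```stata
    * Install the community-contributed command once
    net install grc1leg, from(http://www.stata.com/users/vwiggins)

    * Two graphs that share the same legend keys (illustrative data)
    sysuse auto, clear
    twoway (scatter mpg weight if foreign == 0)      ///
           (scatter mpg weight if foreign == 1),     ///
           legend(order(1 "Domestic" 2 "Foreign")) name(g1, replace)
    twoway (scatter price weight if foreign == 0)    ///
           (scatter price weight if foreign == 1),   ///
           legend(order(1 "Domestic" 2 "Foreign")) name(g2, replace)

    * Combine them, showing only the legend taken from g1
    grc1leg g1 g2, legendfrom(g1)
    ```

    For the graphs f1 to f5 created in #1, the corresponding call would be along the lines of grc1leg f1 f2 f3 f4 f5, legendfrom(f5).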



    • #3
      I'd approach the task quite differently.

      Having to produce separate graphs and then combine them is entirely a self-inflicted burden. Just use the by() option or a by: prefix throughout.

      There are several other puzzling details in your code.

      1. Multiplication of values by 100 doesn't fit your instruction to show axis scales from 0 to 1 and axis labels within that range.

      2. The value labels are just repetitive.

      3. A legend isn't needed as direct labelling is possible.

      Here's a bar chart a bit like what you seem to want and one using tabplot from the Stata Journal.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte(key state taxes taxes_paid)
       1 1 1 0
       1 1 2 1
       1 1 3 0
       1 1 4 0
       1 1 5 0
       2 2 1 1
       2 2 2 1
       2 2 3 0
       2 2 4 0
       2 2 5 0
       3 3 1 0
       3 3 2 0
       3 3 3 1
       3 3 4 0
       3 3 5 0
       4 4 1 1
       4 4 2 1
       4 4 3 1
       4 4 4 0
       4 4 5 0
       5 5 1 1
       5 5 2 1
       5 5 3 0
       5 5 4 0
       5 5 5 0
       6 5 1 1
       6 5 2 1
       6 5 3 0
       6 5 4 0
       6 5 5 0
       7 4 1 1
       7 4 2 1
       7 4 3 1
       7 4 4 0
       7 4 5 0
       8 5 1 1
       8 5 2 1
       8 5 3 1
       8 5 4 0
       8 5 5 0
       9 1 1 0
       9 1 2 0
       9 1 3 0
       9 1 4 0
       9 1 5 0
      10 3 1 1
      10 3 2 0
      10 3 3 0
      10 3 4 0
      10 3 5 0
      11 5 1 1
      11 5 2 0
      11 5 3 1
      11 5 4 1
      11 5 5 0
      12 5 1 1
      12 5 2 1
      12 5 3 0
      12 5 4 0
      12 5 5 0
      13 5 1 1
      13 5 2 1
      13 5 3 1
      13 5 4 0
      13 5 5 0
      14 5 1 1
      14 5 2 1
      14 5 3 0
      14 5 4 0
      14 5 5 0
      15 5 1 1
      15 5 2 0
      15 5 3 1
      15 5 4 1
      15 5 5 0
      16 5 1 1
      16 5 2 0
      16 5 3 0
      16 5 4 0
      16 5 5 0
      17 2 1 1
      17 2 2 1
      17 2 3 1
      17 2 4 0
      17 2 5 0
      18 5 1 1
      18 5 2 0
      18 5 3 1
      18 5 4 1
      18 5 5 1
      19 5 1 1
      19 5 2 0
      19 5 3 1
      19 5 4 0
      19 5 5 0
      20 5 1 1
      20 5 2 1
      20 5 3 1
      20 5 4 0
      20 5 5 0
      end
      forvalues i=1/5 {
      la def state `i' "State `i'", modify
      * la def taxes `i' "Tax `i'", modify
      }
      la val state state
      * la val taxes taxes
      
      replace taxes_paid=100*taxes_paid
              
      bysort state taxes: egen tot=total(taxes_paid)
      
      graph bar taxes_paid if tot!=0, over(taxes, sort(1) descending) ///
      by(state, note("")) ysc(r(0 110)) yla(0(20)100, ang(h)) ///
      blabel(bar, format(%4.0f) size(small))  ///
      ytitle("") graphregion(color(white) lwidth(vsmall)) name(G1, replace)
      
      egen toshow = mean(taxes_paid) if tot != 0, by(taxes state)
      egen tag = tag(taxes state)
      
      tabplot taxes state if tag [iw=toshow], subtitle("") xtitle("") separate(taxes) showval(format(%4.0f)) name(G2, replace)

      [Image: orgeira_G1.png]


      • #4
        Yet another graphic with more emphasis on ranking and ignoring zeros. (Multiple colours in practice don't seem to help (me) at all.)

        Code:
        graph hbar toshow if tag,  over(taxes, sort(1) descending) over(state) nofill blabel(bar, format(%4.0f)) ysc(alt r(0 110)) yla(0(25)100) ytitle(% whatever)

        [Image: orgeira_G3.png]


        • #5
          Hi Clyde Schechter and Nick Cox,

          Thank you so much for your help.

          @Clyde - thanks for suggesting the command grc1leg. I tried it before but unfortunately couldn't make it work. It takes the legend from one of the five graphs, none of which contains the entire list of taxes in its legend, as I exclude zero-frequency taxes from each graph (here the legend for state 1 would contain only tax 2, and the legend for state 2 only taxes 1, 2 and 3).

          @Nick - thank you for these alternative graph suggestions. I really like your second and third graphs.
          The data I shared was for illustration only. The full data actually contains 15 types of taxes, each of them labeled, which leads to crowding in graphs 1 and 3 (see graph 1 below). A color-coded legend would make the graphs less crowded - do you know whether this would be feasible? Sorry for not being clearer.

          Thanks again

          [Image: G1.png]


          • #6
            You should try hbar not bar to see 15 text items. You may need to be severe with abbreviations.

            A legend with 15 colours wouldn't help at all, I guess. Almost half the space would need to be dedicated to the legend itself, and the effort of going back and forth between the legend and the graph proper would backfire on you.



            • #7
              Thank you Nick Cox for your help.

              I was able to reproduce the tabplot (graph 2). I then tried to display both the percentage and the sample size, as in your post in this thread, but with the bar heights representing the percentage rather than the sample size as in that thread.

              [Image: graph.png]


              However, I have three small questions:
              a) Is there a way to get different colors by row, as in your example? (I'm not sure why the colors differ here.)
              b) I would like to also display the percentage and sample size even if it's equal to zero (sample size or %). Do you know if this would be feasible?
              c) In the x-axis, values 1-3, 4-6 and 7-9 refer to three different concepts. I was wondering whether it would be possible to add a second x-axis label looking like this:

              [Image: graph2.png]

              Please find below the raw data and code used to generate the tabplot.

              Thank you so much

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input float(id state var1_1 var1_2 var1_3 var2_1 var2_2 var2_3 var3_1 var3_2 var3_3)
              34 1 1 1 0 1 1 0 0 0 1
              36 1 1 . . 1 . . 1 0 .
              67 1 1 1 0 1 1 0 0 0 .
              42 1 1 . . 1 . . 1 . .
              39 1 1 1 0 1 0 0 0 0 0
               6 2 1 1 1 1 1 0 0 0 0
               7 2 1 1 1 1 1 . 1 1 .
               1 2 1 1 . 1 0 . 0 0 1
               3 2 1 1 0 1 1 0 0 0 .
               5 2 1 1 0 1 1 . 1 0 .
              12 3 1 1 . . . 0 . . .
              18 3 0 0 0 0 0 0 0 0 0
              13 3 1 1 . 1 1 . 1 . .
              17 3 1 1 1 1 1 1 0 0 .
              16 3 1 1 0 1 0 0 0 0 0
              24 4 1 1 . 0 0 0 1 1 .
              30 4 0 0 . . . 0 . 1 .
              26 4 1 1 1 1 1 1 0 0 .
              19 4 1 1 0 1 1 0 1 1 .
              29 4 1 1 . 1 1 . 1 1 .
              71 5 1 1 . 1 1 . 0 0 .
              74 5 1 1 1 1 1 0 0 0 1
              75 5 1 1 . 1 1 . 0 0 .
              73 5 1 1 1 1 1 . 0 0 .
              80 5 1 1 1 1 1 1 1 1 0
              end
              
              
              reshape long var1_ var2_ var3_, i(id state) j(authority)
              rename *_ *
              
              reshape long var, i(id state authority) j(concept)
              
              * Compile variables authority and concept into a single variable (in the x-axis)
              gen authority2=(concept-1)*3+authority
              drop authority concept
              rename authority2 authority
              
              * Compute sample size and % when response is provided
              gen available_info=!missing(var)
              bysort state authority: egen _freq=total(available_info)
              
              egen denom = total(_freq) if available_info==1, by(authority state)
              egen numer = total(_freq * (var == 1)), by(authority state)
              gen pc = 100 * numer/denom
              label var pc "% adopting"
              
              gen toshow = string(pc, "%2.0f") + "% (" + string(_freq) + ")"
              egen toshow2= mean(var) if _freq != 0, by(state authority)
              
              egen tag = tag(state authority available_info) if available_info==1
              
              tabplot state authority if tag [iw=toshow2], subtitle("") xtitle("") ytitle("") ///
                  separate(var) showval(toshow, mlabsize(tiny)) scheme(modern)



              • #8
                a) Is there a way to get different colors by row, as in your example? (I'm not sure why the colors differ here.)
                You're asking for different colours according to variable var. As you don't want that, you shouldn't ask for that. separate(state) might be closer to what you want.

                b) I would like to also display the percentage and sample size even if it's equal to zero (sample size or %). Do you know if this would be feasible?
                This isn't supported by tabplot. A hole without a bar is seen when there are no corresponding observations. I wouldn't say that wasn't programmable, but it's not supported by tabplot.

                c) In the x-axis, values 1-3, 4-6 and 7-9 refer to three different concepts. I was wondering whether it would be possible to add a second x-axis label looking like this:
                See https://stackoverflow.com/questions/...-common-y-axis and https://journals.sagepub.com/doi/pdf...36867X19874264 for some technique.
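
                On point (a), a minimal sketch of what separate() on the row variable does. The tiny dataset and the variable name share are invented for illustration and are not the data from #7. tabplot is community-contributed and is assumed to be installed already (ssc install tabplot).

                ```stata
                * Illustrative data: one value per state-by-authority cell
                clear
                input byte(state authority) float(share)
                1 1 .8
                1 2 .5
                2 1 .6
                2 2 .9
                3 1 .4
                3 2 .7
                end

                * Bars differ in colour according to state, the row variable,
                * so each row of the plot gets its own colour
                tabplot state authority [iw=share], separate(state) ///
                    showval(format(%3.2f)) subtitle("") xtitle("") ytitle("")
                ```

                Applied to the code in #7, the only change would be to replace separate(var) with separate(state) in the tabplot call.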

