  • modify ylabels of horizontal bar graph with indicator variables

    I am creating a horizontal bar graph with indicator variables, but would like to modify the ylabels so that each one indicates the particular variable corresponding to its bar and is also modifiable (change "mean of rep_1" to "Avg. value for rep_1", for example). Here is a reproducible example:

    Code:
    sysuse auto, clear
    
    tab rep78, gen(rep_)
    
    graph hbar rep_*
    Note that my real dataset only has the indicator variables (i.e. there are no usable categorical variables), so any alternative using some other transformation of rep78 would not work in my specific case.

    Thanks for your help!

  • #2
    Code:
    graph hbar rep_* 
    forv i = 1/5 {
    gr_edit .legend.plotregion1.label[`i'].text = {}
    gr_edit .legend.plotregion1.label[`i'].text.Arrpush Avg. value of rep_`i'
    }


    • #3
      George Ford answered the question nicely, but being constitutionally awkward I want to question the question. #1 asked for a way to change a legend so that legend items, already holding unnecessarily repeated text, get to hold even more unnecessarily repeated text. Is "mean of" less clear than "Avg. value of"? Clearly, I don't know the intended readership here, but I would much prefer

      1. Losing the legend (killing the key) if possible to use direct labelling instead.

      2. If "mean of" needs expansion or better explanation, doing that expansion or explanation just once, in an axis title, and avoiding abbreviations.

      3. Using variable labels, not variable names. Who prefers rep_1 to an explanation of what that means?
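
      As a minimal sketch of points 1 to 3 applied to the example in #1: the ascategory option places the variables on the categorical axis, so no legend is needed, and relabel() inside yvaroptions() sets the text shown for each bar. The label text and the axis title below are illustrative assumptions, not wording from the thread. The /// line continuations assume the code is run from a do-file.

      ```stata
      sysuse auto, clear
      tab rep78, gen(rep_)

      * ascategory: one labelled bar per variable, no legend
      * relabel(): replacement text for bars 1 to 5 (illustrative wording)
      graph hbar (mean) rep_*, ascategory ///
          yvaroptions(relabel(1 "Repair record 1" 2 "Repair record 2" ///
              3 "Repair record 3" 4 "Repair record 4" 5 "Repair record 5")) ///
          ytitle("Proportion of cars")
      ```

      The explanation of what is plotted appears once, in the axis title, rather than being repeated for every bar.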

      This is a case where a reproducible example of your real data would be preferable to a reproducible example based on accessible data.

      Here is some technique. I am quite possibly missing something simpler, but this code is adaptable. If people prefer to use frames instead, that is fine by me. I use a technique that works in Stata several versions back.

      Code:
      webuse nlswork, clear
      
      tempname where 
      
      postfile `where' str32 name str80 varlabel mean using mybar, replace 
      
      foreach v in msp nev_mar collgrad not_smsa c_city south union { 
          su `v', meanonly 
          post `where' ("`v'") ("`: var label `v''") (r(mean))
      }
      
      postclose `where'
      
      use mybar 
      
      list 
      
      graph hbar mean, over(varlabel, sort(1) descending) ytitle(Proportion of people) ysc(alt) name(G1, replace)
      
      graph dot mean, over(varlabel, sort(1) descending) linetype(line) lines(lc(gs12) lw(vthin)) ytitle(Proportion of people) ysc(alt) name(G2, replace)
      Code:
           +----------------------------------------------------+
           |     name                       varlabel       mean |
           |----------------------------------------------------|
        1. |      msp   1 if married, spouse present   .6029174 |
        2. |  nev_mar             1 if never married   .2296795 |
        3. | collgrad          1 if college graduate   .1680451 |
        4. | not_smsa                  1 if not SMSA   .2824441 |
        5. |   c_city              1 if central city    .357218 |
        6. |    south                     1 if south   .4095562 |
        7. |    union                     1 if union   .2344318 |
           +----------------------------------------------------+
      I would never publish either of these graphs as they come -- the variable labels are in turn too repetitive, and some indicators seem more naturally complemented -- but the point is that you now have data for the graph in an easily managed form.

      For more on why it can make sense to put the horizontal axis at the top, see https://www.stata-journal.com/articl...article=gr0053




      [Image: arthur_G1.png]

      [Image: arthur_G2.png]


      • #4
        In fact

        Code:
        statplot msp nev_mar collgrad not_smsa c_city south union , ysc(alt) varopts(sort(1) descending) ytitle(Proportion of people)
        gets you the first graph above and

        Code:
        statplot msp nev_mar collgrad not_smsa c_city south union , ysc(alt) varopts(sort(1) descending) ytitle(Proportion of people) recast(dot) linetype(line) lines(lc(gs12) lw(vthin))
        gets you the second.

        statplot must be installed from SSC (ssc install statplot).


        • #5
          One thing I found strange is that the bar graph does not use the actual labels assigned to the variables, yet "nolabel" is an option.


          • #6

            George Ford (#5): I think you are talking about graph hbar or graph bar with several variables. What happens internally seems equivalent to a collapse to get a reduced dataset, and in that step the variable labels seem to disappear, which from many points of view is not a feature.

            In #3 it can be seen that I take care to pass the variable labels to the data used in graphics, and although we do that in a different way in statplot, it's part of the same philosophy to keep track of variable labels and map them to value labels, or the equivalent.

            More elaborate code would use the variable name if there is no variable label.
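
            A minimal sketch of that fallback, reusing the postfile approach from #3: the macro name lbl is an assumption introduced here, and the label on union is cleared only to make the fallback visible in the listing.

            ```stata
            webuse nlswork, clear

            * clear one variable label so that the fallback is exercised
            label variable union ""

            tempname where
            postfile `where' str32 name str80 varlabel mean using mybar, replace

            foreach v in msp nev_mar collgrad not_smsa c_city south union {
                su `v', meanonly
                * take the variable label; fall back on the name if it is empty
                local lbl : variable label `v'
                if `"`lbl'"' == "" local lbl "`v'"
                post `where' ("`v'") (`"`lbl'"') (r(mean))
            }

            postclose `where'

            use mybar, clear
            list
            ```

            With this change the row for union should carry the text union in varlabel, so the graph commands in #3 can be used unchanged.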

            Anyone who thinks that programming bar charts sounds utterly trivial should try doing it. Every user who wants a bar chart thinks of their goal as a simple and straightforward example, and most users are correct on that, but collectively there are dozens of possible variants.
