  • Graphing Ordered hbars with Confidence Intervals

    Dear Statalisters,

    I hope this message finds you well this summer. I am writing about a graphing issue in Stata. I ran a series of logit models for about 60 countries and want to graph the regression coefficients of a particular predictor, with their confidence intervals, for all of these countries. In the envisaged graph, the x-axis shows the coefficients and the y-axis shows the country variable (value-labeled). I'd also like the coefficients graphed in ascending or descending order of magnitude. I have collected the coefficients and their confidence intervals in a new data file, along with the corresponding country codes. The four variables in this file are country (value-labeled), coefmean (regression coefficient estimates), coefhi (upper bound of the 95% CI), and coeflo (lower bound of the 95% CI). I tried several Stata graphing commands and code chunks, but none of them worked as expected. Frankly, they didn't look right to me to begin with.

    For example, the following line orders the coefficients and shows the country labels correctly, but it doesn't accommodate confidence intervals:

    Code:
    graph hbar coefmean, over(country, sort(1) descending label(labsize(*0.40)))
    [Image: Graph01.png]

    I also tried the following code snippet:
    Code:
    gsort -coefmean
    gen rank = _n
    twoway (scatter rank coefmean, mcolor(navy)) ///
           (rcap coefhi coeflo rank, horizontal lcolor(black))
    I got the following graph.
    [Image: Graph02.png]

    This graph presents the reg coefficient estimates and associated confidence intervals nicely, but doesn't have country labels. Any help/pointer would be very much appreciated.

    Jun Xu, PhD
    Professor
    Department of Sociology
    University of Macau

  • #2
    You need to label the values of the rank variable with the country names and then display these value labels on the axis. The community-contributed command labmask from the Stata Journal can help here.

    Code:
    search labmask
    If you cannot make progress, present a reproducible example.



    • #3
      The suggestion in #2 should work, but it is also fairly easy to do "by hand". Here is a dummy example:

      Code:
      * CREATE DUMMY DATASET
      clear
      input str3 country_str
      "ETH"
      "GBR"
      "NZL"
      "IND"
      "MEX"
      "COL"
      end
      
      set seed 123
      
      gen coefmean = runiform(0, 0.4)
      gen len = runiform(0.01, 0.2)
      gen coefhi = coefmean + len
      gen coeflo = coefmean - len
      drop len
      
      * THE WORK BEGINS HERE
      gsort -coefmean
      gen rank = _n
      
      su rank, meanonly
      forval i = 1 / `=r(max)' {
          label define COUNTRY_RANKS `i' "`=country_str[`i']'", add
      }
      label values rank COUNTRY_RANKS
      
      * MAKE THE GRAPH
      twoway (scatter rank coefmean, mcolor(navy)) ///
             (rcap coefhi coeflo rank, horizontal lcolor(black)) ///
             , ylabel(, valuelabel) ytitle("Countries")
      which produces:
      [Image: Screenshot 2025-07-31 at 7.50.14 PM.png]



      • #4
        Dear Andrew,

        I tried the user-written labmask command, and it worked. The tricky part, which I hadn't figured out at first, is that my original country variable is numeric with value labels, so I first have to turn it into a real string variable using the decode command. Below are the code and the resulting graph:
        Code:
        decode country, gen(cnt)
        sort coefmean
        cap drop rank
        gen rank = _n
        labmask rank, values(cnt)
        twoway (rbar coefhi coeflo rank, barwidth(1) horizontal) ///
            (scatter rank coefmean, ytitle("Country") xtitle("Coef") ylab(1(1)66, valuelabel labsize(tiny)) msymbol(|) legend(off))
        [Image: cntGraph.png]
        A related issue: how do I clean up the graphing background, which appears to have a lot of light-colored dashes? Thanks a lot!



        • #5
          Thanks a lot, Hemanshu!



          • #6
            how to clean up the graphing background, which appears to have a lot of light-colored dashes
            I think those are gridlines, which you can switch off with the following amendment to the ylab option:
            Code:
            ylab(1(1)66, valuelabel labsize(tiny) nogrid)



            • #7
              Thanks a lot again, Hemanshu!



              • #8
                You could also turn off the y-ticks on a categorical axis, but the graph looks busy — perhaps they actually help.

                Code:
                ylab(1/66, val labsize(tiny) nogrid noticks)
                I would also delete the y-axis title "Country," as it is self-evident.
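
To see these amendments in a complete command, here is a minimal sketch. It assumes a small made-up dataset (the four countries and their numbers are invented for illustration), attaches the labels by hand as in #3 so that no extra package is needed, and takes the upper limit of the y-axis from the number of observations rather than hard-coding 66. The label name RANKLBL and the bar width are arbitrary choices.

```stata
* Dummy data: made-up numbers, for illustration only
clear
input str3 cnt coefmean coeflo coefhi
"ETH" 0.31  0.20 0.42
"GBR" 0.12  0.02 0.22
"NZL" 0.25  0.10 0.40
"IND" 0.05 -0.05 0.15
end

* Rank by coefficient and attach country names as value labels (by hand, as in #3)
sort coefmean
gen rank = _n
forvalues i = 1/`=_N' {
    label define RANKLBL `i' "`=cnt[`i']'", add
}
label values rank RANKLBL

* Use the number of rows as the upper limit instead of a hard-coded value
local n = _N
twoway (rbar coefhi coeflo rank, barwidth(0.6) horizontal) ///
       (scatter rank coefmean, msymbol(|)), ///
       ylabel(1(1)`n', valuelabel nogrid noticks) ///
       ytitle("") xtitle("Coef") legend(off)
```

Because the line continuations (`///`) and the local macro only work inside a do-file, run the whole block from the Do-file Editor rather than typing it line by line.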
                Last edited by Andrew Musau; 31 Jul 2025, 08:46.



                • #9
                  I agree the graph looks too busy, and it's hard to figure out which bar corresponds to which country. You might want to take off the country names from the y axis entirely, and put the country names inside the horizontal bars?

                  Something like:
                  Code:
                  ylab(none, nogrid noticks) mlabel(cnt)
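
A minimal sketch of how that fragment might fit into a full command follows. It assumes a small made-up dataset with a string variable cnt (the numbers are invented for illustration). Note that mlabel() is an option of scatter, so it goes inside the scatter plot's parentheses, while ylabel(none) applies to the graph as a whole. The light bar color and the label position are arbitrary choices, made so that the names stay readable on top of the bars.

```stata
* Dummy data: made-up numbers, for illustration only
clear
input str3 cnt coefmean coeflo coefhi
"ETH" 0.31  0.20 0.42
"GBR" 0.12  0.02 0.22
"NZL" 0.25  0.10 0.40
"IND" 0.05 -0.05 0.15
end

* Order the rows by coefficient; no value labels are needed here
sort coefmean
gen rank = _n

* Light bars for the CIs, with country names drawn next to the point estimates
twoway (rbar coefhi coeflo rank, barwidth(0.8) horizontal ///
            fcolor(gs13) lcolor(gs13)) ///
       (scatter rank coefmean, msymbol(|) mcolor(black) ///
            mlabel(cnt) mlabposition(3) mlabcolor(black)), ///
       ylabel(none) ytitle("") xtitle("Coef") legend(off)
```

With 66 countries, a smaller mlabsize() would probably be needed, and whether a name fits inside its bar depends on the width of that country's confidence interval.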



                  • #10
                    Thank you both! These are great suggestions for further improving the appearance of the graph.
