  • Making box plots more visible

    Hello

    I'm a Stata newbie with several questions, which I will be posting in this forum, so please pardon my lack of knowledge of Stata.

    I am trying to draw box plots with graph box, but the result shows so many box plots that I cannot read them. Does anyone know how I can fix this? The code I am using is graph box (variable), by(variable).



  • #2
    Please see pic below
    Attached Files



    • #3
      I want to make the box plots and the numbers on the x-axis visible.
      Attached Files



      • #4
        You need to restrict the number of graphs instead of plotting everything at one go. Compare the following:

        Code:
        sysuse nlsw88, clear
        graph box wage, by(ind)
        summarize ind
        graph box wage if inrange(ind, 1, 6), by(ind)
        graph box wage if inrange(ind, 7, 12), by(ind)
        Last edited by Andrew Musau; 03 Feb 2021, 03:36.
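
        The same splitting can be written as a loop, which may help when there are more than two groups of categories. This is only a sketch: it assumes the industry codes run from 1 to 12 (as summarize reports for this dataset), and the graph names box1 and box7 are made up for illustration.

        ```stata
        sysuse nlsw88, clear

        * assumption: industry codes run from 1 to 12, so plot six at a time
        forvalues lo = 1(6)7 {
            local hi = `lo' + 5
            * one graph per block of six industries, kept in memory under its own name
            graph box wage if inrange(industry, `lo', `hi'), by(industry) name(box`lo', replace)
        }
        ```

        Changing the step in 1(6)7 and the offset in the local hi line should adapt this to other block sizes.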



        • #5
          Angela: Please read https://www.statalist.org/forums/help#stata on (1) showing the precise commands you used (2) showing .png images not .gph images.

          In your visible graph I (think I) see 60 box plots but you have little hope of improving that without doing something quite different.

          I have less idea about what is happening in your .gph attachment.

          Here is a concrete technique for a problem with about 60 categories. You should be able to run this code.

          Code:
          webuse nlswork, clear 
          
          egen max = max(ln_wage), by(wks_ue)
          egen min = min(ln_wage), by(wks_ue)
          foreach q in 25 50 75 {
              egen p`q'  = pctile(ln_wage), by(wks_ue) p(`q')
          }
          
          egen tag = tag(wks_ue)
          
          twoway rspike max p75 wks_ue if tag, color(black) || scatter p50 wks_ue if tag, ms(dh) color(red) || rspike p25 min wks_ue if tag, color(black) legend(off) ytitle(`: var label ln_wage') yla(, ang(h)) scheme(s1color)

          [Image: manybox.png]


          The main idea is to put all the box plots in one graph. Multiple graphs using by() just entail far too much by way of scaffolding and repeated detail.

          Beyond that,

          1. Even very thin boxes can get too messy. Omitting the box altogether may sound crazy, but it works quite well, I suggest. See also https://www.statalist.org/forums/for...-without-boxes

          2. I don't see a need to repeat exactly whatever John Tukey did or what graph box or graph hbox does. I've had good graphs too out of whiskers that go out to 5th and 95th percentiles.

          3. This is an easy version of a difficult problem -- the subdividing variable is numeric. If your subdividing variable is categorical with long names -- I can't read any of the text in #2 -- you will probably need to make the box plots horizontal and increase the vertical size of the graph to have any chance of the text being readable.

          4. Indeed, you may need to split into about 30 categories in each of two displays.
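
          As a rough illustration of point 2, the code above can be adapted so that the whiskers stop at the 5th and 95th percentiles rather than the minimum and maximum. This is a sketch under the same assumptions as before (the nlswork data, wks_ue as the subdividing variable); the names p5 and p95 are just those generated by the loop.

          ```stata
          webuse nlswork, clear

          * percentiles by group: whisker ends (5, 95), box ends (25, 75), median (50)
          foreach q in 5 25 50 75 95 {
              egen p`q' = pctile(ln_wage), by(wks_ue) p(`q')
          }

          egen tag = tag(wks_ue)

          * run from a do-file: /// continues the command on the next line
          twoway rspike p95 p75 wks_ue if tag, color(black) ///
              || scatter p50 wks_ue if tag, ms(dh) color(red) ///
              || rspike p25 p5 wks_ue if tag, color(black) ///
              legend(off) ytitle(`: var label ln_wage') yla(, ang(h)) scheme(s1color)
          ```

          Values beyond the 5th and 95th percentiles are simply not drawn here, so say in any caption what the whiskers mean.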



          • #6
            I experimented with made-up text labels and most of the available trickery -- different axes, bigger graph size, smaller labels -- to show what can be done.

            Code:
            webuse nlswork, clear 
            
            egen max = max(ln_wage), by(wks_ue)
            egen min = min(ln_wage), by(wks_ue)
            foreach q in 25 50 75 {
                egen p`q'  = pctile(ln_wage), by(wks_ue) p(`q')
            }
            
            egen tag = tag(wks_ue)
            
            levelsof wks_ue, local(levels)
            
            local text "Some enchanted evening, you may see a stranger, You may see a stranger across a crowded room"
            
            foreach x of local levels { 
                local this = substr("`text'", `x', 15)
                label define wks_ue `x' "`this'", add 
            }
            
            label val wks_ue wks_ue 
            
            twoway rspike max p75 wks_ue if tag, horizontal color(black) || scatter  wks_ue p50 if tag, ms(dh) color(red) || rspike p25 min wks_ue if tag, horizontal color(black) legend(off) xtitle(`: var label ln_wage') yla(`levels', noticks valuelabel ang(h) labsize(vsmall)) scheme(s1color) ysize(8) ytitle("")
            [Image: manybox2.png]
