  • Plotting box plots with 2 categories adjacent to each other on the same axes

    Hi, I would like to draw a box plot in which the two levels of a categorical variable (procedure == 1 and procedure == 2) sit adjacent to each other for each activity category. It would look like this:





    [Image: Screenshot 2024-08-18 at 20.00.43.png]



    I've used this code:


    Code:
    graph box inactive_total light_total moderate_vigor_total vigorous_total, ///
        over(procedure, label(1 "Procedure 1" 2 "Procedure 2")) ///
        bar(1, fcolor(blue)) marker(1, mcolor(black)) ///
        bar(2, fcolor(red)) marker(2, mcolor(black)) ///
        bar(3, fcolor(green)) marker(3, mcolor(black)) ///
        bar(4, fcolor(grey)) marker(4, mcolor(black)) ///
        ytitle("Duration in mins/week") ///
        legend(order(1 "Inactive" 2 "Light Activity" 3 "Moderate Activity" 4 "Vigorous Activity") ///
        ring(1) row(1) pos(6)) ///
        note("Box plots by Procedure")
    However, this gives me an error.

    I am able to plot two separate graphs for procedure == 1 and procedure == 2, but these are produced on different axes, as this code shows:

    Code:
    graph box inactive_total light_total moderate_vigor_total vigorous_total if procedure==1, bar(1,  fcolor(blue)) marker(1, mcolor(black)) ///
    bar(2, fcolor(red)) marker(2, mcolor(black))  ///
    bar(3,  fcolor(green)) marker(3, mcolor(black)) ///
    bar(4,  fcolor(grey)) marker(4, mcolor(black)) ///
    ytitle(Duration in mins/week) ///
    legend(order (1 "Inactive" 2 " Light Activity" 3 "Moderate-to-Vigorous Activity" 4 "Vigorous Activity") ring(1) row(1) pos(6)) name(g1, replace)
    
    graph box inactive_total light_total moderate_vigor_total vigorous_total if procedure==2, bar(1,  fcolor(blue)) marker(1, mcolor(black)) ///
    bar(2, fcolor(red)) marker(2, mcolor(black))  ///
    bar(3,  fcolor(green)) marker(3, mcolor(black)) ///
    bar(4,  fcolor(grey)) marker(4, mcolor(black)) ///
    ytitle(Duration in mins/week) ///
    legend(order (1 "Inactive" 2 " Light Activity" 3 "Moderate-to-Vigorous Activity" 4 "Vigorous Activity") ring(1) row(1) pos(6)) name(g2, replace)
    
    
    graph combine g1 g2, name(combined, replace) note("Procedure 1 vs Procedure 2")
    Is it possible to plot on the same axis, adjacent to each other as seen in the screenshot?

    Many thanks

  • #2
    This needs a cross-reference to https://www.statalist.org/forums/for...-of-this-graph

    The data in the parallel thread are IMO utterly unsuited to box plots without drastic treatment.

    I'll address the Stata question of four outcomes and one binary predictor and how to shuffle the boxes. In essence, you need a different data layout.

    Code:
    clear 
    set obs 100 
    set seed 314159 
    
    gen procedure = 1 + (_n > 50)  
    
    forval j = 1/4 { 
        gen y`j' = rnormal(`j', 1) + procedure 
    }
    
    * not wanted 
    graph box y*, over(procedure)
    
    gen id = _n 
    
    reshape long y, i(id) j(which)
    
    label def which 1 inactive 2 whatever 3 you 4 want
    label val which which 
    
    label def procedure 1 something 2 else 
    label val procedure procedure 
    
    graph box y, over(procedure) by(which, row(1) note(""))
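
    If a single panel is preferred to by(), one possibility is to use two over() options together with asyvars, so that the boxes for the two procedures sit side by side within each outcome and are coloured by procedure. A minimal sketch on simulated data of the same kind; the label text and the legend placement are illustrative assumptions, not taken from the real data:

    ```stata
    clear
    set obs 100
    set seed 314159

    * two procedures, four simulated outcomes
    gen procedure = 1 + (_n > 50)
    forval j = 1/4 {
        gen y`j' = rnormal(`j', 1) + procedure
    }

    * long layout: one row per person and outcome
    gen id = _n
    reshape long y, i(id) j(which)

    * illustrative labels only
    label def which 1 "inactive" 2 "light" 3 "moderate" 4 "vigorous"
    label val which which
    label def procedure 1 "Procedure 1" 2 "Procedure 2"
    label val procedure procedure

    * first over() gives adjacent boxes within each group of the second over();
    * asyvars colours the boxes by procedure and adds a legend
    graph box y, over(procedure) over(which) asyvars legend(row(1) pos(6))
    ```

    Whether this or by() reads better depends on how different the outcome scales are: with one shared y axis, outcomes with small values are squeezed by outcomes with large values.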



    • #3
      Here's the same idea using your data from the linked thread. In essence the reshape long is that suggested there.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float(inactive light_activity mod_to_vigorous_activity vigorous_activity procedure)
      1301.083  95.667  43.167  .083 1
          1345   72.25  22.583  .167 2
          1280 122.583  37.333  .083 1
      1082.417 192.667   164.5  .417 2
      1222.083   172.5  45.417     0 1
      1337.583  65.167   37.25     0 .
       1297.75 108.583  33.667     0 2
        1361.5  62.833      15  .667 1
      1302.917  99.833   37.25     0 1
        1238.5  125.75  75.083  .667 2
      1097.917 215.833 124.667 1.583 1
       1162.25 167.667 108.833  1.25 1
      1214.167 184.667  41.167     0 2
       1078.25 142.833 216.083 2.833 1
      1326.583  86.833  26.583     0 1
      1263.417 122.167  53.917    .5 1
          1368  44.583  26.417     1 2
      1185.667 159.667  93.667     1 1
        1294.5   122.5      23     0 2
        1308.5 113.083  18.417     0 1
      end
      
      rename inactive inactive_activity
      
      gen id = _n if procedure < .
      
      reshape long @activity, string i(id) j(WHICH)
      
      replace WHICH = trim(subinstr(WHICH, "_", " ", .))
      
      replace WHICH = subinstr(WHICH, "mod", "moderate", .)
      
      label define which 1 inactive 2 light 3 "moderate to vigorous" 4 vigorous
      
      encode WHICH, gen(which) label(which)
      
      graph box activity, over(procedure) by(which, b1title(procedure) row(1) note(""))
      [Image: activity.png]



      • #4
        You can get different colours with these further commands

        Code:
        separate activity, by(which) veryshortlabel 
        
        graph box activity?, over(procedure) nofill by(which, b1title(procedure) legend(off) row(1) note(""))



        • #5
          Thanks for this. If I may ask, what color scheme are you using, please?



          • #6
            stcolor, which is the default in Stata 18.

            See https://www.statalist.org/forums/help#version for longstanding advice to tell us if you are using a version earlier than the current one.

            See https://www.statalist.org/forums/for...scheme-stcolor for how to get the colours of stcolor in earlier versions.

            In #2 of https://www.statalist.org/forums/for...-of-this-graph I flagged a standard graphics point: do not mix red and green, given the prevalence of difficulty in distinguishing them.

            I picked up an earlier suggestion and tried a square root scale.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float(inactive light_activity mod_to_vigorous_activity vigorous_activity procedure)
            1301.083  95.667  43.167  .083 1
                1345   72.25  22.583  .167 2
                1280 122.583  37.333  .083 1
            1082.417 192.667   164.5  .417 2
            1222.083   172.5  45.417     0 1
            1337.583  65.167   37.25     0 .
             1297.75 108.583  33.667     0 2
              1361.5  62.833      15  .667 1
            1302.917  99.833   37.25     0 1
              1238.5  125.75  75.083  .667 2
            1097.917 215.833 124.667 1.583 1
             1162.25 167.667 108.833  1.25 1
            1214.167 184.667  41.167     0 2
             1078.25 142.833 216.083 2.833 1
            1326.583  86.833  26.583     0 1
            1263.417 122.167  53.917    .5 1
                1368  44.583  26.417     1 2
            1185.667 159.667  93.667     1 1
              1294.5   122.5      23     0 2
              1308.5 113.083  18.417     0 1
            end
            
            rename inactive inactive_activity
            
            gen id = _n if procedure < .
            
            reshape long @activity, string i(id) j(WHICH)
            
            replace WHICH = trim(subinstr(WHICH, "_", " ", .))
            
            replace WHICH = subinstr(WHICH, "mod", "moderate", .)
            
            label define which 1 inactive 2 light 3 "moderate to vigorous" 4 vigorous
            
            encode WHICH, gen(which) label(which)
            
            graph box activity, over(procedure) by(which, b1title(procedure) row(1) note(""))
            
            gen sqrt_activity = sqrt(activity)
            
            separate sqrt_activity, by(which) veryshortlabel
            
            graph box sqrt_activity?, over(procedure) nofill by(which, b1title(procedure) legend(off) row(1) note("")) ///
            yla(0 10 "100" 20 "400" 30 "900" 40 "1600") ytitle("Activity (min/week)" "square root scale")
            [Image: activity2.png]
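
            The axis labels in yla() follow from the transformation: a tick at position p on the square root scale is labelled with p^2 on the original scale, so 10 shows "100" and so on. A sketch of building those pairs in a loop instead of typing them by hand; the simulated activity variable and the local macro name labels are assumptions for illustration only:

            ```stata
            clear
            set obs 100
            set seed 314159

            * simulated stand-in for the real data
            gen procedure = 1 + (_n > 50)
            gen activity = runiform(0, 1400)
            gen sqrt_activity = sqrt(activity)

            * collect pairs of position and label text: position p, label p^2
            local labels
            forvalues p = 0(10)40 {
                local labels `labels' `p' "`=`p'^2'"
            }

            graph box sqrt_activity, over(procedure) yla(`labels') ///
                ytitle("Activity (min/week)" "square root scale")
            ```

            The same loop adapts to other tick spacings by changing the range in forvalues.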
