  • improving presentation of this graph





    [Attached image: Screenshot 2024-08-15 at 21.21.53.png]
    [Attached image: Screenshot 2024-08-15 at 21.20.34.png]


    Good evening

    I would like your advice on improving the presentation of this graph. As you can see, there are 4 categories (coloured blue, red, green and black).

    I am generating the graph using this code:

    Code:
    graph box v1 v2 v3 v4 if procedure==1, bar(1,  fcolor(blue)) marker(1, mcolor(blue)) ///
    bar(2, fcolor(red)) marker(2, mcolor(blue))  ///
    bar(3,  fcolor(green)) marker(3, mcolor(blue)) ///
    bar(4,  fcolor(grey)) marker(4, mcolor(blue)) ///
    ytitle(Duration in mins) ///
    legend(order (1 "v1" 2 " v2" 3 "v3" 4 "v4") ring(1) row(1) pos(6))
    Now for v4, I have very small values, ranging from 0 to 2.833.

    Unfortunately, looking at this graph, it looks as if it's 0.

    I then tried presenting just v2-v4 (as these have smaller ranges), and I also tried to edit the scale:

    Code:
    graph box v2 v3 v4 if procedure==1, ylabel(0(2)280) ytick(0(2)280) bar(1,  fcolor(blue)) marker(1, mcolor(blue)) ///
    bar(2, fcolor(red)) marker(2, mcolor(blue))  ///
    bar(3,  fcolor(green)) marker(3, mcolor(blue)) ///
    ytitle(Duration in mins) ///
    legend(order (1 " v2" 2 "v3" 3 "v4") ring(1) row(1) pos(6))
    However, the label spacing is too small for the graph; as you can see, the numbers overlap on the y axis.

    Question 1:
    What is your advice?

    Question 2:
    In addition, I would like to plot the same graph for procedure == 2. Ideally I would like to superimpose the graphs for procedure == 1 and procedure == 2, so that there are, for example, 2 blue box plots adjacent to each other, 2 red box plots adjacent to each other, 2 green box plots adjacent to each other and 2 black box plots adjacent to each other, which would look like this:

    Is it possible?

    I have tried

    Code:
    graph box inactive light_activity mod_to_vigorous_activity vigorous_activity if procedure ==1
    || graph box inactive light_activity mod_to_vigorous_activity vigorous_activity if procedure ==2
    
    
    //error
    [Attached image: Screenshot 2024-08-15 at 21.27.32.png]

  • #2
    If zeros were not present, my answer would be to use a transformed scale, either logarithm or reciprocal. A particular rationale for reciprocal is that the reciprocal of a duration has clear dimensional meaning as some kind of speed or rate.

    As zeros are present, some kind of complication might need to be entertained. I'd probably segregate the zeros. Is there some substantive interpretation? Are these a different group of people (rats, whatever)?

    I'd prefer a real or realistic data example to say more.

    Don't mix red and green on a plot. Many people have severe difficulty distinguishing them.

    Time of day references often look odd depending on longitude and people's personal habits over when they look at the forum.



    • #3
      Thank you for providing an informative reply.
      However, I didn't really understand the point about creating a reciprocal. These are a group of human beings.

      The y axis represents total duration of activity in minutes.

      Do you think a log scale would make the graph easier for the reader to read? The reader may easily understand values from 0 mins to 1500 mins; I'm not so sure about a log scale.

      In terms of creating a reciprocal, how would you go about this? Does this mean creating another variable or tweaking something in the code?

      Would you be so kind as to give an example?

      Many thanks



      • #4
        The reciprocal of variable x would be generated by something like


        Code:
        gen recx = 1/x
        If two people take 200 min and 100 min to do something, the corresponding reciprocals are 1/200 and 1/100 respectively and their units are 1 / min.

        A multiplicative factor may be convenient.

        The interpretation of speed or rate is easiest if the time is (say) the time taken to do a task or a reaction time, but may be reasonable otherwise.

        The reciprocal of 0 is not defined but that doesn't rule out plotting zeros in the margin of some graph.

        In turn I am still optimistic about seeing a data example.

        Who are your readers that they don't know logarithms? Fellow statistical people should all know logarithms.

        Keene ON. The log transformation is special. Stat Med. 1995 Apr 30;14(8):811-9. doi: 10.1002/sim.4780140810. PMID: 7644861.

        is one of many key references.

        (Again, the logarithm of 0 is not defined but doesn't rule out plotting zeros in the margin of some graph.)
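
        A minimal sketch of what happens to zeros under these transformations, using a made-up variable x holding durations in minutes (the name recx1000 and the factor of 1000 are only illustrative choices). Stata returns missing for both 1/0 and ln(0), so zeros drop out of any plot of the transformed values unless they are shown separately.

```stata
clear
input float x
200
100
0
end

* reciprocal, in units of 1/min; 1/0 evaluates to missing (.)
gen recx = 1/x

* same thing with a multiplicative factor (illustrative choice of 1000)
gen recx1000 = 1000/x

* natural logarithm; ln(0) also evaluates to missing
gen lnx = ln(x)

* how many zeros would need separate treatment?
count if x == 0

list
```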



        • #5
          Hi, thanks for your input.

          I've tried the reciprocal. It addresses the problem for vigorous, but not for inactive, light and mod-vigorous, as seen here:


          [Attached image: Screenshot 2024-08-16 at 16.28.46.png]


          Using the code below:

          Code:
          gen testinactive = 1/inactive
          gen testlight = 1/light_activity
          gen testmod= 1/mod_to_vigorous_activity
          gen testvig= 1/vigorous_activity
          
          
          graph box testinactive testlight testmod testvig if procedure==2, bar(1,  fcolor(blue)) marker(1, mcolor(black)) ///
          bar(2, fcolor(red)) marker(2, mcolor(black))  ///
          bar(3,  fcolor(green)) marker(3, mcolor(black)) ///
          bar(4,  fcolor(grey)) marker(4, mcolor(black)) ///
          ytitle(Duration in mins/week) ///
          legend(order (1 "Inactive" 2 " Light Activity" 3 "Moderate-to-Vigorous Activity" 4 "Vigorous Activity") ring(1) row(1) pos(6)) note("Procedure==2")
          I may just have to accept the results in post 1 (first graph).
          If there are no other solutions, is there a way to superimpose the graphs for procedure 1 and procedure 2 so that the box plots are adjacent to each other (as in the grayscale example in post 1)?

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input float(inactive light_activity mod_to_vigorous_activity vigorous_activity procedure)
          1301.083  95.667  43.167  .083 1
              1345   72.25  22.583  .167 2
              1280 122.583  37.333  .083 1
          1082.417 192.667   164.5  .417 2
          1222.083   172.5  45.417     0 1
          1337.583  65.167   37.25     0 .
           1297.75 108.583  33.667     0 2
            1361.5  62.833      15  .667 1
          1302.917  99.833   37.25     0 1
            1238.5  125.75  75.083  .667 2
          1097.917 215.833 124.667 1.583 1
           1162.25 167.667 108.833  1.25 1
          1214.167 184.667  41.167     0 2
           1078.25 142.833 216.083 2.833 1
          1326.583  86.833  26.583     0 1
          1263.417 122.167  53.917    .5 1
              1368  44.583  26.417     1 2
          1185.667 159.667  93.667     1 1
            1294.5   122.5      23     0 2
            1308.5 113.083  18.417     0 1
          end
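
          On the question of placing the two procedures side by side: one possible approach, sketched here on a few rows of the example above, is to reshape to long layout and then let graph box group by activity with over(), using asyvars so that the procedure boxes sit next to each other in different colours. This is only a sketch. The variable name duration is chosen here for illustration, the boxes are coloured by procedure rather than by activity, and nothing is done about the very different ranges of the four variables.

```stata
clear
input float(inactive light_activity mod_to_vigorous_activity vigorous_activity procedure)
1301.083  95.667  43.167  .083 1
    1345   72.25  22.583  .167 2
    1280 122.583  37.333  .083 1
1082.417 192.667   164.5  .417 2
 1297.75 108.583  33.667     0 2
  1361.5  62.833      15  .667 1
end

* give all four variables a common suffix so that reshape can stack them
rename inactive inactive_activity
gen id = _n
reshape long @_activity, string i(id) j(WHICH)

* the stacked durations are now in one variable; duration is a name chosen here
rename _activity duration

* one group per activity, with procedure 1 and 2 boxes adjacent within each group
graph box duration, over(procedure) over(WHICH) asyvars ///
    ytitle("Duration (min/week)") legend(order(1 "Procedure 1" 2 "Procedure 2"))
```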



          • #6
            Thanks for the data example. What you don't mention is that you would get messages about missings from the instructions to divide by zero.

            Given the zeros, and the implication otherwise that reciprocal is too strong a transformation anyway, I suggest cube root as a slightly unusual compromise. Its small virtues are that it handles zeros gracefully and that it appears to be a good scale on which to compare variability; I can't add a scientific rationale unless you can.

            Much the same could be said about square roots.

            I didn't go as far as specifying a procedure. If you really have only 28 people for procedure 2, no graph form is a great idea unless it shows sample size explicitly.

            I used stripplot from SSC.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float(inactive light_activity mod_to_vigorous_activity vigorous_activity procedure)
            1301.083  95.667  43.167  .083 1
                1345   72.25  22.583  .167 2
                1280 122.583  37.333  .083 1
            1082.417 192.667   164.5  .417 2
            1222.083   172.5  45.417     0 1
            1337.583  65.167   37.25     0 .
             1297.75 108.583  33.667     0 2
              1361.5  62.833      15  .667 1
            1302.917  99.833   37.25     0 1
              1238.5  125.75  75.083  .667 2
            1097.917 215.833 124.667 1.583 1
             1162.25 167.667 108.833  1.25 1
            1214.167 184.667  41.167     0 2
             1078.25 142.833 216.083 2.833 1
            1326.583  86.833  26.583     0 1
            1263.417 122.167  53.917    .5 1
                1368  44.583  26.417     1 2
            1185.667 159.667  93.667     1 1
              1294.5   122.5      23     0 2
              1308.5 113.083  18.417     0 1
            end
            
            rename inactive inactive_activity 
            
            gen id = _n 
            
            reshape long @_activity, string i(id) j(WHICH)
            
            replace WHICH = subinstr(WHICH, "_", " ", .)
            
            replace WHICH = subinstr(WHICH, "mod", "moderate", .)
            
            label define which 1 inactive 2 light 3 "moderate to vigorous" 4 vigorous 
            
            encode WHICH, gen(which) label(which) 
            
            gen curt_activity = _activity^(1/3)
            
            stripplot curt_activity, over(which) cumul cumprob vertical ///
            yla(0 2 "8" 4 "64" 6 "216" 8 "512" 10 "1000" 12 "1728") centre ///
            ytitle("Duration (min/week)" "cube root scale") xla(, tlc(none)) ///
            xtitle(Activity) legend(off) separate(which) ms(Oh Sh Dh Th) ///
            mc(stc1 stc2 stc3 stc4) name(tara1)
            
            line curt_activity which, c(L) ///
            yla(0 2 "8" 4 "64" 6 "216" 8 "512" 10 "1000" 12 "1728") ///
            ytitle("Duration (min/week)" "cube root scale") ///
            xla(, glp(solid) glw(thin) glc(stc2) valuelabel tlc(none)) ///
            xsc(r(0.8 4.2)) xtitle(Activity) name(tara2)
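
            The yla() calls above place labels at cube-root positions (2, 4, 6, ...) but print the corresponding values on the original scale (8, 64, 216, ...). As a sketch, the same option text could be built in a loop rather than typed by hand; the local macro names yla and pos are arbitrary.

```stata
* build axis label text: position = cube root of value, label = value itself
local yla
foreach v in 0 8 64 216 512 1000 1728 {
    local pos = `v'^(1/3)
    local yla `yla' `pos' "`v'"
}

* inspect the result; it could then be passed as yla(`yla') to a graph command
display `"`yla'"'
```

            Changing the list of values in the loop is then enough to try a different set of labels, and swapping the exponent for 1/2 would give the square root analogue.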
            [Attached image: tara1.png]
            [Attached image: tara2.png]



            • #7
              It dawned on me belatedly that the units of min/week are really hiding the fact that these are proportions of people's time spent in various ways. No doubt that should have been obvious.

              Now most graphs show that most people are inactive most of the time. People differ otherwise in how much they are active at different levels.

              We can make some progress by looking at the minority shares. I used tabplot from the Stata Journal.

              The result is manifestly still not ideal in showing the vigorous values, except that they are all really small compared with the others. That could be addressed by using square root or cube root scales, as before.

              Also, the design won't work well if the number of people in the complete dataset is much larger.

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input float(inactive light_activity mod_to_vigorous_activity vigorous_activity procedure)
              1301.083  95.667  43.167  .083 1
                  1345   72.25  22.583  .167 2
                  1280 122.583  37.333  .083 1
              1082.417 192.667   164.5  .417 2
              1222.083   172.5  45.417     0 1
              1337.583  65.167   37.25     0 .
               1297.75 108.583  33.667     0 2
                1361.5  62.833      15  .667 1
              1302.917  99.833   37.25     0 1
                1238.5  125.75  75.083  .667 2
              1097.917 215.833 124.667 1.583 1
               1162.25 167.667 108.833  1.25 1
              1214.167 184.667  41.167     0 2
               1078.25 142.833 216.083 2.833 1
              1326.583  86.833  26.583     0 1
              1263.417 122.167  53.917    .5 1
                  1368  44.583  26.417     1 2
              1185.667 159.667  93.667     1 1
                1294.5   122.5      23     0 2
                1308.5 113.083  18.417     0 1
              end
              
              rename inactive inactive_activity 
              
              sort procedure light_activity mod_to_vigorous_activity vigorous_activity 
              
              gen id = _n if procedure < . 
              
              reshape long @activity, string i(id) j(WHICH)
              
              replace WHICH = trim(subinstr(WHICH, "_", " ", .))
              
              replace WHICH = subinstr(WHICH, "mod", "moderate", .)
              
              label define which 1 inactive 2 light 3 "moderate to vigorous" 4 vigorous 
              
              encode WHICH, gen(which) label(which) 
              
              sum id if procedure == 1, meanonly 
              local x1 = r(mean)
              
              sum id if procedure == 2, meanonly 
              local x2 = r(mean)
              
              tabplot which id [iw=activity] if which > 1 , separate(procedure) ///
              xla(`x1' "Procedure 1" `x2' "Procedure 2") xtitle("") ytitle("") note("")
              [Attached image: tara3.png]




              • #8
                Thanks for your input. Unfortunately, I have 30 individuals, so the graph, despite looking good here, won't work in my case. I've resorted to leaving it as in the initial post.
                Many thanks for your help.



                • #9
                  There are 19 individuals in the latest graph, so why 30 makes the design a failure is beyond me.
