  • Twoway bar: fine-tuning and troubleshooting inner plot region

    Hello, I am graphing the following simulated data to show pesticide residues on a food.
    I have three questions about the resulting graph:
    1. On the right-most bar (DDT), each data point (0.5, 0.8, 1.4 and 2.3) is marked as a horizontal line. I like that detail, and would like that to happen for the other two bars (Aldrin and BHC), but am not sure why that inconsistency has occurred. How can I modify the code to achieve this?

    2. Range of bar: I'd like to display the range of the data (i.e. 0.5 to 2.3 for DDT, rather than 0 to 2.3). Is that possible?

    3. Add marker for median. How can I add a marker for the median of each category of pesticide?


    clear
    input chem conc mrl
    1 .2 .5
    1 .3 .5
    1 .8 .5
    1 2.5 .5
    2 1 .2
    2 1.2 .2
    2 1.8 .2
    2 .05 .2
    3 2.3 1.3
    3 1.4 1.3
    3 .8 1.3
    3 .5 1.3
    end
    label define chem 1 "Aldrin" 2 "BHC" 3 "DDT"
    label values chem chem

    twoway (bar conc chem, color(gray) lcolor(black) barwidth(0.05)) ///
    (dot chem mrl, horizontal msymbol(D) ndots(1) color(black) lcolor(black) legend(label(1 "Concentration"))), ///
    xlabel(1 2 3, valuelabel angle(vertical)) ///
    title(Cucumbers: pesticide residues) ///
    xtitle(Pesticide identified) ///
    ytitle(Residue concentration in mg/kg) ///
    legend(order(1 "Concentration" 2 "Legal limit")) ///
    note(Notes 1. Black diamond indicates MRL. MRL=Maximum Residue Limit)


  • #2
    one possibility:

    Code:
    clear
    input chem conc mrl
    1 .2 .5
    1 .3 .5
    1 .8 .5
    1 2.5 .5
    2 1 .2
    2 1.2 .2
    2 1.8 .2
    2 .05 .2
    3 2.3 1.3
    3 1.4 1.3
    3 .8 1.3
    3 .5 1.3
    0.5 . .
    3.5 . .
    end
    label define chem 1 "Aldrin" 2 "BHC" 3 "DDT"
    label values chem chem
    
    bys chem : egen minc = min(conc)
    bys chem : egen maxc = max(conc)
    bys chem : egen medc = median(conc)
    
    twoway ///
      (rspike minc maxc chem, lcolor(blue) horizontal) ///
      (rcap medc medc chem, lcolor(blue) msize(vlarge) horizontal) ///
      (sc chem conc, ms(Oh) mc(blue)) ///
      (sc chem mrl, ms(O) mc(black)) ///
      , ylabel(1 "Aldrin" 2 "BHC" 3 "DDT") ///
      title(Cucumbers: pesticide residues) ///
      ytitle(Pesticide identified) ///
      xtitle(Residue concentration in mg/kg) ///
      xlabel(0/3) ///
      legend(row(1)) legend(order(1 "Concentration Range" 3 "Concentration" 4 "Legal limit"))
    Last edited by Edwin Leuven; 19 Apr 2015, 12:50.



    • #3
      Michael: Your bars are drawn in sequence. In the case of DDT, the values decrease, so shorter bars are drawn on top of longer ones and every value stays visible. In the other cases that is not so: a later, taller bar covers the earlier ones. So you could tune your sequence.
      I agree, however, with Edwin's suggestions. These are point values and are better shown as such. I would tweak Edwin's code to make the y axis labels horizontal and drop the ticks.
      Code:
        ylabel(1 "Aldrin" 2 "BHC" 3 "DDT", ang(h) noticks)
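
      A minimal sketch of what tuning the sequence could look like, assuming (as described above) that twoway bar draws bars in the order of the observations. Sorting each pesticide's values from largest to smallest should leave the top edge of every bar visible. The gsort step is an illustration and not part of the original code.
      Code:
        * largest concentration first within each pesticide, so shorter bars are drawn last
        gsort chem -conc
        twoway (bar conc chem, color(gray) lcolor(black) barwidth(0.05)), ///
            xlabel(1 2 3, valuelabel angle(vertical))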



      • #4
        Edwin and Nick, thanks. How would I change the legend code to include an entry for the median?
        I've attempted this by rewriting the line as "legend(order(1 "Concentration Range" 3 "Concentration" 4 "Legal limit"))"; however, the legend shows an elongated "H" rather than a single median point, as in the attached png file.
        Attached Files



        • #5
          Yes, because rcap shows a range, and the legend key for a range plot is drawn the same way even when the range has zero length.

          One possibility is just to use a marker symbol.

          Code:
          clear
          input chem conc mrl
          1 .2 .5
          1 .3 .5
          1 .8 .5
          1 2.5 .5
          2 1 .2
          2 1.2 .2
          2 1.8 .2
          2 .05 .2
          3 2.3 1.3
          3 1.4 1.3
          3 .8 1.3
          3 .5 1.3
          0.5 . .
          3.5 . .
          end
          label define chem 1 "Aldrin" 2 "BHC" 3 "DDT"
          label values chem chem
          
          bys chem : egen minc = min(conc)
          bys chem : egen maxc = max(conc)
          bys chem : egen medc = median(conc)
          
          twoway ///
            (rspike minc maxc chem, lcolor(blue) horizontal) ///
            (sc chem medc, ms(Dh) msize(large) mc(magenta)) ///
            (sc chem conc, ms(Oh) mc(blue)) ///
            (sc chem mrl, ms(Th) mc(red) msize(large)) ///
            , ylabel(1 "Aldrin" 2 "BHC" 3 "DDT", ang(h) noticks) ///
            title(Cucumbers: pesticide residues) ///
            ytitle(Pesticide identified) ///
            xtitle(Residue concentration in mg/kg) ///
            xlabel(0/3) ///
            legend(row(1)) legend(order(2 "Median" 4 "Legal limit"))



          • #6
            Thank you Nick, this is working well. In applying the simulation to my original data (which are identical in structure), I've run into three puzzling issues which I haven't been able to solve by studying the help for -twoway-.
            1. If there are >15 observations, the graph can't display all rows. Can the graph be "stretched" vertically?
            2. If a specific pesticide is missing for a given plant sample, the label is still displayed. And, I have 81 different pesticides and 85 different plant samples in the dataset, with numerous pesticide/plant pairs missing because they weren't detected in analysis. Thus, I'll be reporting 85 graphs in my report. Is there a way to instruct -twoway- to display only the labels for pesticides detected, without having to manually specify - and alphabetize - the existing pesticides detected on each of the plants?
            3. Can the -xlabel- be specified to display labels over whatever range exists in the data?

            Code for simulating and depicting these issues is below:

            clear
            input chem conc mrl
            1 .2 .5
            1 .3 .5
            1 .8 .5
            1 2.5 .5
            2 1 .2
            2 1.2 .2
            2 1.8 .2
            2 .05 .2
            3 2.3 1.3
            3 1.4 1.3
            3 .8 1.3
            3 .5 1.3
            4 . .
            4 . .
            4 . .
            5 .2 .9
            5 .3 .9
            5 .7 .9
            6 .8 1.4
            6 .6 1.4
            6 2.2 1.4
            7 4.8 2
            7 2.1 2
            7 1.5 2
            8 15 2
            8 3 2
            8 .5 2
            9 . .
            9 . .
            9 . .
            10 2 .8
            10 3 .8
            10 .1 .8
            11 4 1.6
            11 2 1.6
            11 3 1.6
            12 2 .8
            12 3 .8
            12 4 .8
            13 2 3
            13 2 3
            13 6 3
            14 4 1.6
            14 2 1.6
            14 3 1.6
            15 2 .8
            15 3 .8
            15 4 .8
            16 2 3
            16 2 3
            16 6 3

            end
            label define chem 1 "Aldrin" 2 "BHC" 3 "DDT" 4 "Chlorpyrifos" 5 "Endrin" ///
            6 "Sulfotep" 7 "Simazine" 8 "Phoxim" 9 "Permethrin" 10 "Parathion" 11 "Oxychlordane" ///
            12 "Fonofos" 13 "Heptachlor" 14 "BHC" 15 "Bendiocarb" 16 "Bifenthrin"
            label values chem chem

            bys chem : egen minc = min(conc)
            bys chem : egen maxc = max(conc)
            bys chem : egen medc = median(conc)
            l, noobs clean

            twoway ///
            (rspike minc maxc chem, lcolor(black) horizontal) ///
            (sc chem medc, ms(X) msize(huge) mc(black)) ///
            (sc chem conc, ms(ph) mc(black) msize(small)) ///
            (sc chem mrl, ms(T) mc(black) msize(large)) ///
            , ylabel(, ang(h) noticks) ///
            title(Chrysanthemum: pesticide residues) ///
            ytitle(Pesticide identified) ///
            xtitle(Residue concentration in mg/kg) ///
            xlabel(0/3) ///
            ylabel(1 "Aldrin" 2 "BHC" 3 "DDT" 4 "Chlorpyrifos" 5 "Endrin" ///
            6 "Sulfotep" 7 "Simazine" 8 "Phoxim" 9 "Permethrin" 10 "Parathion" ///
            11 "Oxychlordane" 12 "Fonofos" 13 "Heptachlor" 14 "BHC" 15 "Bendiocarb" 16 "Bifenthrin" ///
            , ang(h) noticks) ///
            legend(row(1)) legend(order(1 "Concentration range" 4 "Legal limit" 2 "Median concentration" )) ///
            note(Notes 1. Black diamond indicates MRL. MRL=Maximum Residue Limit)



            • #7
              Thanks for the reproducible example. In terms of your three questions:

              1. If there are >15 observations, the graph can't display all rows. Can the graph be "stretched" vertically?

              I don't understand the first part. Otherwise, check out the ysize() option.
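
              A minimal sketch of the ysize() suggestion, assuming the 16-pesticide example from #6 is in memory. The sizes (in inches) are arbitrary values chosen for illustration; the only point is that a taller graph leaves more room for each row.
              Code:
              * a reduced one-plot version of the graph; ysize() and xsize() set the overall graph size
              twoway (sc chem conc, ms(Oh) mc(black)) ///
              , ylabel(1/16, valuelabel ang(h) noticks) ///
              ysize(8) xsize(6)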

              2. If a specific pesticide is missing for a given plant sample, the label is still displayed. And, I have 81 different pesticides and 85 different plant samples in the dataset, with numerous pesticide/plant pairs missing because they weren't detected in analysis. Thus, I'll be reporting 85 graphs in my report. Is there a way to instruct twoway to display only the labels for pesticides detected, without having to manually specify - and alphabetize - the existing pesticides detected on each of the plants?

              3. Can the xlabel() option be specified to display labels over whatever range exists in the data?

              With your dataset as above, these can be addressed as follows.

              Code:
               
              egen gchem = group(chem) if conc < ., label 
              
              su gchem 
              local ymax = r(max)
              su conc 
              local xmax = ceil(r(max)) 
              local step = 1 
              
              twoway ///
              (rspike minc maxc gchem, lcolor(black) horizontal) ///
              (sc gchem medc, ms(X) msize(huge) mc(black)) ///
              (sc gchem conc, ms(ph) mc(black) msize(small)) ///
              (sc gchem mrl, ms(T) mc(black) msize(large)) ///
              , ylabel(1/`ymax', val ang(h) noticks) ///
              title(Chrysanthemum: pesticide residues) ///
              ytitle(Pesticide identified) ///
              xtitle(Residue concentration in mg/kg) ///
              xlabel(0(`step')`xmax') ///
              legend(row(1)) legend(order(1 "Concentration range" 4 "Legal limit" 2 "Median concentration" )) ///
              note(Notes 1. Black diamond indicates MRL. MRL=Maximum Residue Limit)
              We need to map the pesticides to integers 1 up, in the same order as chem, but only for those with non-missing values. That is what egen's group() function can do. Naturally you need to copy the value labels too, which is what the label option does.
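
              For the 85 separate graphs mentioned in #6, the same renumbering could be repeated inside a loop. The following is only a sketch under stated assumptions: the variable name plant is hypothetical (the example data have no such variable), it is assumed to be numeric with positive integer codes, and the graph is cut down to a single scatter for brevity.
              Code:
              * plant is a hypothetical numeric identifier for each plant sample
              levelsof plant, local(plants)
              foreach p of local plants {
                  preserve
                  * keep only the residues detected on this plant
                  keep if plant == `p' & conc < .
                  if _N > 0 {
                      * renumber the detected pesticides 1 up, copying the value labels
                      capture drop gchem
                      egen gchem = group(chem), label
                      su gchem, meanonly
                      local ymax = r(max)
                      twoway (sc gchem conc, ms(Oh) mc(black)) ///
                      , ylabel(1/`ymax', val ang(h) noticks) ///
                      name(plant`p', replace)
                  }
                  restore
              }
              Plants with no detected residues are skipped, because ylabel(1/`ymax') would otherwise be invalid for them.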

              To customize x axis labels, the first step is to find out the empirical maximum. In the example above the maximum concentration is 15, so you get labels at 0(1)15, which is already rather busy. You could easily customize the labels further by changing the local macro step. Stata's defaults will also work well for data like these.
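
              One way to choose the step automatically, offered as a sketch and not as a recommendation: aim for about five labelled intervals (an arbitrary target) and round the axis maximum up to a multiple of the step.
              Code:
              su conc, meanonly
              local xmax = ceil(r(max))
              * about five intervals, but never closer together than 1
              local step = max(1, ceil(`xmax'/5))
              * round the axis maximum up to a multiple of the step
              local xmax = `step' * ceil(`xmax'/`step')
              * show the option that would be passed to twoway
              display "xlabel(0(`step')`xmax')"
              With the example data (maximum 15) this works out to xlabel(0(3)15).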




              • #8
                Thank you very much for pointing out how to use egen's group() function and the step macro in -xlabel()-. I'm going to implement these with my data.



                • #9
                  Nick, the updated -twoway- graph returns the following error, which appears to be related to the -local ymax...- command, even though no error was returned when that command was run before the graph.

                  (note: named style ph not found in class symbol, default attributes used)
                  invalid label specifier, : 1/:
                  r(198);





                  • #10
                    Please show us the exact code that you typed (FAQ Advice Section 12).



                    • #11
                      Nick, the exact code is below, and in copying it here I discovered my error, which was in running the -local- commands and -twoway- as separate do-file runs. When I run them together, the graph is produced. I have a follow-up question: is it possible to have the pesticides on the y axis shown in alphabetical order?


                      * SIMULATION
                      clear
                      input chem conc mrl
                      1 .2 .5
                      1 .3 .5
                      1 .8 .5
                      1 2.5 .5
                      2 1 .2
                      2 1.2 .2
                      2 1.8 .2
                      2 .05 .2
                      3 2.3 1.3
                      3 1.4 1.3
                      3 .8 1.3
                      3 .5 1.3
                      4 . .
                      4 . .
                      4 . .
                      5 .2 .9
                      5 .3 .9
                      5 .7 .9
                      6 .8 1.4
                      6 .6 1.4
                      6 2.2 1.4
                      7 4.8 2
                      7 2.1 2
                      7 1.5 2
                      8 15 2
                      8 3 2
                      8 .5 2
                      9 . .
                      9 . .
                      9 . .
                      10 2 .8
                      10 3 .8
                      10 .1 .8
                      11 4 1.6
                      11 2 1.6
                      11 3 1.6
                      12 2 .8
                      12 3 .8
                      12 4 .8
                      13 2 3
                      13 2 3
                      13 6 3
                      14 4 1.6
                      14 2 1.6
                      14 3 1.6
                      15 2 .8
                      15 3 .8
                      15 4 .8
                      16 2 3
                      16 2 3
                      16 6 3
                      end

                      * DATA PREP
                      label define chem 1 "Aldrin" 2 "BHC" 3 "DDT" 4 "Chlorpyrifos" 5 "Endrin" ///
                      6 "Sulfotep" 7 "Simazine" 8 "Phoxim" 9 "Permethrin" 10 "Parathion" 11 "Oxychlordane" ///
                      12 "Fonofos" 13 "Heptachlor" 14 "BHC" 15 "Bendiocarb" 16 "Bifenthrin"
                      label values chem chem

                      bys chem : egen minc = min(conc)
                      bys chem : egen maxc = max(conc)
                      bys chem : egen medc = median(conc)
                      l, noobs clean

                      * TWOWAY

                      egen gchem = group(chem) if conc < ., label

                      su gchem
                      local ymax = r(max)
                      su conc
                      local xmax = ceil(r(max))
                      local step = 1
                      twoway ///
                      (rspike minc maxc gchem, lcolor(black) horizontal) ///
                      (sc gchem medc, ms(X) msize(huge) mc(black)) ///
                      (sc gchem conc, ms(ph) mc(black) msize(small)) ///
                      (sc gchem mrl, ms(T) mc(black) msize(large)) ///
                      , ylabel(1/`ymax', val ang(h) noticks) ///
                      title(Chrysanthemum: pesticide residues) ///
                      ytitle(Pesticide identified) ///
                      xtitle(Residue concentration in mg/kg) ///
                      xlabel(0(`step')`xmax') ///
                      legend(row(1)) legend(order(1 "Concentration range" 4 "Legal limit" 2 "Median concentration" )) ///
                      note(Notes 1. Black diamond indicates MRL. MRL=Maximum Residue Limit)
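
                      The error in #9 fits how local macros behave: a local macro exists only within the do-file run (or selection) that defines it. Run later and separately, `ymax' expands to nothing, so the option becomes ylabel(1/, ...), which is the "1/" in the error message. A small sketch to see this, assuming gchem has already been created:
                      Code:
                      su gchem, meanonly
                      local ymax = r(max)
                      * within the same run the macro expands to a number
                      display "ylabel(1/`ymax')"
                      * run the next line on its own afterwards and it shows ylabel(1/)
                      display "ylabel(1/`ymax')"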



                      • #12
                        You can do it alphabetically with some extra data management. Only this little bit is different: map the numeric variable with value labels to a string variable and then group on that, conditionally on there being a measurement to plot. Because group() numbers the distinct strings in sorted order, the groups come out alphabetically.

                        Code:
                        decode chem, gen(schem)
                        egen gchem = group(schem) if conc < ., label
                        The problem with ph can be traced to your typo where I had Oh (compare the marker symbols in posts #5 and #6).

                        Please use CODE mark-up. It takes two minutes to learn. All you need is in the FAQ Advice.
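
                        A slightly fuller sketch of the alphabetical mapping, assuming the data and value labels from #11 are in memory. The capture drop lines and the tabulate check are additions for illustration.
                        Code:
                        * remove earlier versions of the helper variables, if any
                        capture drop gchem
                        capture drop schem
                        * copy the value labels of chem into a string variable
                        decode chem, gen(schem)
                        * number the distinct names 1 up in alphabetical order, only where there is a measurement
                        egen gchem = group(schem) if conc < ., label
                        * check the resulting mapping
                        tabulate gchem
                        One thing to watch in the example data: codes 2 and 14 are both labelled "BHC", so after decode they fall into a single group.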



                        • #13
                          Thanks Nick, I'm using CODE markup now. The sorting step makes sense; however, the pesticides appear in reverse alphabetical order, even if I change
                          Code:
                          sort gchem
                          to
                          Code:
                          gsort -chem
                          or if I insert sort as an option in each of the subgraph statements. Any suggestions on a solution or workaround would be greatly appreciated!



                          Code:
                          * SIMULATION
                          clear
                          input chem conc mrl
                          1 .2 .5
                          1 .3 .5
                          1 .8 .5
                          1 2.5 .5
                          2 1 .2
                          2 1.2 .2
                          2 1.8 .2
                          2 .05 .2
                          3 2.3 1.3
                          3 1.4 1.3
                          3 .8 1.3
                          3 .5 1.3
                          4 . .
                          4 . .
                          4 . .
                          5 .2 .9
                          5 .3 .9
                          5 .7 .9
                          6 .8 1.4
                          6 .6 1.4
                          6 2.2 1.4
                          7 4.8 2
                          7 2.1 2
                          7 1.5 2
                          8 15 2
                          8 3 2
                          8 .5 2
                          9 . .
                          9 . .
                          9 . .
                          10 2 .8
                          10 3 .8
                          10 .1 .8
                          11 4 1.6
                          11 2 1.6
                          11 3 1.6
                          12 2 .8
                          12 3 .8
                          12 4 .8
                          13 2 3
                          13 2 3
                          13 6 3
                          14 4 1.6
                          14 2 1.6
                          14 3 1.6
                          15 2 .8
                          15 3 .8
                          15 4 .8
                          16 2 3
                          16 2 3
                          16 6 3
                          end
                          
                          * DATA PREP
                          label define chem 1 "Aldrin" 2 "BHC" 3 "DDT" 4 "Chlorpyrifos" 5 "Endrin" ///
                          6 "Sulfotep" 7 "Simazine" 8 "Phoxim" 9 "Permethrin" 10 "Parathion" 11 "Oxychlordane" ///
                          12 "Fonofos" 13 "Heptachlor" 14 "BHC" 15 "Bendiocarb" 16 "Bifenthrin"
                          label values chem chem
                          
                          bys chem : egen minc = min(conc)
                          bys chem : egen maxc = max(conc)
                          bys chem : egen medc = median(conc)
                          l, noobs clean
                          
                          * TWOWAY
                          
                          egen gchem = group(chem) if conc < ., label
                          su gchem
                          local ymax = r(max)
                          su conc
                          local xmax = ceil(r(max))
                          local step = 1
                          twoway ///
                          (rspike minc maxc gchem, lcolor(black) horizontal) ///
                          (sc gchem medc, ms(X) msize(huge) mc(black)) ///
                          (sc gchem conc, ms(ph) mc(black) msize(small)) ///
                          (sc gchem mrl, ms(T) mc(black) msize(large)) ///
                          , ylabel(1/`ymax', val ang(h) noticks) ///
                          title(Chrysanthemum: pesticide residues) ///
                          ytitle(Pesticide identified) ///
                          xtitle(Residue concentration in mg/kg) ///
                          xlabel(0(`step')`xmax') ///
                          legend(row(1)) legend(order(1 "Concentration range" 4 "Legal limit" 2 "Median concentration" )) ///
                          note(Notes 1. Black diamond indicates MRL. MRL=Maximum Residue Limit) ///
                          scale(0.7)



                          • #14
                            Code:
                             
                            ysc(reverse)
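
                            The sort order of the data makes no difference here because the y axis runs upward by default, so group 1 (alphabetically first) sits at the bottom. Reversing the scale puts it at the top. A reduced sketch showing where the option goes, assuming gchem exists as in the code above:
                            Code:
                            su gchem, meanonly
                            local ymax = r(max)
                            * ysc() is an abbreviation of yscale()
                            twoway (sc gchem conc, ms(Oh) mc(black)) ///
                            , ylabel(1/`ymax', val ang(h) noticks) ///
                            yscale(reverse)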



                            • #15
                              Much appreciated! Works like a charm.
