  • stripplot updated on SSC

    The single-program package stripplot has been updated on SSC, thanks as usual to Kit Baum. Stata 8.2 is required.

    If interested, install or replace with ssc or adoupdate.
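
    As a minimal sketch of those two routes (both are standard official commands; nothing here is specific to this release):

    Code:
    * install from SSC, or overwrite an older installed copy
    ssc install stripplot, replace

    * or update an existing installation in place
    adoupdate stripplot, update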

    stripplot started in 1999 as an alternative to graph, oneway (since Stata 8 gr7, oneway) but has since morphed, by a mix of accident and design, into an alternative to the official command dotplot. The aim is to compare distributions, and a variety of displays is possible, including linear and stacked dot and strip plots, in conjunction with box plots or confidence intervals.

    The main changes in this version are

    1. Much extended help file.

    2. Side-by-side quantile plots (or less usefully, in my view, cumulative distribution plots stacked vertically) as an alternative display.

    3. Horizontal reference lines for e.g. means and medians (with a documented limit to this). This detail was sparked by users' comments at the Boston meeting in July on the +++ reference lines allowed with dotplot.

    #2 and #3 are exemplified by this token graph using Stata's citytemp dataset. The reference lines are means in this instance.
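
    For orientation, a command of roughly this shape gives side-by-side quantile plots with reference lines. This is a sketch, not the code behind the graph: the choice of tempjan and region is an assumption, and only options that appear elsewhere in this thread are used.

    Code:
    sysuse citytemp, clear
    * vertical with cumul puts quantile plots side by side; refline adds reference lines
    stripplot tempjan, over(region) vertical cumul centre refline yla(, ang(h))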



    [Image: stripplot_new.png]



    Here is another example, with Stata's auto data used to show a hybrid box and quantile plot. (Box plots can leave out so much....) The reference lines here are medians.

    Last edited by Nick Cox; 05 Sep 2014, 08:21.

  • #2
    Thanks to Kit Baum (again), stripplot has been updated on SSC (again).

    Apart from extending the help file, the most obvious changes concern

    1. Adding a cumulative probability option for the cumulated displays (distribution function or quantile function, depending on whether horizontal or vertical alignment is chosen). As the second example in the previous post especially makes clear, cumulative graphs previously implied a frequency scale. Adding a probability scale has the simple advantage that each group of values is shown in about the same space. This permits easy combination with box plots:

    [Image: stripplot_box4.png]

    The idea (ideal!) is to get the best of both worlds, not only the summarizing function of box plots but also the detail provided by quantile plots (not just possible outliers, but also fine structure such as gaps or granularity that may or may not need scrutiny). Note that there is nothing novel or original in these plots except the implementation: Parzen was urging the use of hybrid quantile-box plots in 1979 and geographers were doing the same thing in spirit in 1933.

    The code for the above example is

    Code:
     
    sysuse auto, clear 
    stripplot mpg, over(rep78) box(barw(0.8) blcolor(ltblue)) centre vertical cumul cumpr mc(orange) scheme(s1color) yla(, ang(h))

    2. Adding an outside option so that stripplot can be used to draw box plots only. There is a detailed example in the help, so I will not illustrate.



    • #3
      May I ask a question about stripplot here?

      Assume

      Code:
      sysuse auto
      
      stripplot length, cumul cumprob box centre over(rep78) refline xsize(3)
      Is there any way to sort them from 1 (top) to 5 (bottom) on the rep78 scale? With the vertical option the order seems fine, but not in the horizontal case; see my example code.

      (I need a lot of plots like this and they are all neatly labelled, so recoding before plotting seems excessive, but so far it is the only option I have found.)

      I seem to be unable to find anything about this in the help file or on the web. I cannot be the only one...

      Thanks in advance!



      • #4
        ysc(reverse) is a standard twoway option.

        Code:
        . sysuse auto, clear 
        
        . stripplot length, cumul cumprob box centre  over(rep78) refline  xsize(3)
        
        . stripplot length, cumul cumprob box centre  over(rep78) refline  xsize(3) ysc(reverse)
        The results look fairly odd in this case, but an example in the help for stripplot is more convincing:

        Code:
        sysuse bplong, clear 
        
        egen group = group(age sex), label
        
        stripplot bp*, bar over(when) by(group, compact col(1) note("")) ysc(reverse) subtitle(, pos(9) ring(1) nobexpand bcolor(none) placement(e)) ytitle("") xtitle(Blood pressure (mm Hg))



        • #5
          Thank you, this helped me very much!



          • #6
            Nick is the best



            • #7
              Thanks again to Kit Baum, stripplot has been updated on SSC. The immediate stimulus is a bug fix arising from a problem kindly reported by Dave Airey in https://www.statalist.org/forums/for...pplot-question

              The last public update was about 3 years ago. Since then I've been steadily elaborating the help file, by adding references and also examples. The project goes back to 1999, so I now have a fair collection of references, but more are welcome.

              Example code and references for "midgap plots" as discussed at https://www.statalist.org/forums/for...-without-boxes have been added, for example.



              • #8
                I have a similar question to above that I cannot answer.

                I want to sort the categorical variables according to their mean score.

                Assume the following
                Code:
                sysuse bplong, clear
                egen group = group(age sex), label
                stripplot bp, bar over(group) ms(none)
                Is it possible to order groups (y axis) by mean blood pressure (x axis) as part of the stripplot command?

                Thanks in advance!


                • #9
                  Something like


                  Code:
                  sysuse bplong, clear
                  egen group = group(age sex), label
                  stripplot bp, bar over(group) ms(none)
                  
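                  * mean blood pressure within each group
                  egen mean = mean(bp), by(group)

                  * integer codes that follow the means, with group breaking any ties
                  egen order = group(mean group)

                  * labmask is from the Stata Journal; it copies the value labels of group to order
                  labmask order, values(group) decode

                  stripplot bp, bar over(order) ms(none)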



                  • #10
                    This was exactly what I was looking for. Thanks so much!



                    • #11
                      Thanks yet again to Kit Baum, stripplot has been updated on SSC. Although not originally written as such, stripplot is in practice roughly a superset of the official command dotplot, or at least based on the same main idea: displays of univariate distributions as marker or point symbols for each value against a magnitude axis.

                      This program goes back to 1999, when I wrote onewplot for Stata 6 as a variation on what was then graph, oneway (now graph7, oneway). The odd program name onewplot is explained by the constraint at the time that no community-contributed command name could be longer than 8 characters (because the operating system MS-DOS only allowed filenames that fitted within a filename.ext format with at most 8 letters in the main part). Then onewayplot was written for Stata 8 and I changed the name to stripplot in 2005.

                      Fast forward to 2021 and what's new in 2.9.0?

                      * The help file has been extended with yet more references.

                      * The examples have been tweaked and the code for all the examples is available as a separate .do file which when run yields 36 example plots. If interested, you should run the do file, which generates all the graphs as named graphs, and then delete one at a time while noting any that might be useful for your own work.

                      * New options have been added to support what I will call Tufte plots in this context. The idea is a minimal version of a box plot showing just a marker for the median and whiskers joining the quartiles to the extremes (minimum and maximum). So, the box is tacit and some might demur at calling it a box plot at all, and indeed Tufte in 1983 called it a quartile plot and others have called it a midgap plot. Names should not matter, except that they do: a good name can be evocative and encouraging, and a poor name can confuse or even condemn a good idea to obscurity. I didn't take much to the idea when I first met it but more recently it has grown on me when used as an adjunct to a more detailed plot. One of several details here is that a box plot can over-emphasise the middle of each distribution when the tails are as important or more important for many problems.
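
                      The run-then-prune workflow suggested for the example .do file might look like the sketch below. The file name and the graph name are placeholders, not the real names; ssc describe lists the files that actually belong to the package, and the do command assumes the file is in the current working directory.

                      Code:
                      * list the files in the package, including the examples .do file
                      ssc describe stripplot
                      * placeholder file name: substitute the name shown by ssc describe
                      do stripplot_examples.do
                      * list the named graphs now in memory
                      graph dir, memory
                      * inspect one graph, then drop it if it is not useful (g1 is a placeholder name)
                      graph display g1
                      graph drop g1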

                      Such experiences underpin a frequent observation: you have to work with (play with) a graph design for a while before you can get a good feeling for how and how well it can work. In contrast, a fashionable genre of research -- at present more popular among computer scientists, it seems, than among statisticians, although the latter were doing it in the 1920s -- invites captive audiences to compare graph designs that may be new to them. How quickly a design can be learned is of some interest, but I would rather know how well a design fares with repeated research practice and real data.

                      In https://www.statalist.org/forums/for...-without-boxes I documented how to use stripplot to draw such plots. What is new is being able to do it directly:

                      Code:
                      . sysuse auto, clear
                      (1978 automobile data)
                      
                      . set scheme s1color
                      
                      . stripplot mpg , over(foreign) tufte ms(Sh) height(0.2) stack vertical yla(, ang(h)) xla(, noticks)

                      [Image: tufte.png]




                      You can also, as usual, modify cosmetic detail, such as the colour of the whiskers and the marker symbol.

                      The example is typical of stripplot experience. The defaults of stripplot are fairly banal and something you could have knocked up yourself from first principles. So

                      Code:
                      stripplot mpg

                      isn't far from

                      Code:
                      gen whatever = 42
                      scatter whatever mpg
                      followed by obscuring the evidence of the response variable. It's the options that give stripplot its flexibility.
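
                      To make the comparison concrete, here is a rough sketch of the do-it-yourself version, including the obscuring step. The options used to hide the axis are an assumption about what that step would involve, and only one way to do it.

                      Code:
                      sysuse auto, clear
                      gen whatever = 42
                      * hide the labels, ticks and title of the made-up y axis
                      scatter whatever mpg, yla(none) ytitle("")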
                      Last edited by Nick Cox; 11 Jul 2021, 11:28.



                      • #12
                        Dear Nick, I have updated stripplot, but it is still version 2.8.1
                        Code:
                        . ssc inst stripplot,replace
                        checking stripplot consistency and verifying not already installed...
                        all files already exist and are up to date.
                        
                        . which stripplot
                        /Applications/Stata17/ado/plus/s/stripplot.ado
                        *! 2.8.1 NJC 11 October 2020
                        Best regards.

                        Raymond Zhang
                        Stata 17.0,MP



                        • #13
                          The .pkg is updated, but the files aren't. Sorry about that. I will check with Kit Baum.



                          • #14
                            Files should all be downloadable now.
