  • Nick Cox
    started a topic stripplot updated on SSC

    stripplot updated on SSC

    The single-program package stripplot has been updated on SSC, thanks as usual to Kit Baum. Stata 8.2 is required.

    If interested, install or replace with ssc or adoupdate.
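
    A minimal sketch of either route, assuming a net-aware Stata; which simply reports the version now installed:

    Code:
    * fresh install, or overwrite an existing copy
    ssc install stripplot, replace
    
    * or, if an earlier version is already installed
    adoupdate stripplot, update
    
    * check which version is now on the adopath
    which stripplot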

    stripplot started in 1999 as an alternative to graph, oneway (since Stata 8 gr7, oneway) but has since morphed by a mix of accident and design into an alternative to the official command dotplot. The aim is to compare distributions and a variety of displays is possible, including linear and stacked dot and strip plots, in conjunction with box plots or confidence intervals.

    The main changes in this version are

    1. Much extended help file.

    2. Side-by-side quantile plots (or less usefully, in my view, cumulative distribution plots stacked vertically) as an alternative display.

    3. Horizontal reference lines for e.g. means and medians (with a documented limit to this). This detail was sparked by users' comments at the Boston meeting in July on the +++ reference lines allowed with dotplot.

    #2 and #3 are exemplified by this token graph using Stata's citytemp dataset. The reference lines are means in this instance.



    [Figure: stripplot_new.png]



    Here is another example, with Stata's auto data used to show a hybrid box and quantile plot. (Box plots can leave out so much....) The reference lines here are medians.

    Last edited by Nick Cox; 05 Sep 2014, 08:21.

  • Laura Cordova
    replied
    Thanks for that! laura

  • Nick Cox
    replied
    The graph in #22 was in #1 but without the syntax, which would be something close to

    Code:
    sysuse auto, clear
    
    stripplot price, over(foreign) pctile(5) box(barw(0.08)) cumul vertical boffset(-0.1) ///
    xla(, noticks) refline reflevel(median) yla(0(2500)15000) ///
    subtitle(whiskers to 5 and 95% points) ytitle(Price (USD))

  • Laura Cordova
    replied
    Dear Nick Cox
    How do you add the whisker at 5% and 95% for this graph? Thanks very much for your help. Laura
    Last edited by Laura Cordova; 10 Nov 2023, 11:39.

  • Nick Cox
    replied
    Nazzarena #20 #19: The name over() is stolen from its use in other Stata commands, but in no other sense is it a gateway to over() as implemented for, say, graph bar, graph hbar or graph dot. Indeed stripplot is a wrapper for graph twoway.
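
    A sketch of what that implies in practice, using bplong (the variable choices and the new variable name agesex are illustrative assumptions only): axis label options go through xla() or yla(), as with twoway, and a second grouping can be approximated with separate() or with a composite grouping variable.

    Code:
    sysuse bplong, clear
    
    * label options belong in xla(), not inside over()
    stripplot bp, over(sex) vertical msize(tiny) xla(, labsize(small) angle(vertical))
    
    * distinguish markers within each age group according to sex
    stripplot bp, over(agegrp) separate(sex) vertical msize(tiny)
    
    * or cross the two variables into a single grouping variable
    egen agesex = group(agegrp sex), label
    stripplot bp, over(agesex) vertical msize(tiny) xla(, labsize(small) angle(45))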

    Mark Horowitz: It’s a point of principle with me that data elements are drawn last. You can tune the balance by using open markers or smaller markers or thicker lines. See earlier in the thread for other uses of reference lines.
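
    For instance, a sketch based on the auto data example in the question (the particular symbol, marker size and line width are arbitrary choices, not recommendations):

    Code:
    sysuse auto, clear
    
    * open, smaller markers and a thicker line help keep the median line visible
    stripplot price, over(foreign) refline(lcolor(red) lwidth(thick)) reflevel(median) vertical ///
    width(200) height(0.2) stack center msymbol(Oh) msize(small)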

  • Nazzarena
    replied
    Incidentally, I can't seem to make the cat axis label options work with stripplot ...

    sysuse bplong, clear
    stripplot bp*, over(sex, ) vertical msize(tiny) mcolor (navy%2 )

    works normally

    stripplot bp*, over(sex, label(labsize(small) angle(vertical))) vertical msize(tiny) mcolor (navy%2 )
    returns
    over() does not contain a valid varname


  • Nazzarena
    replied
    Hello,

    is there a way to circumvent the limitation of 1 over option? One possibility would be to superimpose another stripplot, but plot/addplot/twoway do not work as stripplot isn't a twoway graph.
    Essentially, I would like to turn very busy box graphs with 2 over dimensions into a more readable stripplot (preferred) or dotplot: it seems that, using shading, I can approximate a median and IQR graphically without actually indicating them. Even changing colour within one over covariate according to the other (as appears to be possible in the above examples) would work.
    Any other solution is also welcome. These are matrices of generally 5 to 70 ish (first over covariate) x 2 ish (second over covariate) x whatever number of study participants (say 30ish to 3000ish). When I want to show variability, I use heatmaps with 2 dimensions etc., but sometimes it is preferable to show the three dimensions and emphasize the central tendency and the spread of the dependent variable.

    I'm trying to make up a dataset but let me see if I can find something suitable in the classic BP, car, etc data. Will edit

    Thank you so very much

  • Mark Horowitz
    replied
    Nick Cox, what if the reference line were drawn last instead of first? With my own data, I'm finding that many times the reference line is obscured by the markers because it's drawn first. I attached a close example of what I'm seeing using the auto dataset. The second attachment is what it would look like if the reference line were drawn last.

    Code:
    sysuse auto, clear
    xtile mpg_p50 = mpg
    
    stripplot price, over(foreign) refline(lcolor(red)) reflevel(median) vertical ///
    xscale(range(-0.5 1.5)) width(200) height(0.2) stack center separate(mpg_p50) ///
    msymbol(O) mcolor(green maroon) legend(off)

  • Nick Cox
    replied
    Claims of urgency are not a good idea. There is a hint about this in the FAQ Advice.

    A bigger deal is that I can't play with your problem without example data. If I set up a parallel problem

    Code:
    sysuse auto, clear 
    
    label def rep78 1 "abysmal" 2 "appalling" 3 "adequate" 4 "admirable" 5 "awesome", replace
    label val rep78 rep78 
    
    stripplot rep78, over(foreign) refline(lcolor(erose)) reflevel(mean) tufte(lcolor(eltgreen) mcolor(red) ms(sh)) ///
    ytitle(, size(3)) ysc(titlegap(+4) outergap(-3)) cumul cumprob centre vertical height(0.2) yla(1 "abysmal" 2 "appalling" 3 4 5,  ang(h)) xla(, noticks)
    I can confirm that stripplot loses sight of any value labels. There are circumstances in which this is a feature, but you need that not to happen and the quickest work-around is to specify what you want on the fly, as I've done for two categories in the last command.
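
    Applied to the variables in the question, that work-around would look something like this (untested, since the data were not posted; the label text is copied from the label definition in the question):

    Code:
    * spell out the labels in yla() rather than relying on valuelabel
    stripplot h10aCPecon, over(Treat) refline(lcolor(erose)) reflevel(mean) tufte(lcolor(eltgreen) mcolor(red) ms(sh)) ///
    ytitle(, size(3)) cumul cumprob centre vertical height(0.2) yla(1 "SER" 2 "TUR" 3 "AGR" 4 "IND", ang(h)) xla(, noticks)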



  • Ivan Gonzalez
    replied
    Dear Nick Cox ,
    I would kindly appreciate your urgent help with a tiny yet very important and urgent issue for me:

    For some reason my stripplot is not applying the valuelabel option on my y-axis. My code:

    Note that Treatment is binary (0, 1) and h10aCPecon is a nominal variable which indicates whether the investment was made in any of 4 economic sectors:
    label def h10aCPeconx 1 "SER" 2 "TUR" 3 "AGR" 4 "IND", replace
    label val h10aCPecon h10aCPeconx

    stripplot h10aCPecon, over(Treat) refline(lcolor(erose)) reflevel(mean) tufte(lcolor(eltgreen) mcolor(red) ms(sh)) ytitle(, size(3)) ysc(titlegap(+4) outergap(-3)) cumul cumprob centre vertical height(0.2) yla(0/4, valuelabel ang(h)) xla(, noticks)

    Note that I even tried adding more ysc outergap for the ylabel. But stripplot is showing 1 2 3 4 on the y-axis instead of the desired sectors SER TUR AGR IND.

    Please note that I just updated the Stripplot with . ssc inst stripplot,replace

    Thank you very much in advance!

  • Nick Cox
    replied
    Now updated on SSC as of 2.9.1 30 January 2022 and with extra material in the help file, thanks to Kit Baum.

  • Nick Cox
    replied
    Files should all be downloadable now.

  • Nick Cox
    replied
    The .pkg is updated, but the files aren't. Sorry about that. I will check with Kit Baum.

  • Raymond Zhang
    replied
    Dear Nick, I have updated stripplot, but it is still version 2.8.1
    Code:
    . ssc inst stripplot,replace
    checking stripplot consistency and verifying not already installed...
    all files already exist and are up to date.
    
    . which stripplot
    /Applications/Stata17/ado/plus/s/stripplot.ado
    *! 2.8.1 NJC 11 October 2020

  • Nick Cox
    replied
    Thanks yet again to Kit Baum, stripplot has been updated on SSC. Although not originally written as such, stripplot is in practice roughly a superset of the official command dotplot, or at least based on the same main idea: displays of univariate distributions as marker or point symbols for each value against a magnitude axis.

    This program goes back to 1999, when I wrote onewplot for Stata 6 as a variation on what was then graph, oneway (now graph7, oneway). The odd program name onewplot is explained by the constraint at the time that no community-contributed command name could be longer than 8 characters (because the operating system MS-DOS only allowed filenames that fitted within a filename.ext format with at most 8 letters in the main part). Then onewayplot was written for Stata 8 and I changed the name to stripplot in 2005.

    Fast forward to 2021 and what's new in 2.9.0?

    * The help file has been extended with yet more references.

    * The examples have been tweaked and the code for all the examples is available as a separate .do file which when run yields 36 example plots. If interested, you should run the do file, which generates all the graphs as named graphs, and then delete one at a time while noting any that might be useful for your own work.

    * New options have been added to support what I will call Tufte plots in this context. The idea is a minimal version of a box plot showing just a marker for the median and whiskers joining the quartiles to the extremes (minimum and maximum). So, the box is tacit and some might demur at calling it a box plot at all, and indeed Tufte in 1983 called it a quartile plot and others have called it a midgap plot. Names should not matter, except that they do: a good name can be evocative and encouraging, and a poor name can confuse or even condemn a good idea to obscurity. I didn't take much to the idea when I first met it but more recently it has grown on me when used as an adjunct to a more detailed plot. One of several details here is that a box plot can over-emphasise the middle of each distribution when, for many problems, the tails are as important or more important.

    Such experiences underpin a frequent observation: you have to work with (play with) a graph design for a while before you can get a good feeling for how and how well it can work. In contrast, a fashionable genre of research -- at present more popular among computer scientists, it seems, than among statisticians, although the latter were doing it in the 1920s -- invites captive audiences to compare graph designs that may be new to them. How quickly a design can be learned is of some interest, but I would rather know how well a design fares with repeated research practice and real data.

    In https://www.statalist.org/forums/for...-without-boxes I documented how to use stripplot to draw such plots. What is new is being able to do it directly:

    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    . set scheme s1color
    
    . stripplot mpg , over(foreign) tufte ms(Sh) height(0.2) stack vertical yla(, ang(h)) xla(, noticks)

    [Figure: tufte.png]

    You can also, as usual, modify cosmetic detail, such as the colour of the whiskers and the marker symbol.

    The example is typical of stripplot experience. The defaults of stripplot are fairly banal and something you could have knocked up yourself from first principles. So

    Code:
    stripplot mpg

    isn't far from

    Code:
    gen whatever = 42
    scatter whatever mpg
    followed by obscuring the evidence of the response variable. It's the options that give stripplot its flexibility.
    Last edited by Nick Cox; 11 Jul 2021, 11:28.
