  • -stripplot- package updated on SSC: new command -strip-

    Thanks as ever to Kit Baum, the stripplot package has been updated. The update consists of a new command strip, which requires Stata 9 or higher.

    stripplot is still included -- I don't want to break any scripts that might be using it -- but stripplot has reached the end of its road, and I won't enhance it in future (although I will try to fix any further bugs reported).

    strip is stripplot reduced in code complexity. So why did I do that?

    stripplot was born as onewplot in 1999 (the rather odd name had to fit within the filename.ext pattern for filenames in MS-DOS -- filenames could be shorter, but not longer). It was renamed as onewayplot in 2003 on being rewritten for Stata 8, and then renamed stripplot in 2005.

    stripplot has accumulated options over the years, but I became dissatisfied with it for various reasons. The options now removed in the cut-down version strip concerned extras: adding confidence interval bars, boxplots of various styles, Tufte-style quartile or midgap plots, and reference lines. The loss of functionality is much less than might be guessed, as you can still add details to the graphs through its addplot() option.
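
    A minimal sketch of the addplot() route, assuming the auto data and group means as the extra detail. The variable names mean and where, the offset of 0.1, and the marker choices are illustrative assumptions; the strip options used are only those that appear elsewhere in this thread.

    Code:
    sysuse auto, clear

    * group means, to be superimposed as open diamonds
    egen mean = mean(mpg), by(foreign)

    * shift the mean markers slightly left of each stack of points
    gen where = foreign - 0.1

    #delimit ;
    strip mpg, over(foreign) stack height(0.3) vertical legend(off)
    addplot(scatter mean where, ms(Dh) msize(large) mcolor(red))
    ;
    #delimit cr

    Anything that twoway can draw from variables in the dataset can be supplied in the same way.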

    Confidence interval calculations in stripplot were geared to ci as it existed before Stata 14.1, when the syntax was changed. An update of some kind was overdue. It may seem odd that the update consisted of removing the options, but what stripplot can do (and more) is, so far as I am concerned, better done by reversing the order of operations: calculate the intervals first, and then plot them.

    cisets from SSC (and the Stata Journal from 26(2) -- the next issue, due around June) produces confidence interval sets for various summary measures, after which confidence intervals can be plotted directly. https://www.statalist.org/forums/for...-interval-sets explains and exemplifies in more detail, as does that paper forthcoming in the Stata Journal.
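
    The order of operations (intervals first, plot second) can be sketched without cisets, whose syntax is not shown in this post. What follows is an assumption-laden stand-in using only official Stata: t-based 95% intervals for group means, calculated by hand. The variable names mean, sd, n, lb, ub and foreign2 and the offset of 0.4 are all illustrative.

    Code:
    sysuse auto, clear

    * summary statistics by group
    egen mean = mean(mpg), by(foreign)
    egen sd = sd(mpg), by(foreign)
    egen n = count(mpg), by(foreign)

    * t-based 95% confidence limits for each group mean
    gen ub = mean + invttail(n - 1, 0.025) * sd / sqrt(n)
    gen lb = mean - invttail(n - 1, 0.025) * sd / sqrt(n)

    * position for the interval, to the right of each stack of points
    gen foreign2 = foreign + 0.4

    #delimit ;
    strip mpg, over(foreign) stack height(0.3) vertical legend(off)
    addplot(
       rcap ub lb foreign2, lcolor(black)
    || scatter mean foreign2, ms(D) mcolor(black))
    ;
    #delimit cr

    With cisets the hand calculation would be replaced by the results that command leaves behind; see the thread linked above for its actual syntax.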

    The other options that have been removed were not outdated, but the main reason for removing them was a sense that the syntax had become too complicated. The help file still includes a large number of references in its territory, largely because of a personal habit of using help files as my notes on projects based on Stata commands, some of which get written up in due course as papers in the Stata Journal. If I think up new options or new examples or encounter new references, they tend to get added to the help file.

    The help for strip includes the code for 30 graph examples, which can be run directly using the ancillary file strip_examples.do

    You can install strip by installing the stripplot package for the first time or updating your installation if you've installed it before.

    Code:
    * first installation
    ssc install stripplot

    * updating an existing installation
    ssc install stripplot, replace

    Here is a sampler of graphs from strip.

    [Images: STRIP7.png, STRIP19.png, STRIP21.png, STRIP27.png, STRIP30.png]

  • #2
    I've been alerted to a question at https://www.reddit.com/r/stata/comme...tat_help_reqd/

    The OP there evidently wants two median-quartile boxes with different colours. They don't give a data example I can use, but this gives some flavour that can be adapted according to taste and circumstance.

    Code:
    clear
    sysuse auto
    
    egen med = median(mpg), by(foreign)
    egen max = max(mpg), by(foreign)
    egen min = min(mpg), by(foreign)
    egen p25 = pctile(mpg), p(25) by(foreign)
    egen p75 = pctile(mpg), p(75) by(foreign)
    
    gen foreign2 = foreign + 0.5
    
    local colour1 red
    local colour2 blue
    local standard fcolor(none) barw(0.12)
    
    #delimit ;
    strip mpg, over(foreign) stack height(0.3) vertical
    separate(foreign) ms(S ..) msize(medium ..) mcolor(`colour1' `colour2') legend(off)
    xla(, ang(-0.001) tlength(*2) tlc(none))
    addplot(
       rbar p25 med foreign2 if !foreign, lcolor(`colour1') `standard'
    || rbar p75 med foreign2 if !foreign, lcolor(`colour1') `standard'
    || rbar p25 med foreign2 if foreign, lcolor(`colour2')  `standard'
    || rbar p75 med foreign2 if foreign, lcolor(`colour2')  `standard'
    || rspike p75 max foreign2 if !foreign, lcolor(`colour1')
    || rspike p75 max foreign2 if foreign, lcolor(`colour2')
    || rspike p25 min foreign2 if !foreign, lcolor(`colour1')
    || rspike p25 min foreign2 if foreign, lcolor(`colour2'))
    ;
    #delimit cr

    [Image: strip_twoboxes.png]

    What would usually or often be changed:

    * variable names of outcome and grouping variable

    * the height (here width) assigned to the stripplot display

    * marker symbol and size

    * the colours

    * code that depends on the grouping variable being 0, 1
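
    On the last point, a hedged sketch of one way to drop the dependence on 0, 1 coding: if separate colours per group are not needed, the if conditions can go, and a single set of rbar and rspike calls serves every group. The grouping variable rep78 and the new variable name rep78_2 are assumptions for illustration; the offset 0.5 and the bar width are carried over from the code above.

    Code:
    sysuse auto, clear
    drop if missing(rep78)

    egen med = median(mpg), by(rep78)
    egen max = max(mpg), by(rep78)
    egen min = min(mpg), by(rep78)
    egen p25 = pctile(mpg), p(25) by(rep78)
    egen p75 = pctile(mpg), p(75) by(rep78)

    * box position, to the right of each stack of points
    gen rep78_2 = rep78 + 0.5

    local standard fcolor(none) barw(0.12)

    #delimit ;
    strip mpg, over(rep78) stack height(0.3) vertical legend(off)
    addplot(
       rbar p25 med rep78_2, `standard'
    || rbar p75 med rep78_2, `standard'
    || rspike p75 max rep78_2
    || rspike p25 min rep78_2)
    ;
    #delimit cr

    Colours that differ by group would still need one set of calls per group, as in the two-group code above.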

    My stance (yours may vary) is that if all the data are shown in the stripplot you don't need to bother with arbitrary stuff depending on whether data points are or are not more than 1.5 IQR from the nearer quartile. Just draw spikes to paired percentiles, which could be the minimum and maximum, but need not be.

    As the Reddit thread points out, starting with graph box is not going to help here.
