  • subsetplot available on SSC

    Thanks to Kit Baum as usual, a new program subsetplot is now available to download from SSC. Stata 8.2 is required.

    subsetplot produces an array of scatter or other twoway plots for yvarlist versus xvar according to a further variable byvar. There is one plot for each distinct subset of byvar, in which the data for that subset are highlighted and the rest of the data are shown as backdrop. Graphs are drawn individually and then combined with graph combine.

    That's a little abstract, but some examples should help. We all know that to compare relationships graphically between groups of observations, we can superimpose different groups in a single plot or juxtapose different groups in several plots. subsetplot is a hybrid approach combining elements of those two strategies. Consider this code:

    Code:
    set scheme s1color
    sysuse auto, clear
    subsetplot scatter mpg weight, subset(ms(none) mla(rep78) mlabsize(*1.5) mlabpos(0) mlabcolor(blue)) by(rep78)

    Each subset is shown in turn with the rest of the data as backdrop. In the case of ordered categories such as repair record, each value could serve as its own symbol:
    [Figure: subsetplot_2.png]

    Here's one more example. With panel data in particular, the problem of spaghetti plots is pervasive across several fields. In principle, plotting several time series in one plot is showing all the information. In practice, it can be hard to see the trees for the wood, to change the metaphor.

    Code:
     
    webuse grunfeld
    subsetplot line invest year, by(company) ysc(log) yla(1 10 100 1000)
    [Figure: subsetplot_3.png]


    This approach was discussed in Cox (2010). See also Schwabish (2014) for an example. Readers who know of interesting or useful examples or discussions, especially early in date or comprehensive in detail, are welcome to email the author. It's hard to believe that this simple idea doesn't go way back, but at present I lack the references.

    Cox, N.J. 2010. Graphing subsets. Stata Journal 10: 670-681.

    Schwabish, J.A. 2014. An economist's guide to visualizing data. Journal of Economic Perspectives 28: 209-234.

  • #2
    This seems a wonderful way of visualizing data. I tried to install it from SSC, but no luck. Am I missing something?
    ssc install subsetplot
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Associate Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    www.FinTechProfessor.com
    If you use MS Word, do check my asdoc program that easily sends Stata output to MS Word


    • #3
      Great, useful stuff. Many thanks to Nick and Kit Baum. Just to let you know, -ssc describe s- is not yet showing subsetplot in the list of programs.
      Roman


      • #4
        Thanks for the report. There's a small temporary glitch. The .ado and the .sthlp are there, but not yet a package file. You can copy the files across to a directory or folder of your choice by using ssc copy, or wait for the glitch to be fixed. I'll alert Kit Baum.
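
        A minimal sketch of that workaround, assuming the two files are named subsetplot.ado and subsetplot.sthlp (the names are a guess from the program name, not confirmed here). By default ssc copy puts each file in the current directory; adding the plus option would copy it to the PLUS directory instead.

        ```stata
        ssc copy subsetplot.ado
        ssc copy subsetplot.sthlp
        ```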


        • #5
          Glitch now fixed, thanks to Kit. ssc install subsetplot should work (so long as you have sufficient access to the internet, naturally).


          • #6
            That's working perfectly. Thanks, Nick and Kit. Just a query: in the spaghetti plot above, the orange line is the company-specific line for investment over the years. But what do the grey lines refer to? Is there any way to skip them?
            Roman


            • #7
              The entire rationale of subsetplot is to include the rest of the data as backdrop, in this case as a set of grey lines!!! If you don't want that, just use some appropriate official command, e.g. line with a by() option, as documented.
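
              For example, a sketch of the plain small-multiples version of the Grunfeld graph in #1, with no backdrop; it assumes the same log scale and axis labels are wanted.

              ```stata
              webuse grunfeld, clear
              line invest year, by(company) yscale(log) ylabel(1 10 100 1000)
              ```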


              • #8
                Actually, this is cleverer than I thought. Brilliant!!
                Roman


                • #9
                  Thanks.


                  • #10
                    Thanks Nick,

                    this is a very nice new graph!

                    Unfortunately, I have run into some problems with value labels of the by() variable. (Stata 13.1 MP on Win7)


                    1) A left parenthesis within the first 32 characters of a value label, without a matching right parenthesis within those 32 characters, leads to an error.

                    Example:

                    sysuse auto, clear
                    label define foreignlabel3 0 "1 10 20 (manufacturer)" 1 "foreign"
                    label values foreign foreignlabel3
                    subsetplot scatter price mpg,by(foreign)

                    parentheses do not balance
                    r(198);


                    2) A comma in a value label also seems to be misinterpreted.

                    sysuse auto, clear
                    label define foreignlabel3 0 "Detroit, Michigan" 1 "foreign"
                    label values foreign foreignlabel3
                    subsetplot scatter price mpg,by(foreign)

                    option Michigan not allowed
                    r(198);


                    Best wishes

                    Stefan Gawrich
                    Dillenburg
                    Germany






                    • #11
                      The forum software -itrim-s text (it collapses repeated spaces), so the first example in my last post, as displayed, runs without error.

                      Here's an altered example:

                      sysuse auto, clear
                      label define foreignlabel3 0 "1________10________20__(manufacturer)" 1 "foreign", replace
                      label values foreign foreignlabel3
                      subsetplot scatter price mpg,by(foreign)

                      parentheses do not balance
                      r(198);



                      Best wishes

                      Stefan Gawrich
                      Dillenburg
                      Germany


                      • #12
                        Stefan: Thanks for your interest. I can reproduce problem 2 but not (yet?) problem 1. You have unearthed a small bug. I will flag when a fixed version is posted on SSC.


                        • #13
                          I guess both are caused by line 84 of subsetplot.ado, which calls the subtitle() option as

                          Code:
                          ... subtitle(`which')
                          This should be an easy fix, and I would suggest

                          Code:
                          ... subtitle(`"`macval(which)'"')
                          because macval() is a trick that also deals with single (unmatched) left quotes in labels, something that cannot be achieved with compound quotes alone.
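
                          The comma problem from #10 can be reproduced outside subsetplot. A small sketch, assuming a local macro named which that holds the value label, as in the line quoted above; the scatter command is just an illustration and is not taken from subsetplot.ado.

                          ```stata
                          sysuse auto, clear
                          local which "Detroit, Michigan"
                          * unquoted: the text after the comma is read as an option of subtitle()
                          capture noisily scatter price mpg, subtitle(`which')
                          * compound double quotes pass the whole label as a single string
                          scatter price mpg, subtitle(`"`macval(which)'"')
                          ```

                          The first graph command should fail with the kind of error reported in #10; the second should draw the graph with the full label as its subtitle.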


                          By the way, very nice program, Nick. I am always happy to read your code for graphics commands and to learn from the ideas and techniques behind it.

                          Best
                          Daniel


                          • #14
                            Thanks Daniel,

                            it works.
                            I should have looked into the code before posting.

                            Thanks again, Nick. I especially like -subsetplot- with line graphs. Very nice.


                            Stefan Gawrich
                            Dillenburg
                            Germany




                            • #15
                              Daniel: I agree with your diagnosis that the subtitle() option needs a fix. I am not necessarily going to fix it in exactly the same way!
