  • Multiple (two) CDFs in one graph

    Hi,

    I am trying to plot two (or more) CDFs in a single graph. I prefer stairstep connected lines, and I have tried the following options, but only one of the CDFs comes out properly connected.
    The levels on the y-axis are also not correct.

    1- The lines connect properly when I draw each graph on its own, but not when I try to combine them in a single graph

    cumul x if y==1, gen(cumy1)
    sort cumy1
    line cumy1 x, sort c(J) m(J)
    cumul x if y==0, gen(cumy0)
    sort cumy0
    line cumy0 x, c(J) m(J)
    twoway (line cumy1 x, sort c(J) m(J)) (line cumy0 y, c(J) m(J))

    2- Similar unsuccessful attempt when I try to use "line || line" etc.

    cumul x if y==1, gen(cumy1)
    sort cumy1
    line cumy1 x, sort c(J) m(J)
    cumul x if y==0, gen(cumy0)
    sort cumy0
    line cumy0 x, c(J) m(J)
    line cumy1 x, sort c(J) m(J) || line cumy0 y, c(J) m(J)

    3- Using the "stack" command

    cumul x if y==1, gen(cumy1)
    cumul x if y==0, gen(cumy0)
    stack cumy1 x if y==1 cumy0 x if y==0, into(x y) wide clear
    line cumy1 cumy0 x, c(J J) m(J J) sort ylab(, grid) ytitle("") xlab(, grid)

    I have also tried sorting x and y first and then generating the cumulative function, with a similar result.

    I would also like to add features such as lcolor(black) lpattern(dash) graphregion(color(white)) bgcolor(white) aspect(1), and titles for the y-axis and x-axis, but I am posting only simple versions of my code above, as I first need to get the CDFs plotted properly. However, if I am not mistaken, I double the options if "stack" is used, i.e. lcolor(black black) lpattern(dash solid), while I specify them separately for each plot in the case of a "twoway line" or "line || line" graph, right?

    Any help on how I can get such a unified graph would be appreciated.

    Thank you,

    Sotia

  • #2
    I could explain in detail what is wrong here but I imagine that you will be happier knowing that community-contributed commands have existed to do this for at least 20 years. I will write about the one I know best. distplot is downloadable from the Stata Journal website.

    Typing

    Code:
    search distplot, historical

    will yield in an up-to-date Stata

    Code:
    SJ-19-1 gr41_5  . . . . . . . . . . . . . . . . . Software update for distplot
            (help distplot if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q1/19   SJ 19(1):260
            changes include better handling of the by() option calls;
            simpler default y-axis titles; more detailed discussion of
            exactly what is plotted; and more information on ridits
    
    SJ-10-1 gr41_4  . . . . . . . . . . . . . . . . . Software update for distplot
            (help distplot if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q1/10   SJ 10(1):164
            new reverse(ge) option specifies plotting probabilities or
            frequencies greater than or equal to any data value
    
    SJ-5-3  gr0018  . . . . . . . . . .  Speaking Stata: The protean quantile plot
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q3/05   SJ 5(3):442--460           (see gr41_3 and gr42_3 for commands)
            discusses quantile and distribution plots as used in
            the analysis of species abundance data in ecology
    
    SJ-5-3  gr41_3  . . . . . . . . . . . . . . . . . Software update for distplot
            (help distplot if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q3/05   SJ 5(3):471
            simplified syntax; both by() and over() are now allowed
    
    SJ-4-2  gr0004  .  Speaking Stata: Graphing categorical and compositional data
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q2/04   SJ 4(2):190--215                                 (no commands)
            discusses graphical possibilities for categorical and
            compositional data
    
    SJ-4-1  gr0003  . . . . . . . . . . . . Speaking Stata: Graphing distributions
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q1/04   SJ 4(1):66--88                                   (no commands)
            a review of official and user-written commands for
            graphing univariate distributions; includes tricks
            beyond what is obviously and readily available
    
    SJ-3-4  gr41_2  . . . . . . . . . . . . . . . . . Software update for distplot
            (help distplot if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q4/03   SJ 3(4):449
            option tscale() renamed as trscale()
    
    SJ-3-2  gr41_1  . . . . . . . . . . . . . . . . . Software update for distplot
            (help distplot if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q2/03   SJ 3(2):211
            enhanced to use Stata 8 graphics and provides new options
    
    STB-51  gr41  . . . . . . . . . . . . . . . . . .  Distribution function plots
            (help distplot if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            9/99    pp.12--16; STB Reprints Vol 9, pp.108--112
            plots the cumulative distribution function or survival function
            and allows multiple variables

    The articles from 1999 and 2004 may still be of interest -- the key principles here go back many decades -- and are directly accessible at

    https://www.stata.com/products/stb/journals/stb51.pdf

    https://www.stata-journal.com/articl...article=gr0003

    but software (do and help) should be downloaded from the latest source, as I write Stata Journal 19(1) 2019. Note particularly that the syntax has changed over time.


    Here's a token example with some technique.

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . distplot mpg, over(foreign) scheme(s1color) c(J J) lc(red blue) yla(0 "0" 1 "1" 0.2(0.2)0.8, format ("%02.1f") ang(h))

    [Figure: distplot.png, output of the distplot command above]


    Although I've kept tinkering with this, I much prefer quantile plots for most applications, for which qplot (Stata Journal) is one vehicle and stripplot (SSC) is another.
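
    For comparison, here is a minimal sketch of the same kind of plot using official commands only, assuming the auto data as above. The variable names cum0 and cum1 are made up for illustration. The idea is that each cumul result is plotted against the same x variable, restricted to its own group and sorted on that variable; restricting with if avoids the gaps that would otherwise come from the missing values cumul leaves for the other group.

    Code:
    sysuse auto, clear
    * empirical CDF of mpg within each group; equal gives tied values the same cumulative
    cumul mpg if foreign == 0, gen(cum0) equal
    cumul mpg if foreign == 1, gen(cum1) equal
    * one stairstep line per group, each sorted on mpg
    twoway (line cum0 mpg if foreign == 0, sort c(J)) (line cum1 mpg if foreign == 1, sort c(J)), legend(order(1 "Domestic" 2 "Foreign")) ytitle("cumulative probability")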

    Last edited by Nick Cox; 21 Jun 2019, 00:22.


    • #3
      Dear Nick Cox, is it possible to add a theoretical CDF to the distplot in the above example? I also looked into doing the same with Ben Jann's dstat, but it does not allow a by() option together with over() for the CDF. I am using Stata 18.


      Thank you - Mukesh
      Best regards,
      Mukesh


      • #4
        Absolutely. What is likely to be most useful is the scope to use addplot() to call up twoway function.

        Code:
        sysuse auto, clear 
        distplot mpg, c(J)  addplot(function normal((x - 21.3) /  5.786), ra(mpg)) legend(order(1 "data" 2 "normal")) xtitle("`: var label mpg'")

        This gives a plot of the cumulative normal with the same mean and SD as the data.

        That said, a dedicated quantile-quantile plot is immensely more advisable here.
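
        If you would rather not type the mean and SD by hand, a sketch along these lines should pick them up from summarize. The local macro names mean and sd are just illustrative, and the lines need to be run together (say, from a do-file) so that the locals are still defined when distplot is called.

        Code:
        sysuse auto, clear
        * leave the mean and SD of mpg in r(mean) and r(sd)
        quietly summarize mpg
        local mean = r(mean)
        local sd = r(sd)
        * same plot as above, with the normal CDF built from the stored results
        distplot mpg, c(J) addplot(function normal((x - `mean') / `sd'), ra(mpg)) legend(order(1 "data" 2 "normal")) xtitle("`: var label mpg'")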


        • #5
          Thank you Nick Cox for your response!

          My issue is adding a normal curve when using the by() option:

          Code:
          distplot y, by(x1)
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input float y byte x float x1
          -1.33 1 1
          -1.41 1 1
           -1.9 1 1
            .67 2 1
            1.4 2 1
           2.09 2 1
          -1.83 2 1
           -.44 1 1
            .19 2 1
           -2.2 2 1
          -1.21 1 1
            .39 2 1
           -.94 1 1
           -.78 1 1
           -1.8 2 1
           1.69 1 1
            .24 2 1
           -.86 1 1
            .33 1 1
           -.57 1 1
          -1.63 2 1
           2.61 2 1
            1.2 2 1
            .08 2 1
            .11 2 1
           -.22 2 1
          -3.69 1 1
          -1.43 1 1
            1.4 1 1
            .96 2 1
            .68 1 1
           -.17 2 1
            -.7 1 1
          -1.06 2 1
            .84 1 1
          -1.93 2 1
            .81 1 1
           1.91 1 1
          -1.13 1 1
            .08 2 2
          -1.31 1 2
           -.15 2 2
            .21 2 2
          -2.45 1 2
          -1.08 2 2
           5.88 2 2
            .11 2 2
           1.34 2 2
           -.02 1 2
          -2.16 1 2
            .19 1 2
           1.98 2 2
           2.58 1 2
           2.13 1 2
          -2.08 2 2
              0 1 2
            4.1 1 2
          -2.86 2 2
          -2.49 2 2
          -2.07 1 2
           -2.4 2 2
          -1.86 1 2
          -1.95 1 2
            .95 1 2
           -.51 2 2
          -2.13 2 2
          -1.86 1 2
           -.41 1 2
           -4.2 1 2
          -1.99 1 2
            .14 2 1
           -.98 1 1
             -2 2 1
           -.99 2 1
           5.49 1 1
            -.8 1 1
           1.29 1 1
           -.65 2 1
          -1.42 1 1
          -1.06 2 1
          -1.67 2 1
            .96 1 1
          -1.33 2 1
           -2.4 1 1
            .04 2 1
           -.97 2 1
           -.49 2 1
           1.05 1 1
          -1.64 2 1
            1.6 2 1
            .13 2 1
           -1.2 2 1
          -2.51 2 1
            .27 1 1
           -.83 2 1
          -1.26 2 1
          -1.09 1 1
            .32 2 1
          -3.37 2 1
          -1.52 2 1
          end
          Best regards,
          Mukesh


          • #6
            So what does that mean in detail: a normal with the same mean and SD in each group, or with different means and SDs? Or something else?


            • #7
              Nothing stops you adding two curves to a distplot, but as before I'd recommend quantile-normal plots. Here I use qplot from the Stata Journal, with a scale on which normal distributions would plot as straight lines, and an added display of skewness and kurtosis.

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input float y byte x float x1
              -1.33 1 1
              -1.41 1 1
               -1.9 1 1
                .67 2 1
                1.4 2 1
               2.09 2 1
              -1.83 2 1
               -.44 1 1
                .19 2 1
               -2.2 2 1
              -1.21 1 1
                .39 2 1
               -.94 1 1
               -.78 1 1
               -1.8 2 1
               1.69 1 1
                .24 2 1
               -.86 1 1
                .33 1 1
               -.57 1 1
              -1.63 2 1
               2.61 2 1
                1.2 2 1
                .08 2 1
                .11 2 1
               -.22 2 1
              -3.69 1 1
              -1.43 1 1
                1.4 1 1
                .96 2 1
                .68 1 1
               -.17 2 1
                -.7 1 1
              -1.06 2 1
                .84 1 1
              -1.93 2 1
                .81 1 1
               1.91 1 1
              -1.13 1 1
                .08 2 2
              -1.31 1 2
               -.15 2 2
                .21 2 2
              -2.45 1 2
              -1.08 2 2
               5.88 2 2
                .11 2 2
               1.34 2 2
               -.02 1 2
              -2.16 1 2
                .19 1 2
               1.98 2 2
               2.58 1 2
               2.13 1 2
              -2.08 2 2
                  0 1 2
                4.1 1 2
              -2.86 2 2
              -2.49 2 2
              -2.07 1 2
               -2.4 2 2
              -1.86 1 2
              -1.95 1 2
                .95 1 2
               -.51 2 2
              -2.13 2 2
              -1.86 1 2
               -.41 1 2
               -4.2 1 2
              -1.99 1 2
                .14 2 1
               -.98 1 1
                 -2 2 1
               -.99 2 1
               5.49 1 1
                -.8 1 1
               1.29 1 1
               -.65 2 1
              -1.42 1 1
              -1.06 2 1
              -1.67 2 1
                .96 1 1
              -1.33 2 1
               -2.4 1 1
                .04 2 1
               -.97 2 1
               -.49 2 1
               1.05 1 1
              -1.64 2 1
                1.6 2 1
                .13 2 1
               -1.2 2 1
              -2.51 2 1
                .27 1 1
               -.83 2 1
              -1.26 2 1
              -1.09 1 1
                .32 2 1
              -3.37 2 1
              -1.52 2 1
              end
              
              egen skew = skew(y), by(x1)
              egen kurt = kurt(y), by(x1)
              
              gen toshow = "skew = " + strofreal(skew, "%3.2f") + ", kurt = " + strofreal(kurt, "%3.2f") 
              
              gen wherey = 6
              gen wherex = 0.5
              qplot y, by(x1, legend(off)) trscale(invnormal(@)) yla(-5/5) addplot(scatter wherey wherex, ms(none) mlabel(toshow) mlabpos(0) mlabsize(medium) )

              [Figure: qplot_skewkurt.png, output of the qplot command above]
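
              On the first point, here is a sketch of how two normal curves, one per group, might be added to a distplot with over(). It assumes the example data above are in memory; the local macro names m1, s1, m2 and s2 are illustrative, and the lines need to be run together from a do-file so that the locals are defined when distplot is called.

              Code:
              * mean and SD of y within each group of x1
              forvalues g = 1/2 {
                  quietly summarize y if x1 == `g'
                  local m`g' = r(mean)
                  local s`g' = r(sd)
              }
              * stairstep CDFs by group plus a normal CDF matched to each group
              distplot y, over(x1) c(J J) addplot(function normal((x - `m1') / `s1'), ra(y) || function normal((x - `m2') / `s2'), ra(y))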
