  • Adding xline or yline with "by" option

    I have a twoway plot that I draw for several years. I am trying to draw an xline at the median value for each year. How can I do that?

    As an example

    Code:
    twoway scatter income yrs_education, by(year)
    I want each of the 5 years to have an xline at the median income level for THAT year. So this is an xline at a different value for each year, one xline per graph.

    I know I could save the median value in a local and call that but that seems to draw multiple lines in every graph.
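
    A minimal sketch of that behaviour, assuming the auto dataset shipped with Stata stands in for the real data (the local names m0 and m1 are only illustrative): every value passed to xline() is drawn in every by() panel, so both medians appear in both graphs.

    ```stata
    sysuse auto, clear
    * median weight within each category, stored in locals
    quietly summarize weight if foreign == 0, detail
    local m0 = r(p50)
    quietly summarize weight if foreign == 1, detail
    local m1 = r(p50)
    * xline() is not panel-specific: both lines show up in both panels
    scatter mpg weight, by(foreign) xline(`m0' `m1')
    ```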

    I appreciate the help!


  • #2
    That is a nice question. I would not try xline() to do this, but that is presumably what you tried without success.

    You didn't give a data example, so here is a different one.

    Code:
    sysuse auto, clear
    * median of the x variable within each panel
    egen median = median(weight), by(foreign)
    * maximum of the y variable, used as the height of each line
    su mpg, meanonly
    gen max = r(max)
    scatter mpg weight, by(foreign, legend(off) note("medians shown by vertical lines")) ///
    || bar max median , xtitle("`: var label weight'")
    [Image: xmedian.png]

    Clearly the maximum is empirical, but the code shows one way to get it.

    Note that by default the bar width is 1 which for this example works well.

    For years of education, you would probably need to reach in and tune the bar width to be smaller.

    EDIT: In fact, in your example income is plotted vertically, so you really need horizontal lines, which would be yline() except that doesn't help. Here's example code:

    Code:
    sysuse auto, clear
    * median of the y variable within each panel
    egen median = median(mpg), by(foreign)
    * maximum of the x variable, used as the length of each line
    su weight, meanonly
    gen max = r(max)
    scatter mpg weight, by(foreign, legend(off) note("medians shown by horizontal lines")) ///
    || bar max median , barw(0.1) horizontal ytitle("`: var label mpg'")
    Last edited by Nick Cox; 22 Sep 2016, 12:34.


    • #3
      I realised belatedly that twoway spike is a better solution than twoway bar for the extra lines. For example,

      Code:
      sysuse auto, clear
      egen median = median(mpg), by(foreign)
      su weight, meanonly 
      gen max = r(max) 
      scatter mpg weight, by(foreign, legend(off) note("medians shown by horizontal lines")) ///
      || spike max median, horizontal ytitle("`: var label mpg'")
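
      Adapted to the original question, a sketch might look like the following. It assumes the variables are named income, yrs_education and year as in post #1; med_income and max_educ are hypothetical names introduced here, and the axis title is typed out in case income carries no variable label.

      ```stata
      * sketch only: assumes the questioner's dataset is already in memory
      * median of income (the y variable) within each year
      egen med_income = median(income), by(year)
      * maximum of the x variable, so each spike spans the plot width
      su yrs_education, meanonly
      gen max_educ = r(max)
      scatter income yrs_education, by(year, legend(off) note("medians shown by horizontal lines")) ///
      || spike max_educ med_income, horizontal ytitle("Income")
      ```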


      • #4
        Thank you so much! This worked perfectly.

        A small follow-up query. I have the graph by year and I also have a final combined graph for all years at the end. Is there some way to have the "total" graph show either no median line or a single line at the overall median?


        • #5
          Yes. Temporarily double up the data, and make the copy of the entire dataset a new category, say "Total".

          Code:
          sysuse auto, clear
          preserve
          * remember where the appended copy will start
          local Np1 = _N + 1
          expand 2
          * recode the copy as a new category and label it Total
          replace foreign = 2 in `Np1'/L
          label def origin 2 Total, modify
          * the median for Total is now the overall median
          egen median = median(mpg), by(foreign)
          su weight, meanonly
          gen max = r(max)
          scatter mpg weight, by(foreign, legend(off) note("medians shown by horizontal lines")) ///
          || spike max median, horizontal ytitle("`: var label mpg'")
          restore
          This is documented at http://www.stata-journal.com/article...article=gr0058
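
          For the other case raised in #4, no median line in the "Total" panel, one possible variation (a sketch, not something taken from the article above) is to set the median to missing for the copied observations, so that no spike is drawn for them:

          ```stata
          sysuse auto, clear
          preserve
          local Np1 = _N + 1
          expand 2
          replace foreign = 2 in `Np1'/L
          label def origin 2 Total, modify
          egen median = median(mpg), by(foreign)
          * blank out the median for the Total copy: missing values are not plotted
          replace median = . if foreign == 2
          su weight, meanonly
          gen max = r(max)
          scatter mpg weight, by(foreign, legend(off) note("medians shown by horizontal lines")) ///
          || spike max median, horizontal ytitle("`: var label mpg'")
          restore
          ```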
