
  • Twoway spline with multiple variables

    Hi All,

    I am currently using a twoway connected graph, with multiple variables as follows:


    Code:
    twoway connected x1 x2 x3 year
    However, this graph is all over the place, as the values of x1, x2 and x3 change a lot. As such, I wish to smooth it. The twoway mspline command is what I am looking for, but it does not allow me to graph multiple variables. Is there any option that will allow me to do this?

    Thanks,
    CS

  • #2
    Perhaps you are looking for something like this:
    Code:
    sysuse auto
    twoway scatter mpg weight, msize(*.5) || mspline mpg weight || scatter price weight, yaxis(2) || mspline price weight, yaxis(2)
    HTH
    Fernando
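
    For the original question (several series plotted against year), the same idea would be one mspline per variable, overlaid with ||. A minimal sketch, assuming the uslifeexp dataset shipped with Stata as a stand-in for x1, x2, x3 and year; the bands(8) setting is purely illustrative and not a recommendation:

    ```stata
    sysuse uslifeexp, clear
    * One mspline per series, all overlaid on the same axes
    twoway mspline le_male year, bands(8) ///
        || mspline le_female year, bands(8) ///
        || , legend(order(1 "Male" 2 "Female")) ytitle("Life expectancy")
    ```

    With your own data, replace le_male and le_female with x1, x2 and x3, adding one mspline per variable.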



    • #3
      Fernando has expanded on one of the examples shown under -help twoway mspline-. Just below that example, there is a cautionary note regarding the setting for bands().

      Code:
      Cautions
      
          The graph shown above illustrates a common problem with this technique:  it tracks
          wiggles that may not be real and can introduce wiggles if too many bands are chosen.  An
          improved version of the graph above would be
      
              . scatter mpg weight, msize(*.5) || mspline mpg weight, bands(8)
      I was curious about the default setting for bands(), and found this:

      Code:
          bands(#) specifies the number of bands for which cross medians should be calculated.
              The default is max{min(b1,b2),b3}, where b1 is round{10*log10(N)}, b2 is
              round{sqrt(N)}, b3 is min(2,N), and N is the number of observations.
      I worked out that bands(9) was the default setting for this example. Then I tried again with bands(8).
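
      For anyone who wants to see the arithmetic behind that default, here is a step-by-step sketch of the formula quoted above, assuming the auto data (N = 74):

      ```stata
      sysuse auto, clear
      display "b1 = " round(10*log10(_N))   // 10*log10(74) = 18.69, rounds to 19
      display "b2 = " round(sqrt(_N))       // sqrt(74) = 8.60, rounds to 9
      display "b3 = " min(2, _N)            // 2
      display "default bands = " max(min(19, 9), 2)   // 9
      ```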

      Code:
      clear *
      sysuse auto
      summarize mpg weight price
      
      twoway scatter mpg weight, msize(*.5) ///
       || mspline mpg weight ///
       || scatter price weight, yaxis(2) ///
       || mspline price weight, yaxis(2)
      graph rename g1
      
      display "Bands = " max(min(round(10*log10(_N)),round(sqrt(_N))),min(2,_N))
      
      // Change default for bands() from 9 to 8
      twoway scatter mpg weight, msize(*.5) ///
       || mspline mpg weight, bands(8) ///
       || scatter price weight, yaxis(2) ///
       || mspline price weight, yaxis(2) bands(8)
      graph rename g2
      HTH.
      --
      Bruce Weaver
      Email: [email protected]
      Version: Stata/MP 19.5 (Windows)



      • #4
        mspline is great for showing as a smooth curve what you already know to be a smooth curve.

        As a smoothing method it is idiosyncratic and thus needs to be explained unless reviewers or audience are happy with "I don't know" or "I did it in Stata" as a story. The big picture is slicing into bands, finding (median of y, median of x) in each band and joining those summary points. The result can be quirky and sensitive to band number and boundaries, as I think Bruce Weaver is hinting.
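
        The band-and-median idea can be reproduced by hand, which may help in explaining it. This is only a rough sketch: the thread does not say how mspline places its band boundaries, so the equal-count bands from xtile below are my assumption for illustration, and the hand-made points need not match the knots mspline uses. The variable names band, med_x, med_y and first are made up for this example.

        ```stata
        sysuse auto, clear
        * Slice weight into 8 bands (equal-count bands: an assumption, see above)
        xtile band = weight, nquantiles(8)
        * Cross medians: median of x and median of y within each band
        egen med_x = median(weight), by(band)
        egen med_y = median(mpg), by(band)
        * Flag one observation per band so each summary point is drawn once
        egen first = tag(band)
        twoway scatter mpg weight, msize(*.5) ///
            || connected med_y med_x if first, sort ///
            || mspline mpg weight, bands(8)
        ```

        Changing nquantiles() and bands() together shows how much the curve depends on the number of bands.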

        For regularly spaced time series, I would usually start with tssmooth nl, which allows multiple repetitions of Hanning and thus any binomial filter (binomial filters are eminently linear).
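
        A minimal sketch of that approach for several series, again assuming the uslifeexp data as a stand-in for the poster's variables, with HH (Hanning applied twice) chosen purely for illustration. The _sm suffix is a made-up naming convention.

        ```stata
        sysuse uslifeexp, clear
        tsset year
        * H is Hanning (weights 1/4, 1/2, 1/4). Away from the endpoints,
        * applying it twice (HH) gives the binomial filter (1 4 6 4 1)/16.
        foreach v of varlist le_male le_female {
            tssmooth nl `v'_sm = `v', smoother(HH)
        }
        twoway line le_male_sm le_female_sm year
        ```

        Because the smoothed series are ordinary variables, they can all go into a single twoway line call.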

        For scatter plot smoothing, I strongly recommend lpoly as highly flexible (pun intended) and as easy to explain (in the sense that you can throw literature references back at any questioners).
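
        As with mspline, twoway lpoly takes one y variable at a time, so several series mean several overlaid calls. A sketch on the uslifeexp data, assuming a local linear fit (degree(1)) and the default kernel and bandwidth, none of which is a recommendation for any particular dataset:

        ```stata
        sysuse uslifeexp, clear
        twoway lpoly le_male year, degree(1) ///
            || lpoly le_female year, degree(1) ///
            || , legend(order(1 "Male" 2 "Female"))
        ```

        The bwidth() option controls how much smoothing is applied and is worth varying before settling on a graph.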

        But if a series really is "all over the place", no smoother can tame it!

