
  • Graphing Ovals/Ellipses?

    I know that one can plot circles in Stata using a weighted scatterplot. Is it possible to plot ovals or ellipses in the same way (i.e., markers that are longer in one dimension than the other)?

    Background: I'm running a difference-in-differences model where I allow the treatment effect to vary by group, with around 10 groups total. I have two different outcomes and run two separate regressions, resulting in two sets of coefficients. I have been making a graph where I plot the group treatment effects on each outcome as a scatterplot (i.e., each point is a group, the Y axis is the treatment effect in that group for outcome 1, and the X axis is the treatment effect in that group for outcome 2). I'd like to try adding confidence intervals in the form of an oval around each point, where the vertical diameter is the 95% confidence interval for one outcome's treatment effect and the horizontal diameter is the 95% confidence interval for the other.

    (For the record, I realize that (a) this may be illegible, and (b) this isn't really statistically accurate, since to plot a true "confidence area" you'd have to know something about the covariance between the coefficients in the two separate regressions. Still, I'd like to know if it's even possible, so please humor me.)

  • #2
    Ryan, I am not sure exactly which ellipses you are trying to graph, since you first refer to a weighted scatterplot and then give an example without weights. You can certainly graph unweighted ellipses together with scatterplots in Stata, for example using my program ellip or polarsm by Nick Cox.




    • #3
      Anders: Thanks for the response. To clarify, I'm seeking a graph that has the basic look of a weighted scatterplot, like the one below from the Stata FAQs, except with ellipses instead of circles, where I would supply (presumably in several variables) the center and the axis lengths of each ellipse. I'll take a look at ellip and polarsm, though from a quick glance at the help file for ellip, it's not clear that it does quite what I'm looking for.



      (Image: the weighted-marker scatterplot from the Stata FAQ.)

      Except with ellipses.



      • #4
        polarsm (Stata Journal) is not what you are looking for.

        What determines the axis lengths and the orientation of each ellipse?

        In essence the way to code this is as a loop over twoway function, or so I guess, but I am not volunteering any code!



        • #5
          I was figuring the axis lengths would be provided as a pair of variable names, and that one axis would be parallel to the plot X-axis, and the other parallel to the Y axis. So you'd provide something like
          Code:
          twoway ellipses ycenter xcenter ylength xlength
          , where ycenter is the y-coordinate for the center of the ellipse, xcenter the x-coordinate, and ylength and xlength would be the axis lengths parallel to the y and x axes, respectively.

          Sounds like this isn't really feasible (or at least would require a lot of work for a fairly specialized graph). Since I'm ultimately trying to plot confidence intervals, a pair of rcap plots does well enough for this purpose (one horizontal, one vertical), though I think it's a little harder to read than ellipses would be.
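
          For reference, the paired-rcap fallback can be sketched as follows. This is only an illustration: the variable names (b1, se1, b2, se2) and the three rows of estimates are hypothetical stand-ins for the group coefficients and standard errors, and the limits use a normal approximation.

          ```stata
          * Hypothetical estimates: one row per group, with point estimates
          * and standard errors for outcome 1 (b1, se1) and outcome 2 (b2, se2)
          clear
          input group b1 se1 b2 se2
          1 0.30 0.05 0.10 0.04
          2 0.15 0.08 0.40 0.03
          3 0.50 0.04 0.25 0.07
          end

          * Normal-approximation 95% limits for each outcome
          local z = invnormal(0.975)
          gen lo1 = b1 - `z' * se1
          gen hi1 = b1 + `z' * se1
          gen lo2 = b2 - `z' * se2
          gen hi2 = b2 + `z' * se2

          * Vertical spikes: CI for outcome 1, drawn at x = b2
          * Horizontal spikes: CI for outcome 2, drawn at y = b1
          twoway (rcap lo1 hi1 b2, lcolor(gs8))             ///
                 (rcap lo2 hi2 b1, horizontal lcolor(gs8))  ///
                 (scatter b1 b2, mcolor(black) mlabel(group)), ///
                 legend(off)                                ///
                 ytitle("Treatment effect, outcome 1")      ///
                 xtitle("Treatment effect, outcome 2")
          ```

          With the horizontal option, rcap treats its first two variables as x values and the third as the y position, which is what puts the second interval on its side.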

          Thanks for the responses!
          Last edited by Ryan Sandler; 21 Jan 2020, 09:50. Reason: Forgot to add a thank you :)



          • #6
            So tacitly your ellipse is just parallel to the axes and not tilted.
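
            Under that reading (axes parallel to the plot axes, no tilt), each ellipse can be traced parametrically and drawn with twoway line. What follows is a minimal sketch, not a packaged command: the variable names xcenter, ycenter, xlength and ylength follow #5, the three rows of data are made up, and the choice of 100 segments per ellipse is arbitrary.

            ```stata
            * Hypothetical input: one row per group, ellipse centers and full axis lengths
            clear
            input group xcenter ycenter xlength ylength
            1 0.10 0.30 0.20 0.10
            2 0.40 0.15 0.10 0.30
            3 0.25 0.50 0.30 0.20
            end

            * n segments per ellipse: n + 1 points to close the curve,
            * plus one all-missing point to break the line between groups
            local n 100
            local copies = `n' + 2
            expand `copies'
            bysort group: gen pt = _n

            * Angle runs from 0 to 2*pi; it is left missing for the separator point
            gen double t = 2 * _pi * (pt - 1) / `n' if pt <= `n' + 1

            * Axis-aligned ellipse: semi-axes are half the supplied lengths
            gen double ex = xcenter + (xlength / 2) * cos(t)
            gen double ey = ycenter + (ylength / 2) * sin(t)

            * cmissing(n) stops the line at the missing point, so ellipses stay separate
            sort group pt
            twoway (line ey ex, cmissing(n) lcolor(gs8))            ///
                   (scatter ycenter xcenter if pt == 1, mcolor(black)), ///
                   legend(off) ytitle("Effect on outcome 1")        ///
                   xtitle("Effect on outcome 2")
            ```

            A tilted ellipse would need a rotation applied to ex and ey, which this sketch does not attempt.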

            Loosely similar in spirit: I wrote diplot in 2005, which is on SSC. https://www.stata.com/statalist/arch.../msg00351.html was the announcement, which evoked zero interest, it seems.

            Code:
            ssc install diplot



            • #7
              I don't think I would use ellipses in this situation.

              Imagine you have data on height and weight. We know that those variables are correlated. I took the graphic below from a question on Researchgate. Here, the poster had data on individuals. In addition to doing a scatterplot of individuals' height and weight, it makes sense to calculate, by gender, the mean height and the variance in height, the mean weight and its variance, and the covariance between height and weight.



              You are just estimating average treatment effects. Those are, by their nature, average, so the ATE for each member in a group is going to be the same. There's no variance in ATE. You also alluded to the fact that you don't estimate a covariance between the ATEs. Since the outcome doesn't vary among members in a group, I don't think there really is a covariance.

              I know you have 10 groups, so that's a lot of plotting. Maybe you can consider using something like coefplot (by Ben Jann, available on the Statistical Software Components site; type ssc install coefplot to install).
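
              As a rough illustration of the coefplot route, here is a sketch on the Auto data. The two regressions are only stand-ins for the two outcome models, the stored-estimate names m_price and m_mpg are made up, and coefplot is assumed to be installed already.

              ```stata
              * Assumes coefplot is installed (ssc install coefplot)
              sysuse auto, clear

              * Two outcomes with the same right-hand side,
              * standing in for the two separate regressions
              regress price i.foreign weight
              estimates store m_price
              regress mpg i.foreign weight
              estimates store m_mpg

              * One panel per outcome; xrescale lets each panel use its own x-axis scale
              coefplot m_price, bylabel(Price) || m_mpg, bylabel(Mileage) ||, ///
                  drop(_cons) xline(0) byopts(xrescale)
              ```

              Separate panels sidestep the problem that the two outcomes are on different scales, at the cost of no longer showing both effects for a group as a single point.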
              Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

              When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



              • #8
                Which weights do you want and why? Stata supports four different types of weights. scatter supports weights but ellip does not support weights (although you could first expand the dataset to get the equivalent of frequency weights which might be useful). In ellip, the size of the ellipse is determined by the boundary constant -- not by weights.

                Do you want a graph similar to #6 or #7, or something different? If similar to #7, then here is some sample Stata code using a combination of twoway and ellip on the Auto data. It produces two confidence ellipses (centered on the variable means) and two scatterplots in an overlaid graph.

                Code:
                sysuse auto, clear

                * Split mpg and price by origin so each group gets its own scatter series
                gen mpg0=mpg if foreign==0
                gen price0=price if foreign==0
                gen mpg1=mpg if foreign==1
                gen price1=price if foreign==1

                * Compute the ellipse for each group; gen() stores its y and x coordinates
                qui ellip mpg price if foreign==0, gen(y1 x1)
                qui ellip mpg price if foreign==1, gen(y2 x2)

                * Overlay the two scatterplots and the two ellipse outlines
                twoway (scatter mpg0 price0) (scatter mpg1 price1) ///
                    (line y1 x1) (line y2 x2), ///
                    legend(label(1 "Domestic") label(2 "Foreign") ///
                    label(3 "Domestic") label(4 "Foreign"))  ///
                    ytitle("Mileage (mpg)") xtitle("Price")
                Last edited by Anders Alexandersson; 22 Jan 2020, 14:36. Reason: Fixed typo



                • #9
                  Ryan, as Nick asked in #4, what determines the axis lengths? Alternatively, how do you calculate the correlation matrix?

                  In terms of data, you referred in #3 to this Stata FAQ https://www.stata.com/support/faqs/g...ghted-markers/
                  which has this code:

                  Code:
                  webuse census
                  scatter death medage [w=pop65p], msymbol(circle_hollow)
                  The dataset has 50 observations, and the graph shows one circle per observation, sized by the population weight. Instead, you want one ellipse per weighted observation. But there are insufficient observations to compute the needed correlations:

                  Code:
                  by pop65p, sort: corr death medage
                  In #8, I suggested to first expand the dataset to get the equivalent of frequency weights. That is easy.
                  But there are still insufficient observations to create the correlation matrix:

                  Code:
                  expand pop65p
                  by pop65p, sort: corr death medage
                  Switching from the "data ellipses" above (which are centered around variable means) to "confidence ellipses" (which are centered around regression coefficients) is easy enough, but we still need to be able to compute the ellipse.

