  • Scatterplot

    I would like to plot the mean of a variable with an interval of about 2 standard errors either side, and sort the categories by the mean (lowest value to largest value). So the y axis would show the labels Domestic and Foreign, with Foreign higher on the axis than Domestic. In this case it is easy to switch the numerical values, but in my data I have 102 categories.

    My only workaround is to create a new categorical variable from scratch, sorted on the y variable, but I cannot get the labels from the old categorical variable to carry over. I was hoping there was an option, but sort() does not seem to work for this.

    sysuse auto
    collapse (mean) y = price (semean) se_y = price, by(foreign)
    gen yu = y + 1.96*se_y
    gen yl = y - 1.96*se_y

    twoway (scatter foreign y) (rcap yu yl foreign, horizontal)

  • #2
    1.96 is a crude approximation to the multiplier you need, which depends on sample size and may be larger.
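
    As a rough sketch of what that means for the code in #1: the multiplier can be taken from the t distribution with n - 1 degrees of freedom in each group. The variable names n and t are just illustrative choices here; invttail() is the official Stata function for the upper-tail quantile of Student's t.

    [CODE]
    sysuse auto, clear
    * also keep the number of nonmissing prices in each group
    collapse (mean) y = price (semean) se_y = price (count) n = price, by(foreign)
    * t multiplier for a 95% interval with n - 1 degrees of freedom
    gen t = invttail(n - 1, 0.025)
    gen yu = y + t*se_y
    gen yl = y - t*se_y
    list
    [/CODE]

    For small groups t is noticeably larger than 1.96, so the intervals widen accordingly.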

    It is easy to get Stata to draw a confidence interval directly, but as you say, sorting the categories by the mean while keeping their labels is trickier. This solution uses labmask, which you can install after

    Code:
     search labmask, sj
    
    Search of official help files, FAQs, Examples, and Stata Journals
    
    SJ-8-2  gr0034  . . . . . . . . . .  Speaking Stata: Between tables and graphs
            (help labmask, seqvar if installed) . . . . . . . . . . . .  N. J. Cox
            Q2/08   SJ 8(2):269--289
            outlines techniques for producing table-like graphs
    (Click on gr0034, which will be in blue.)


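    [CODE]
    webuse nlswork, clear
    * one observation per grade, holding the mean and confidence limits (mean, lb, ub)
    statsby , by(grade) : ci mean ln_wage
    * order the grades by their means
    sort mean
    list
    gen x = _n
    * show the values of grade as the value labels of x
    labmask x, values(grade)
    twoway scatter mean x || rcap ub lb x, xla(1/18, valuelabel) ytitle(ln_wage mean) xtitle(grade) legend(off) subtitle(95% confidence intervals for mean)
    [/CODE]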


    See also https://www.stata-journal.com/sjpdf.html?articlenum=gr0045

    That said, it's hard to see how you can plot 102 categories without messed up axis labels.



    • #3
      Thanks, and I am thinking through the category problem. The one thing that is not working is that the label is the encoded numerical value and not the value label. That was probably unclear but is this possible? Otherwise, this is exactly what I needed.

      • #4
        labmask has an option for this circumstance: decode, I think.
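
        A minimal sketch with the auto data from #1, assuming labmask is installed and a Stata version in which the syntax is ci means (older versions use plain ci). The variable name order is just an illustrative choice. With decode, labmask uses the value labels of foreign (Domestic, Foreign) rather than its numeric codes.

        [CODE]
        sysuse auto, clear
        * one observation per category with mean, lb and ub of price
        statsby, by(foreign) clear: ci means price
        * rank the categories by their means
        sort mean
        gen order = _n
        * label each rank with the value label of foreign, not its numeric code
        labmask order, values(foreign) decode
        twoway scatter order mean || rcap lb ub order, horizontal yla(1/2, valuelabel angle(horizontal)) ytitle("") xtitle(Mean price) legend(off)
        [/CODE]

        With 102 categories the yla() range would become 1/102, and the label size would probably need reducing.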
