combine boxplots over more than one variable

Max Piper

Join Date: Dec 2015

Posts: 61
#1

15 Sep 2016, 02:18

Code:

. sysuse nlsw88.dta
(NLSW, 1988 extract)
. graph box tenure wage, over(south)

produces a graph that makes it possible to compare the effect of "south" on job tenure and hourly wage.

I want to achieve something different: a box plot of one variable, but over two different variables, such as

Code:

. sysuse nlsw88.dta
(NLSW, 1988 extract)
. graph box wage, over(south, c_city)

This code, if it didn't give an error, would make it possible to compare not the effect of one categorical variable on two different continuous variables, but two categorical variables with respect to the effect they have on one continuous variable.

In my own dataset, I want to do this for the following reason: I have many different variables (psycho_1, psycho_2, psycho_3, psycho_4, ...) that attempt to explain facets of an overlying "construct" (psychopathy), and I want to quantify this construct by combining these variables into a one-dimensional thing (psychopathy score).
I would define the psychopathy score in slightly varying ways, such as

Code:

generate psychopathy_1 = psycho_1 + psycho_2 + psycho_3 + psycho_4
generate psychopathy_2 = 2*psycho_1 + psycho_2 + 3*psycho_3 + psycho_4

and then form categories as in

Code:

xtile psychquant_1 = psychopathy_1, nq(3)

and then visualize robustness of these one-dimensional psychopathy constructs by showing that their effects on some other variable (in the above example, wage) are similar.

The only way I can do that now would be

Code:

. graph box wage, over(psychquant_1)
. graph box wage, over(psychquant_2)

or, using the above example

Code:

. graph box wage, over(south)
. graph box wage, over(c_city)

but the comparison I want to stress would force the reader to compare the first box of the first graph to the first box in the second graph, and the second box from each graph, etc., which is not suitable, is it?

Last edited by Max Piper; 15 Sep 2016, 02:59.
Nick Cox

Join Date: Mar 2014

Posts: 35696
#2

15 Sep 2016, 03:56

I got lost half-way through this, absent an example of, or like, your own data. But I think your other examples help. See

SJ-14-4 gr0062: N. J. Cox. Stata tip 121: Box plots side by side. Q4/14, SJ 14(4):991--996 (no commands). A tip on how to plot data side by side with box plots, and which data to plot.

That requires payment or a subscription until 2017q4, so this technique may help.

Code:

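* Approach 1: draw one graph per categorical variable, then combine them
sysuse nlsw88.dta, clear
graph box wage, over(south) name(g1, replace) l1title(south)
graph box wage, over(c_city) name(g2, replace) l1title(c_city)
graph combine g1 g2, name(G1)

* Approach 2: stack the data so that both categorical variables share one graph
stack wage south wage c_city, into(wage whatever) clear
label define whatever 0 zero 1 one
label val whatever whatever
* _stack is created by stack: 1 = first group of variables, 2 = second group
label define _stack 1 south 2 c_city
label val _stack _stack
graph box wage, over(_stack) by(whatever) name(G2)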
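
Applied to the question in #1, the same stacking idea might look like the sketch below. It is only an illustration under stated assumptions: nlsw88 has no psycho_* variables, so score_1 and score_2 (two weightings of grade and ttl_exp) are hypothetical stand-ins for psychopathy_1 and psychopathy_2, and quant_1, quant_2 and tercile are names made up for the example. It also uses two over() options rather than by(), so that all boxes sit in a single panel.

```stata
sysuse nlsw88.dta, clear

* Hypothetical stand-ins for the two psychopathy scores (assumption, illustration only)
generate score_1 = grade + ttl_exp
generate score_2 = 2*grade + ttl_exp

* Terciles of each score, as with xtile ..., nq(3) in #1
xtile quant_1 = score_1, nq(3)
xtile quant_2 = score_2, nq(3)

* Stack wage twice, once paired with each tercile variable
stack wage quant_1 wage quant_2, into(wage tercile) clear

* _stack records which score definition each stacked observation came from
label define _stack 1 "score_1" 2 "score_2"
label values _stack _stack

* Boxes for the two definitions appear next to each other within each tercile
graph box wage, over(_stack) over(tercile) name(G3, replace)
```

If the boxes for the two definitions look alike within each tercile, that is the kind of robustness the question describes; swapping the order of the over() options would instead group the terciles within each definition.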