Add p-values comparing different regressions to coefplot

Nina Caroline

Join Date: Jul 2015

Posts: 14
#1

Add p-values comparing different regressions to coefplot

08 Jul 2023, 22:28

Hi all,

I am conducting heterogeneity analysis and presenting the results using coefplot. I have two sets of regressors and several heterogeneity groups.
I would like to add the p-values to the plot, showing which coefficient estimates are statistically different. However, with the divided coefplot, the same p-values show up on both sides---but I would like to include different p-values on the right and the left side. I am including the output I am currently getting -- as you can see, the same p-values are added on both the right and left side so that they are currently one on top of the other. I am also including STATA code that can be run below.

Thank you very much!

OUTPUT:

STATA CODE:

** Load data
use "https://www.stata-press.com/data/r18/auto", clear

** Create binary variables for heterogeneity
foreach var of varlist weight length trunk turn mpg displacement{
sum `var', det
g high_`var'=`var'>`r(p50)'
}

** Do this for two x variables
foreach group in mpg displacement{

** And for four heterogeneity variables
foreach het in weight length trunk turn{

** And for each level of the heterogeneity variable
forval h=0/1{

** Create x regressor
cap drop reg_`het'
g reg_`het'=high_`group'

** Run regression
qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'

** Score coefficient estimate
scalar e_`het'_`h'=_b[reg_`het']
est store e_`group'`het'_`h'

}

** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
suest reg0 reg1
test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0

** Store the p-value
local pval=`r(p)'
local p`group'`het': display %4.3f `pval'
}
}

* Output the coefficient plot
coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median, size(vlarge)) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
(e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
|| (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
(e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xtitle("Coefficient on {it: Y}")

Last edited by Nina Caroline; 08 Jul 2023, 22:43.
Tags: addtext, coefplot, graph, label, regression

Andrew Musau

Join Date: Oct 2014
Posts: 10213

08 Jul 2023, 23:55

coefplot is from SSC, as you are asked to explain (FAQ Advice #12). Also, read through the FAQ Advice on how to spell Stata. As the command calls twoway, see https://journals.sagepub.com/doi/pdf...6867X211063413 which illustrates how to add variable text to graphs that use a -by()- option. You do this via the -addplot()- option. With the same scale and list of variables, another approach is to use grc1leg2, which I illustrate below.

Code:

net install grc1leg2, from(http://digital.cgdev.org/doc/stata/MO/Misc)

Code:

** Load data
use "https://www.stata-press.com/data/r18/auto", clear

** Create binary variables for heterogeneity
foreach var of varlist weight length trunk turn mpg displacement{
sum `var', det
g high_`var'=`var'>`r(p50)'
}

** Do this for two x variables
foreach group in mpg displacement{

** And for four heterogeneity variables
foreach het in weight length trunk turn{

** And for each level of the heterogeneity variable
forval h=0/1{

** Create x regressor
cap drop reg_`het'
g reg_`het'=high_`group'

** Run regression
qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'

** Score coefficient estimate
scalar e_`het'_`h'=_b[reg_`het']
est store e_`group'`het'_`h'

}

** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
suest reg0 reg1
test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0

** Store the p-value
local pval=`r(p)'
local p`group'`het': display %4.3f `pval'
}
}

* Output the coefficient plot
set scheme s1mono
*CREATE GRAPH 1 SUPPRESSING XTITLE
coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
(e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("")
gr save gr1, replace
*CREATE GRAPH 2 SUPPRESSING BOTH THE XTITLE AND YLABELS AND ADD A MARGIN GAP
coefplot (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
(e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("") ysc(outergap(*24)) ylab(none)
gr save gr2, replace 
*COMBINE SPECIFYING TITLE
grc1leg2 gr1.gph gr2.gph, xcommon ycommon b1title("Coefficient on {it: Y}", pos(12) ring(0))

Click image for larger version

Name: Graph.png
Views: 1
Size: 35.6 KB
ID: 1719818

Last edited by Andrew Musau; 08 Jul 2023, 23:58.

Comment

Nina Caroline

Join Date: Jul 2015
Posts: 14

09 Jul 2023, 14:26

Thanks so much!! For long labels (see example below), the left graph becomes much narrower than the right graph. Is there a way to have both graph windows be the same width without a massive white gap between them?

Code:

** Load data
use "https://www.stata-press.com/data/r18/auto", clear

** Create binary variables for heterogeneity
foreach var of varlist weight length trunk turn mpg displacement{
sum `var', det
g high_`var'=`var'>`r(p50)'
}

** Do this for two x variables
foreach group in mpg displacement{

** And for four heterogeneity variables
foreach het in weight length trunk turn{

** And for each level of the heterogeneity variable
forval h=0/1{

** Create x regressor
cap drop reg_`het'
g reg_`het'=high_`group'

** Run regression
qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'

** Score coefficient estimate
scalar e_`het'_`h'=_b[reg_`het']
est store e_`group'`het'_`h'

}

** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
suest reg0 reg1
test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0

** Store the p-value
local pval=`r(p)'
local p`group'`het': display %4.3f `pval'
}
}

* Output the coefficient plot
set scheme s1mono
*CREATE GRAPH 1 SUPPRESSING XTITLE
coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
(e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = `" "Rel dictator game" "transfers to women" "' reg_length = `" "Agreement with paternalism" "laws against women" "' reg_turn = `" "Agreement women" "should be protected" "' reg_trunk = `" "Agreement women" "should not work at night" "', labsize(medlarge)) ///
xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("")
gr save gr1, replace
*CREATE GRAPH 2 SUPPRESSING BOTH THE XTITLE AND YLABELS AND ADD A MARGIN GAP
coefplot (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
(e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = `" "Rel dictator game" "transfers to women" "' reg_length = `" "Agreement with paternalism" "laws against women" "' reg_turn = `" "Agreement women" "should be protected" "' reg_trunk = `" "Agreement women" "should not work at night" "', labsize(medlarge)) ///
xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("") ysc(outergap(*24)) ylab(none)
gr save gr2, replace 
*COMBINE SPECIFYING TITLE
grc1leg2 gr1.gph gr2.gph, xcommon ycommon b1title("Coefficient on {it: Y}", pos(12) ring(0))

Click image for larger version

Name: Graph.png
Views: 1
Size: 101.2 KB
ID: 1719870

Comment

Andrew Musau

Join Date: Oct 2014

Posts: 10213
#4

09 Jul 2023, 17:35

Unless you have a stacked display where the y-labels are repeated, I do not think there is an elegant way to address this. I would revert to solution #1: adding variable text to graphs that use a -by()- option. This should be more straightforward.
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10213

09 Jul 2023, 18:22

Here is a dirty workaround using the graph editor following your code in #1. You have 4 pairs of p-values, so delete 5-8 from the first subgraph and 1-4 from the second.

Code:

** Load data
use "https://www.stata-press.com/data/r18/auto", clear

** Create binary variables for heterogeneity
foreach var of varlist weight length trunk turn mpg displacement{
sum `var', det
g high_`var'=`var'>`r(p50)'
}

** Do this for two x variables
foreach group in mpg displacement{
** And for four heterogeneity variables
foreach het in weight length trunk turn{

** And for each level of the heterogeneity variable
forval h=0/1{

** Create x regressor
cap drop reg_`het'
g reg_`het'=high_`group'

** Run regression
qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'

** Score coefficient estimate
scalar e_`het'_`h'=_b[reg_`het']
est store e_`group'`het'_`h'

}

** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
suest reg0 reg1
test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0

** Store the p-value
local pval=`r(p)'
local p`group'`het': display %4.3f `pval'
}
}

* Output the coefficient plot
set scheme s1mono
coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median, size(vlarge)) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
(e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
|| (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
(e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xtitle("Coefficient on {it: Y}")

forval i=5/8{
    gr_edit .plotregion1.plotregion1[1].textbox`i'.text = {}
}
forval i=1/4{
    gr_edit .plotregion1.plotregion1[2].textbox`i'.text = {}
}

Click image for larger version

Name: coefplot.png
Views: 1
Size: 27.9 KB
ID: 1719883

On a technical note, if p<0.001, write it as such and not "p=0.000". See how I do this using the -cond()- function in #2: https://www.statalist.org/forums/for...gression-table

Comment

Nina Caroline

Join Date: Jul 2015

Posts: 14
#6

13 Jul 2023, 07:17

Amazing -- thanks so much. This is perfect!!
Comment
Nina Caroline

Join Date: Jul 2015

Posts: 14
#7

14 Jul 2023, 02:43

Sorry, a quick follow-up question. How do I add an overall title to this? When I add "byopts(title("Differences in Treatment Effects"))" I get the error message: "option byopts() not allowed"
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10213
#8

14 Jul 2023, 03:02

You already have -byopts(xrescale)-, so don't include an additional one.

Code:

byopts(xrescale title(Differences in Treatment Effects))
Comment

Nina Caroline

Join Date: Jul 2015
Posts: 14

14 Jul 2023, 03:46

Thanks so much, this is fantastic!! Sorry, last question (hopefully for a few years

). I am now adding a third coefplot to that graph. But the three individual plots are shown in two rows. How do I put them in one row, so with three individual coefplots next to each other as opposed to two in row 1 and one in row 2. Example plot and output below. Thanks!!

Code:

use "https://www.stata-press.com/data/r18/auto", clear

** Create binary variables for heterogeneity
foreach var of varlist weight length trunk turn mpg displacement{
sum `var', det
g high_`var'=`var'>`r(p50)'
}

** And for four heterogeneity variables
foreach het in weight length trunk turn{

** Do this for two x variables
foreach group in `het' mpg displacement{

** And for each level of the heterogeneity variable
forval h=0/1{

if "`group'"!="`het'" | `h'==0{

** Create x regressor
cap drop reg_`het'
g reg_`het'=high_`group'

** Run regression
if "`group'"!="`het'"{
qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'
}
if "`group'"=="`het'"{
qui: eststo reg`h': reg price reg_`het'
}

** Score coefficient estimate
scalar e_`het'_`h'=_b[reg_`het']
est store e_`group'`het'_`h'
}
}

if "`group'"!="`het'"{
** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
suest reg0 reg1
test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0

** Store the p-value
local pval=`r(p)'
local p`group'`het': display %4.3f `pval'
}
}
}


* Output the coefficient plot
set scheme s1mono
coefplot (e_weightweight_0 e_lengthlength_0 e_trunktrunk_0 e_turnturn_0, label(>Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)), bylabel({it: Level}) ///
|| (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
(e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
|| (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
(e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xtitle("Coefficient on {it: Y}")

forval i=1/8{
gr_edit .plotregion1.plotregion1[1].textbox`i'.text = {}
}
forval i=5/8{
gr_edit .plotregion1.plotregion1[2].textbox`i'.text = {}
}
forval i=1/4{
gr_edit .plotregion1.plotregion1[3].textbox`i'.text = {}
}

Click image for larger version

Name: coefplot.png
Views: 2
Size: 90.6 KB
ID: 1720482

Attached Files

Comment

Andrew Musau

Join Date: Oct 2014

Posts: 10213
#10

14 Jul 2023, 03:54

Code:

byopts(xrescale title(Differences in Treatment Effects) rows(1))
Comment
Nina Caroline

Join Date: Jul 2015

Posts: 14
#11

14 Jul 2023, 04:44

Thank you!
Comment

Announcement