Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Add p-values comparing different regressions to coefplot

    Hi all,

    I am conducting heterogeneity analysis and presenting the results using coefplot. I have two sets of regressors and several heterogeneity groups.
    I would like to add the p-values to the plot, showing which coefficient estimates are statistically different. However, with the divided coefplot, the same p-values show up on both sides---but I would like to include different p-values on the right and the left side. I am including the output I am currently getting -- as you can see, the same p-values are added on both the right and left side so that they are currently one on top of the other. I am also including STATA code that can be run below.

    Thank you very much!

    OUTPUT:
    Click image for larger version

Name:	coefplot.png
Views:	1
Size:	78.4 KB
ID:	1719815



    STATA CODE:

    ** Load data
    use "https://www.stata-press.com/data/r18/auto", clear

    ** Create binary variables for heterogeneity
    foreach var of varlist weight length trunk turn mpg displacement{
    sum `var', det
    g high_`var'=`var'>`r(p50)'
    }

    ** Do this for two x variables
    foreach group in mpg displacement{

    ** And for four heterogeneity variables
    foreach het in weight length trunk turn{

    ** And for each level of the heterogeneity variable
    forval h=0/1{

    ** Create x regressor
    cap drop reg_`het'
    g reg_`het'=high_`group'

    ** Run regression
    qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'

    ** Score coefficient estimate
    scalar e_`het'_`h'=_b[reg_`het']
    est store e_`group'`het'_`h'

    }

    ** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
    suest reg0 reg1
    test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0

    ** Store the p-value
    local pval=`r(p)'
    local p`group'`het': display %4.3f `pval'
    }
    }

    * Output the coefficient plot
    coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median, size(vlarge)) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
    (e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
    text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
    || (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
    (e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
    text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
    ||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
    xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xtitle("Coefficient on {it: Y}")
    Last edited by Nina Caroline; 08 Jul 2023, 22:43.

  • #2
    coefplot is from SSC, as you are asked to explain (FAQ Advice #12). Also, read through the FAQ Advice on how to spell Stata. As the command calls twoway, see https://journals.sagepub.com/doi/pdf...6867X211063413 which illustrates how to add variable text to graphs that use a -by()- option. You do this via the -addplot()- option. With the same scale and list of variables, another approach is to use grc1leg2, which I illustrate below.

    Code:
    net install grc1leg2, from(http://digital.cgdev.org/doc/stata/MO/Misc)
    Code:
    ** Load data
    use "https://www.stata-press.com/data/r18/auto", clear
    
    ** Create binary variables for heterogeneity
    foreach var of varlist weight length trunk turn mpg displacement{
    sum `var', det
    g high_`var'=`var'>`r(p50)'
    }
    
    ** Do this for two x variables
    foreach group in mpg displacement{
    
    ** And for four heterogeneity variables
    foreach het in weight length trunk turn{
    
    ** And for each level of the heterogeneity variable
    forval h=0/1{
    
    ** Create x regressor
    cap drop reg_`het'
    g reg_`het'=high_`group'
    
    ** Run regression
    qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'
    
    ** Score coefficient estimate
    scalar e_`het'_`h'=_b[reg_`het']
    est store e_`group'`het'_`h'
    
    }
    
    ** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
    suest reg0 reg1
    test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0
    
    ** Store the p-value
    local pval=`r(p)'
    local p`group'`het': display %4.3f `pval'
    }
    }
    
    * Output the coefficient plot
    set scheme s1mono
    *CREATE GRAPH 1 SUPPRESSING XTITLE
    coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
    (e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
    text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
    ||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
    xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("")
    gr save gr1, replace
    *CREATE GRAPH 2 SUPPRESSING BOTH THE XTITLE AND YLABELS AND ADD A MARGIN GAP
    coefplot (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
    (e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
    text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
    ||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
    xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("") ysc(outergap(*24)) ylab(none)
    gr save gr2, replace 
    *COMBINE SPECIFYING TITLE
    grc1leg2 gr1.gph gr2.gph, xcommon ycommon b1title("Coefficient on {it: Y}", pos(12) ring(0)) 
    Click image for larger version

Name:	Graph.png
Views:	1
Size:	35.6 KB
ID:	1719818

    Last edited by Andrew Musau; 08 Jul 2023, 23:58.

    Comment


    • #3
      Thanks so much!! For long labels (see example below), the left graph becomes much narrower than the right graph. Is there a way to have both graph windows be the same width without a massive white gap between them?

      Code:
      ** Load data
      use "https://www.stata-press.com/data/r18/auto", clear
      
      ** Create binary variables for heterogeneity
      foreach var of varlist weight length trunk turn mpg displacement{
      sum `var', det
      g high_`var'=`var'>`r(p50)'
      }
      
      ** Do this for two x variables
      foreach group in mpg displacement{
      
      ** And for four heterogeneity variables
      foreach het in weight length trunk turn{
      
      ** And for each level of the heterogeneity variable
      forval h=0/1{
      
      ** Create x regressor
      cap drop reg_`het'
      g reg_`het'=high_`group'
      
      ** Run regression
      qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'
      
      ** Score coefficient estimate
      scalar e_`het'_`h'=_b[reg_`het']
      est store e_`group'`het'_`h'
      
      }
      
      ** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
      suest reg0 reg1
      test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0
      
      ** Store the p-value
      local pval=`r(p)'
      local p`group'`het': display %4.3f `pval'
      }
      }
      
      * Output the coefficient plot
      set scheme s1mono
      *CREATE GRAPH 1 SUPPRESSING XTITLE
      coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
      (e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
      text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
      ||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = `" "Rel dictator game" "transfers to women" "' reg_length = `" "Agreement with paternalism" "laws against women" "' reg_turn = `" "Agreement women" "should be protected" "' reg_trunk = `" "Agreement women" "should not work at night" "', labsize(medlarge)) ///
      xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("")
      gr save gr1, replace
      *CREATE GRAPH 2 SUPPRESSING BOTH THE XTITLE AND YLABELS AND ADD A MARGIN GAP
      coefplot (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
      (e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
      text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
      ||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = `" "Rel dictator game" "transfers to women" "' reg_length = `" "Agreement with paternalism" "laws against women" "' reg_turn = `" "Agreement women" "should be protected" "' reg_trunk = `" "Agreement women" "should not work at night" "', labsize(medlarge)) ///
      xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xsc(alt) xtitle("") ysc(outergap(*24)) ylab(none)
      gr save gr2, replace 
      *COMBINE SPECIFYING TITLE
      grc1leg2 gr1.gph gr2.gph, xcommon ycommon b1title("Coefficient on {it: Y}", pos(12) ring(0))
      Click image for larger version

Name:	Graph.png
Views:	1
Size:	101.2 KB
ID:	1719870

      Comment


      • #4
        Unless you have a stacked display where the y-labels are repeated, I do not think there is an elegant way to address this. I would revert to solution #1: adding variable text to graphs that use a -by()- option. This should be more straightforward.

        Comment


        • #5
          Here is a dirty workaround using the graph editor following your code in #1. You have 4 pairs of p-values, so delete 5-8 from the first subgraph and 1-4 from the second.

          Code:
          ** Load data
          use "https://www.stata-press.com/data/r18/auto", clear
          
          ** Create binary variables for heterogeneity
          foreach var of varlist weight length trunk turn mpg displacement{
          sum `var', det
          g high_`var'=`var'>`r(p50)'
          }
          
          ** Do this for two x variables
          foreach group in mpg displacement{
          ** And for four heterogeneity variables
          foreach het in weight length trunk turn{
          
          ** And for each level of the heterogeneity variable
          forval h=0/1{
          
          ** Create x regressor
          cap drop reg_`het'
          g reg_`het'=high_`group'
          
          ** Run regression
          qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'
          
          ** Score coefficient estimate
          scalar e_`het'_`h'=_b[reg_`het']
          est store e_`group'`het'_`h'
          
          }
          
          ** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
          suest reg0 reg1
          test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0
          
          ** Store the p-value
          local pval=`r(p)'
          local p`group'`het': display %4.3f `pval'
          }
          }
          
          * Output the coefficient plot
          set scheme s1mono
          coefplot (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median, size(vlarge)) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
          (e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
          text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
          || (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
          (e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
          text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
          ||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
          xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xtitle("Coefficient on {it: Y}")
          
          forval i=5/8{
              gr_edit .plotregion1.plotregion1[1].textbox`i'.text = {}
          }
          forval i=1/4{
              gr_edit .plotregion1.plotregion1[2].textbox`i'.text = {}
          }
          Click image for larger version

Name:	coefplot.png
Views:	1
Size:	27.9 KB
ID:	1719883




          On a technical note, if p<0.001, write it as such and not "p=0.000". See how I do this using the -cond()- function in #2: https://www.statalist.org/forums/for...gression-table

          Comment


          • #6
            Amazing -- thanks so much. This is perfect!!

            Comment


            • #7
              Sorry, a quick follow-up question. How do I add an overall title to this? When I add "byopts(title("Differences in Treatment Effects"))" I get the error message: "option byopts() not allowed"

              Comment


              • #8
                You already have -byopts(xrescale)-, so don't include an additional one.

                Code:
                byopts(xrescale title(Differences in Treatment Effects))

                Comment


                • #9
                  Thanks so much, this is fantastic!! Sorry, last question (hopefully for a few years ). I am now adding a third coefplot to that graph. But the three individual plots are shown in two rows. How do I put them in one row, so with three individual coefplots next to each other as opposed to two in row 1 and one in row 2. Example plot and output below. Thanks!!

                  Code:
                  use "https://www.stata-press.com/data/r18/auto", clear
                  
                  ** Create binary variables for heterogeneity
                  foreach var of varlist weight length trunk turn mpg displacement{
                  sum `var', det
                  g high_`var'=`var'>`r(p50)'
                  }
                  
                  ** And for four heterogeneity variables
                  foreach het in weight length trunk turn{
                  
                  ** Do this for two x variables
                  foreach group in `het' mpg displacement{
                  
                  ** And for each level of the heterogeneity variable
                  forval h=0/1{
                  
                  if "`group'"!="`het'" | `h'==0{
                  
                  ** Create x regressor
                  cap drop reg_`het'
                  g reg_`het'=high_`group'
                  
                  ** Run regression
                  if "`group'"!="`het'"{
                  qui: eststo reg`h': reg price reg_`het' if high_`het'==`h'
                  }
                  if "`group'"=="`het'"{
                  qui: eststo reg`h': reg price reg_`het'
                  }
                  
                  ** Score coefficient estimate
                  scalar e_`het'_`h'=_b[reg_`het']
                  est store e_`group'`het'_`h'
                  }
                  }
                  
                  if "`group'"!="`het'"{
                  ** Test whether coefficient estimates are significantly different from each other in the two heterogeneity levels
                  suest reg0 reg1
                  test [reg0_mean]reg_`het' - [reg1_mean]reg_`het' = 0
                  
                  ** Store the p-value
                  local pval=`r(p)'
                  local p`group'`het': display %4.3f `pval'
                  }
                  }
                  }
                  
                  
                  * Output the coefficient plot
                  set scheme s1mono
                  coefplot (e_weightweight_0 e_lengthlength_0 e_trunktrunk_0 e_turnturn_0, label(>Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)), bylabel({it: Level}) ///
                  || (e_mpgweight_0 e_mpglength_0 e_mpgtrunk_0 e_mpgturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red) text(1 2500 "p=`pmpgweight'")) ///
                  (e_mpgweight_1 e_mpgtrunk_1 e_mpglength_1 e_mpgturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) text(2 2500 "p=`pmpglength'") ///
                  text(3 2500 "p=`pmpgtrunk'") text(4 2500 "p=`pmpgturn'")), bylabel({it: MPG}) ///
                  || (e_displacementweight_0 e_displacementlength_0 e_displacementtrunk_0 e_displacementturn_0, label(<= Median) ciopts(color(red)) mcolor(red) mfc(red) m(diamond) lcolor(red)) ///
                  (e_displacementweight_1 e_displacementlength_1 e_displacementtrunk_1 e_displacementturn_1, label(> Median) ciopts(color(black) lp(dash)) mcolor(black) mfc(black) ///
                  text(1 2500 "p=`pdisplacementweight'") text(2 2500 "p=`pdisplacementlength'") text(3 2500 "p=`pdisplacementtrunk'") text(4 2500 "p=`pdisplacementturn'")), bylabel({it: Displacement}) ///
                  ||, keep(reg_weight reg_length reg_trunk reg_turn) coeflabels(reg_weight = "Weight" reg_length = "Length" reg_turn = "Turn" reg_trunk = "Trunk", labsize(medlarge)) ///
                  xline(0, lcolor(black)) yline(1.5 2.5 3.5 4.5, lcolor(black)) hor graphregion(color(white)) bgcol(white) byopts(xrescale) name(coefplot, replace) xtitle("Coefficient on {it: Y}")
                  
                  forval i=1/8{
                  gr_edit .plotregion1.plotregion1[1].textbox`i'.text = {}
                  }
                  forval i=5/8{
                  gr_edit .plotregion1.plotregion1[2].textbox`i'.text = {}
                  }
                  forval i=1/4{
                  gr_edit .plotregion1.plotregion1[3].textbox`i'.text = {}
                  }
                  Click image for larger version

Name:	coefplot.png
Views:	2
Size:	90.6 KB
ID:	1720482
                  Attached Files

                  Comment


                  • #10
                    Code:
                    byopts(xrescale title(Differences in Treatment Effects) rows(1))

                    Comment


                    • #11
                      Thank you!

                      Comment

                      Working...
                      X