  • Interaction Plot

    Hello everyone,

    I am having a problem with an interaction plot.

    I ran a regression as follows:

    xtreg Tobins_Q m_a cash c.cash#c.m_a $controls,fe
    • Tobins_Q is the dependent variable
    • m_a is the moderator
    • cash is the independent variable
    • $controls holds all control variables
    I now want to show this moderation effect in a graph. I have already tried -twoway- and -margins- followed by -marginsplot-, but I can't find the right way.

    My graph should look like this:

    y-axis = Tobins_Q
    x-axis = cash, divided into high and low

    In this graph, the variable m_a should also be shown with a high/low distinction.

    Can anyone help me with this?

    Thanks in advance!

    Best,

    Michael Rumpf

  • #2
    I don't exactly understand what you mean by the distinctions. But if you want to show different graphs for subsets of your cash variable (a 'high' and a 'low' graph), why not build a new variable that holds only the high values, with everything else missing, and a 'low' variable that holds only the low values, with everything else missing, and then plot these one by one with -twoway-, with Tobins Q on the y-axis? Stata should drop the missing values automatically. I'm sure there is some faceting command somewhere, but I'm not a Stata expert.
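
    A minimal sketch of this idea, using the auto data shipped with Stata in place of the poster's data. The median split and the names headroom_low and headroom_high are assumptions made only for illustration:
    Code:
    sysuse auto, clear
    
    // median of headroom as a hypothetical cutoff between "low" and "high"
    quietly summarize headroom, detail
    local cut = r(p50)
    
    // each new variable keeps its own range and is missing elsewhere
    generate headroom_low  = headroom if headroom <= `cut'
    generate headroom_high = headroom if headroom >  `cut' & !missing(headroom)
    
    // -twoway- ignores the missing values in each overlaid plot
    twoway (scatter length headroom_low) (scatter length headroom_high), ///
        ytitle("Length (in.)") xtitle("Headroom (in.)")                  ///
        legend(order(1 "Low" 2 "High"))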



    • #3
      Maybe you could try something along these lines:
      Code:
      sysuse auto, clear
      regress length c.headroom##c.trunk
      
      centile trunk, centile(25 50 75)
      local t25 = r(c_1)
      local t50 = r(c_2)
      local t75 = r(c_3)
      
      // low on left
      margins , at((p25) headroom trunk = (`t25' `t50' `t75'))
      marginsplot , name(left) title("") ylabel( , angle(horizontal) nogrid) noci
      
      // high on right
      margins , at((p75) headroom trunk = (`t25' `t50' `t75'))
      marginsplot , name(right) ylabel(none, nogrid) title("") ytitle("") noci
      
      // then
      graph combine left right, ycommon imargin(0)



      • #4
        Originally posted by Joseph Coveney
        Thank you very much, Joseph!

        That is close to my expectations but I think I want it a little bit different.

        Below is an example. The x-axis is divided into high and low values of the independent variable, and the two lines show high and low values of the moderator variable.

        Does anyone have an idea how to reproduce this graph?

        Thanks!
        [Attached image: Unbenannt.jpg]



        • #5
          Originally posted by Frank Taumann
          Thanks Frank!

          I've already tried this. I generated maximum and minimum values of my independent and moderator variables and ran a regression, but I ran into a collinearity problem.

          Do you have any idea how to fix it?

          Thanks!



          • #6
            I think your problem is that your regression doesn't correspond to what you want to graph. Your regression explicitly treats cash and m_a as continuous variables. But you want to treat them as dichotomous low/high variables in your graph. You can't have it both ways. So you need to create dichotomous variables, do the regression with them, and then use -marginsplot-. So something like this:

            Code:
            gen m_a_high = (m_a > cutoff_between_high_and_low) if !missing(m_a)
            gen cash_high = (cash > cutoff_between_cash_high_and_low) if !missing(cash)
            xtreg Tobins_Q i.cash_high##i.m_a_high, fe // INCLUDE OTHER COVARIATES AS APPROPRIATE
            margins cash_high#m_a_high
            marginsplot
            You might need to specify the -xdimension()- option in -marginsplot- to get cash rather than m_a on the x-axis. Also, -marginsplot- will accept nearly all -graph twoway- options, so you can customize the appearance. And you probably want to put value labels and variable labels on these dichotomous variables.
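
            As a rough, runnable illustration of the labelling and -xdimension()- points, here is the same pattern on the auto data. The median splits, the variable names headroom_high and trunk_high, and the label name lowhigh are assumptions for this sketch only, not recommended cutoffs:
            Code:
            sysuse auto, clear
            
            // hypothetical cutoffs: split each predictor at its median
            quietly summarize headroom, detail
            generate byte headroom_high = (headroom > r(p50)) if !missing(headroom)
            quietly summarize trunk, detail
            generate byte trunk_high = (trunk > r(p50)) if !missing(trunk)
            
            // value and variable labels are picked up by -marginsplot-
            label define lowhigh 0 "Low" 1 "High"
            label values headroom_high lowhigh
            label values trunk_high lowhigh
            label variable headroom_high "Headroom"
            label variable trunk_high "Trunk space"
            
            regress length i.headroom_high##i.trunk_high
            margins headroom_high#trunk_high
            
            // force headroom onto the x-axis; trunk then defines the two lines
            marginsplot, xdimension(headroom_high) noci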

            That said, I would not dichotomize these variables. It just throws away information. And particularly in a model with interactions, a graph with two lines connecting each of two points creates a very misleading view of what is going on as the relationships are not actually linear. I would instead pick interesting values of cash and m_a and do this with the continuous variables. Let's suppose, for illustrative purposes, interesting values of cash are 1000 5000 10000 20000 and 50000, and let's suppose the interesting values of m_a are 2 4 6 8 and 10. Then I would do this:

            Code:
            xtreg Tobins_Q c.cash##c.m_a, fe // INCLUDE OTHER COVARIATES AS APPROPRIATE
            margins, at(cash = (1000 5000 10000 20000 50000) m_a = (2(2)10))
            marginsplot, xdimension(cash)



            • #7
              I share the concerns raised above that this kind of graph can be misleading. However, presenting such graphs is a common technique in the behavioral sciences when a model includes an interaction between two continuous variables. Building on the previous example, here is a way to produce this graph in Stata.

              Code:
              sysuse auto, clear
              
              //RUN YOUR REGRESSION
              regress length c.headroom##c.trunk
              
              //STORE THE RESULTS
              est sto regression
              
              //DEFINE HIGH AND LOW VALUES BASED ON +-1SD FROM THE MEAN OF THE ESTIMATED SAMPLE
              foreach v of var headroom trunk {
                  su `v' if e(sample)
                  local low_`v'=r(mean)-r(sd)
                  local high_`v'=r(mean)+r(sd)
              }    
              
              // LOAD BACK YOUR RESULTS
              est restore regression
              
              // CALCULATE THE MARGINAL PREDICTIONS
              margins , at(headroom=(`low_headroom' `high_headroom') trunk = (`low_trunk' `high_trunk'))
              
              //PLOT THE RESULTS
              marginsplot , title("") ylabel( , angle(horizontal) nogrid) noci



              • #8
                Originally posted by Clyde Schechter
                Then I would do this:

                Code:
                xtreg Tobins_Q c.cash##c.m_a, fe // INCLUDE OTHER COVARIATES AS APPROPRIATE
                margins, at(cash = (1000 5000 10000 20000 50000) m_a = (2(2)10))
                marginsplot, xdimension(cash)
                I'm not sure that that will work. I think that with two continuous predictors you'll end up having to work outside of marginsplot.

                The following will get a graph just like what Michael shows in #4. The code below assembles the predictions from margins in the manner shown in StataCorp's YouTube video for plotting predictions from the interaction of two continuous predictors. (The link is given in the help file for marginsplot: Profile plots and interaction plots, part 5: Interactions of two continuous variables.)
                Code:
                version 14.2
                
                clear *
                set more off
                
                quietly sysuse auto
                regress length c.headroom##c.trunk
                
                centile headroom, centile(25 75)
                forvalues i = 1/2 {
                    local h`i' = r(c_`i')
                }
                
                centile trunk, centile(25 75)
                forvalues i = 1/2 {
                    local t`i' = r(c_`i')
                }
                
                margins , ///
                    at(headroom = `h1' trunk = `t1') at(headroom = `h1' trunk = `t2') ///
                    at(headroom = `h2' trunk = `t1') at(headroom = `h2' trunk = `t2')
                
                matrix define R = r(at), r(b)'
                drop _all
                svmat double R, name(col)
                
                graph twoway ///
                    line r1 trunk if headroom == 2.5, lcolor(black) || ///
                    line r1 trunk if headroom == 3.5, lcolor(red) lpattern(dash) ///
                        ytitle(Car Length (in.)) ylabel( , angle(horizontal) nogrid) ///
                        xtitle(Trunk space (cu. ft.)) ///
                        text(187 16 "Headroom = 2.5 in.") text(192 14 "Headroom = 3.5 in.") ///
                        legend(off)
                
                exit
                Michael can add more at() points than the first and third quartiles that I've shown, but with a linear model and a simple interaction, doing so is liable to clutter the profile plot without making it any more informative.

                Edited to add: Seeing Oded's example, it seems that putting both combinations inside a single at() will allow marginsplot to work with the interaction.
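
                A sketch of what that could look like with the same quartile values, assuming nothing beyond the auto data and the commands already used in this thread (the local names h1, h2, t1 and t2 are carried over from the code above):
                Code:
                sysuse auto, clear
                regress length c.headroom##c.trunk
                
                // first and third quartiles of each predictor
                centile headroom, centile(25 75)
                local h1 = r(c_1)
                local h2 = r(c_2)
                
                centile trunk, centile(25 75)
                local t1 = r(c_1)
                local t2 = r(c_2)
                
                // one at() holding both value lists gives all four combinations
                margins, at(headroom = (`h1' `h2') trunk = (`t1' `t2'))
                marginsplot, xdimension(trunk) noci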
                Last edited by Joseph Coveney; 12 Feb 2017, 01:09.



                • #9
                  Joseph,

                  It does work:

                  Code:
                  webuse nlswork, clear
                  
                  xtreg ln_wage c.age##c.hours
                  margins, at(age = (15(5)40) hours = (40(10)80))
                  marginsplot, xdimension(hours)
                  If fewer points are desired, that's just a matter of specifying fewer values in the -margins- command. The confidence interval bars can be suppressed with the -noci- option. It differs from the original request only in that it covers a wider range of values of both variables and doesn't dichotomize anything. What you posted also works and, in concept, is the same except that you have chosen the 25th and 75th percentiles to be the values I referred to as interesting.
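
                  For instance, a two-by-two version without confidence bars might look like the sketch below. The values 20/40 for age and 40/80 for hours are arbitrary picks from the ranges used above, and the -xtset- line is only a precaution:
                  Code:
                  webuse nlswork, clear
                  xtset idcode    // declare the panel variable in case it is not already set
                  
                  xtreg ln_wage c.age##c.hours
                  
                  // two values per variable: four predictions, two lines
                  margins, at(age = (20 40) hours = (40 80))
                  marginsplot, xdimension(hours) noci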



                  • #10
                    Originally posted by Oded Mcdossi
                    Hey Oded, thanks for your help!

                    The code looks fine, but I get the error "invalid numlist has too few elements".

                    Do you have any idea how to fix that?

                    Thanks!



                    • #11
                      Originally posted by Clyde Schechter
                      Hello Joseph,

                      Thanks for your help. It works!

                      But I still have a question: are the "linear predictions" plotted against age and hours the predicted values of the dependent variable?

                      Thanks!



                      • #12
                        Originally posted by Clyde Schechter
                        It does work:
                        I know. See the edited-to-add of my post. I had misread your -margins- line as having two -at()- options.



                        • #13
                          Hello everyone,

                          thanks for all of your help!

                          I am now struggling with another problem which is even more complex.

                          As a next step, I combine two independent variables into a bundle. For this, both independent variables are dichotomous (high/low), as you can see in the picture below. Now I want to interact the bundle with my moderator variable and plot the result, but I have no idea how to show it in a graph. There should be two surfaces, one for high and one for low values of my moderator variable.

                          Can anyone help me?

                          Thanks!

                          [Attached image: 22222.jpg]



                          • #14
                            In general I am not impressed by 3D graphs; they look pretty but are horrible at conveying information. In your case specifically, surfaces don't make sense, as you have categorical variables and surfaces suggest that intermediate values also exist.

                            If you interact a variable with "a bundle of two binary variables", then you are just doing a three way interaction. Alternatively and equivalently, you can view this as there are now 4 different effects of your variable.

                            Code:
                            // open example data
                            sysuse nlsw88, clear
                            
                            // prepare the data
                            gen byte marst = !never_married + married if !missing(never_married, married)
                            label variable marst "marital status"
                            label define marst 0 "never married"    ///
                                               1 "widowed/divorced" ///
                                               2 "married"
                            label value marst marst
                            
                            gen byte black = race == 2 if !missing(race)
                            label variable black "respondent's race"
                            label define black 0 "not black" ///
                                               1 "black"
                            label value black black
                            
                            gen byte occat = cond(occupation < 3, 1,                    ///
                                             cond(inlist(occupation,5, 6, 8, 9), 2, 3)) ///
                                             if !missing(occupation)
                            label variable occat "occupational category"
                            label define occat 1 "white collar" ///
                                               2 "skilled"      ///
                                               3 "unskilled"
                            label value occat occat
                            
                            // the "bundle of binary variables" is in this case
                            // union and collgrad
                            // the moderator is ttl_exp
                            
                            // estimate a model
                            glm wage i.union##i.collgrad##c.ttl_exp i.black i.occat, link(log) vce(robust)
                            
                            // marginal effect of ttl_exp on wage for each combination of union and collgrad
                            margins, dydx(ttl_exp) over(union collgrad)
                            marginsplot
                            (For more on examples I sent to the Statalist see: http://www.maartenbuis.nl/example_faq )
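
                            A possible two-dimensional alternative to surfaces, offered only as a sketch: plot predicted wage against ttl_exp, with one line per union status and one panel per value of collgrad. The grid 0(5)25 for ttl_exp and the omission of the other covariates are simplifications assumed here, not part of the example above:
                            Code:
                            sysuse nlsw88, clear
                            
                            // same three-way interaction, other covariates left out for brevity
                            glm wage i.union##i.collgrad##c.ttl_exp, link(log) vce(robust)
                            
                            // predicted wage in each union-by-collgrad cell over a grid of experience
                            margins union#collgrad, at(ttl_exp = (0(5)25))
                            
                            // x-axis: experience; lines: union; panels: collgrad
                            marginsplot, xdimension(ttl_exp) plotdimension(union) ///
                                bydimension(collgrad) noci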
                            ---------------------------------
                            Maarten L. Buis
                            University of Konstanz
                            Department of history and sociology
                            box 40
                            78457 Konstanz
                            Germany
                            http://www.maartenbuis.nl
                            ---------------------------------



                            • #15
                              Originally posted by Maarten Buis
                              Hey Maarten,

                              Thanks for your help!

                              I tried your code, but I only get a two-dimensional graph. Is it even possible to create a 3D graph like the one in my picture above?

                              Best,

                              Michael
