  • Interpretation of regression results of a combined OLS model

    Dear all,

    I am writing my master's thesis on the determinants of CSR. I have two separate OLS regression models (grounded in two separate theories) that try to explain variation in CSR (the dependent variable). I hypothesize that Model 1 explains significantly more variation in CSR than Model 2, which is confirmed by means of a Vuong test.

    As an additional test, I combine Model 1 and Model 2 into one regression. Models 1 and 2 have 4 variables in common; the other variables are specific to each model. I have difficulties interpreting the regression results of the combined model. Specifically, what does it imply when:
    - A significant variable in a separate model becomes insignificant in the combined model?
    - An insignificant variable in a separate model becomes significant in the combined model?
    - Are the answers to these two questions different for the 4 variables that are included in both models?

    Thank you in advance!

  • #2
    Rianne:
    welcome to this forum.
    I would first consider what kind of information your last attempt can convey, provided that the models do not share some predictors.
    I would take a look at the adjusted R2 instead.
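    As a minimal sketch (using the auto toy dataset; the two models here are arbitrary stand-ins for yours), -regress- reports the adjusted R2 and stores it in e(r2_a), so the two specifications can be compared on the same outcome:

    Code:
    sysuse auto, clear
    quietly regress mpg weight length
    display "Model A adjusted R2 = " %6.4f e(r2_a)
    quietly regress mpg turn displacement gear_ratio
    display "Model B adjusted R2 = " %6.4f e(r2_a)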
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hello Rianne. You can use information criteria such as AIC and BIC to compare non-nested models. Here is a silly little example using the auto dataset that comes with Stata.

      Code:
      clear *
      sysuse auto
      regress mpg trunk weight turn displacement gear_ratio
      estimates store full
      estat ic
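      * e(sample) flags the observations used by the full model above, so the
      * smaller models below are fit on the same sample and their ICs are comparable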
      generate goodrec = e(sample)
      regress mpg trunk weight turn if goodrec
      estat ic
      estimates store m1
      regress mpg turn displacement gear_ratio if goodrec
      estat ic
      estimates store m2
      estimates table m1 m2 full
      * Use -lrtest- with -force- option to get IC for both models in one table
      lrtest m1 m2, stats force
      Output from the last two commands:

      Code:
      . estimates table m1 m2 full
      
      -----------------------------------------------------
          Variable |     m1           m2          full    
      -------------+---------------------------------------
             trunk | -.09037984                -.09278739  
            weight |  -.0050613                -.00562765  
              turn | -.12630117   -.56119449   -.12610962  
      displacement |              -.02066248    .00793443  
        gear_ratio |               .70639768    .59384312  
             _cons |  42.830698    45.494858    41.210475  
      -----------------------------------------------------
      
      . lrtest m1 m2, stats force
      
      Likelihood-ratio test                                 LR chi2(0)  =     16.21
      (Assumption: m2 nested in m1)                         Prob > chi2 =         .
      
      Akaike's information criterion and Bayesian information criterion
      
      -----------------------------------------------------------------------------
             Model |        Obs  ll(null)  ll(model)      df         AIC        BIC
      -------------+---------------------------------------------------------------
                m2 |         74 -234.3943  -202.9365       4     413.873   423.0892
                m1 |         74 -234.3943  -194.8317       4    397.6634   406.8796
      -----------------------------------------------------------------------------
                     Note: N=Obs used in calculating BIC; see [R] BIC note.
      In this silly example, model 1 (m1) has the lower values for AIC and BIC, so would be considered the better fitting model.

      Another tool you might find useful is the user-written command -fitstat-; you can read about it in a Statalist thread from 2015.
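      A rough sketch of calling it after -regress- (continuing the example above, and assuming -fitstat- has been installed after locating it with -search fitstat-; its exact options may differ by version):

      Code:
      search fitstat                            // locate the user-written package and follow the install link
      regress mpg trunk weight turn if goodrec
      fitstat                                   // reports R2, adjusted R2, AIC, BIC, and related fit measures

      HTH.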
      --
      Bruce Weaver
      Email: [email protected]
      Version: Stata/MP 18.5 (Windows)



      • #4
        Originally posted by Rianne Steinbusch:
        Dear all,

        I have difficulties interpreting the regression results of the combined model. Specifically, what does it imply when:
        - A significant variable in a separate model becomes insignificant in the combined model?
        - An insignificant variable in a separate model becomes significant in the combined model?
        - Are the answers to these two questions different for the 4 variables that are included in both models?

        Thank you in advance!
        I forgot to respond to these questions earlier. If your library has Mosteller & Tukey's (1977) book Data Analysis and Regression, I recommend Chapter 13 (Woes of Regression Coefficients). In a nutshell, it says that the coefficient for any variable depends on what other variables are in the model.

        The first situation you describe above sounds like (positive) confounding, and the second sounds like negative confounding (also called suppression in some fields). If you do a Google search on <positive vs negative confounding>, you should get several relevant hits.
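        A quick sketch of how a coefficient can move when a correlated predictor enters the model (again using the auto toy dataset; the variables are arbitrary stand-ins, not CSR determinants):

        Code:
        sysuse auto, clear
        pwcorr weight length               // the two predictors are strongly correlated
        regress price length               // length on its own
        regress price length weight        // length's coefficient and p-value shift once weight enters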

        HTH.
        --
        Bruce Weaver
        Email: [email protected]
        Version: Stata/MP 18.5 (Windows)
