Seemingly Unrelated Regression

Huseyin Unal

Join Date: Dec 2019

Posts: 5
#1

18 Oct 2023, 02:00

Hi Everyone,

I want to run an analysis using the SUR method. There would be 4 systems, with dependent variables y1, y2, y3 and y4 respectively. The explanatory variables are the same across the systems: x1 x2 x3 x4 x5. Each system would contain 5 equations, one per survey wave, estimated on different country samples because the countries are not necessarily the same in each wave. I want to do something like what McCleary and Barro (2006) did in their paper.

However, I could not understand how they obtain a single set of coefficients from the 6 equations in each system. I could not find an option for this in either the sureg or the xtsur command.

I have a dataset composed of data from 5 different survey waves looking like this:

wave year ctry y1 y2 y3 y4 x1 x2 x3 x4 x5
1 1980 A
1 1980 B
. . .
. . .
. . .
1 1980 G
1 1980 H
2 1990 D
2 1990 E
. . .
. . .
2 1990 X
2 1990 Y
3 2000 A
3 2000 B
. . .
. . .
3 2000 G
3 2000 H
4 2010 J
4 2010 K
. . .
. . .
. . .
4 2010 Y
4 2010 Z
5 2020 A
5 2020 B
. . .
. . .
. . .
5 2020 G
5 2020 H

Code:

xtset ctry year
xtsur (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5)

This code, not surprisingly, gives results for the whole country sample and all years.

Or should I run sureg for each wave separately, then take the average of the coefficients for the corresponding dependent variables to get results like those of McCleary and Barro (2006)?

Code:

sureg (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5) if wave==1 sureg (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5) if wave==2 sureg (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5) if wave==3 sureg (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5) if wave==4 sureg (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5) if wave==5

I mean taking the average of all the x1's, x2's, x3's, x4's and x5's where the dependent variable is y1, and doing the same for y2, y3 and y4.
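
A minimal sketch of that averaging idea, assuming the variable names above (the matrix name avg is only illustrative). It produces unweighted averages of the point estimates and no standard errors, so it is not necessarily what McCleary and Barro (2006) do:

Code:

* Rows are the dependent variables y1-y4, columns the regressors x1-x5
matrix avg = J(4, 5, 0)
forvalues w = 1/5 {
    quietly sureg (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) ///
                  (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5) if wave == `w'
    forvalues y = 1/4 {
        forvalues x = 1/5 {
            * Add this wave's coefficient with weight 1/5
            matrix avg[`y', `x'] = avg[`y', `x'] + [y`y']_b[x`x'] / 5
        }
    }
}
matrix rownames avg = y1 y2 y3 y4
matrix colnames avg = x1 x2 x3 x4 x5
matrix list avg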

I am really stuck at this point in my research. Your response would be more than helpful. Thank you in advance.

George Ford

Join Date: Aug 2014

Posts: 3187
#2

18 Oct 2023, 13:17

If your X's are the same for each model there is no point in doing SUR.

Huseyin Unal

Join Date: Dec 2019

Posts: 5
#3

19 Oct 2023, 03:47

Originally posted by George Ford

If your X's are the same for each model there is no point in doing SUR.

I partly agree with you. But Wooldridge says: "It is important to know when every equation contains the same regressors in a SUR system, there is still a good reason to use a SUR software routine in obtaining the estimates: we may be interested in testing joint hypotheses involving parameters in different equations". My question above still holds: how to get different equations for different country samples within each system, where the systems have different dependent variables but the same explanatory variables.
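
As a rough sketch of what such a cross-equation test could look like in Stata, assuming the variable names from the first post and a single pooled sample:

Code:

* corr also reports the Breusch-Pagan test of independent errors across equations
sureg (y1 x1 x2 x3 x4 x5) (y2 x1 x2 x3 x4 x5) (y3 x1 x2 x3 x4 x5) (y4 x1 x2 x3 x4 x5), corr

* H0: x1 has the same coefficient in all four equations
test ([y1]x1 = [y2]x1) ([y1]x1 = [y3]x1) ([y1]x1 = [y4]x1)

* H0: x1 has a zero coefficient in every equation
test [y1]x1 [y2]x1 [y3]x1 [y4]x1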
George Ford

Join Date: Aug 2014

Posts: 3187
#4

19 Oct 2023, 08:00

Use constraints within sureg so that, for each x, the coefficients are the same across the systems.

(That won't work. You can only constrain within one set.)

Last edited by George Ford; 19 Oct 2023, 08:20.

George Ford

Join Date: Aug 2014
Posts: 3187

19 Oct 2023, 09:54

Briefly looking over the McCleary/Barro paper, I would be looking for a better estimation approach. There are constraints imposed in that model that make no sense, and at a minimum should have been tested (e.g., equal constants across waves and no country fixed effects). ["Constant terms, not shown, are included for each system. The constants vary by system but not across the equations within a system."]

Honestly, it's a bit of a mystery what they're up to and I don't believe the results.

The most straightforward approach is reghdfe, absorbing wave and countryid. It's just pooled cross-section data. The FE will account for differences in which countries are included. Not including country FE seems problematic since countries vary a lot for a lot of reasons (an omitted-variable problem).
reghdfe allows you to create the centered variables, which you could use to estimate a reg and then use suest to deal with correlated errors.
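
One way that last suggestion might be sketched, assuming variables named wave, id, y1-y4 and x1-x5 as in the simulated data in this post. The _c suffix and the stored names m1-m4 are illustrative, and the standard errors here do not adjust degrees of freedom for the absorbed fixed effects:

Code:

* Partial out wave and unit fixed effects from every variable
foreach v in y1 y2 y3 y4 x1 x2 x3 x4 x5 {
    quietly reghdfe `v', absorb(wave id) residuals(`v'_c)
}

* One regression per outcome on the centered variables, stored for suest
forv i = 1/4 {
    quietly regress y`i'_c x1_c x2_c x3_c x4_c x5_c, noconstant
    estimates store m`i'
}

* Combine the four fits so cross-equation tests use a joint covariance matrix
suest m1 m2 m3 m4
test [m1_mean]x1_c = [m2_mean]x1_c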

There are several ways one might get a single coefficient using sureg, including using margins or imposing constraints. I get different results depending on which one I use, all of them pretty far from the true coefficients (unless wave FE are included).

Here's some play, but I'm just noodling and there may be errors. I've included country and wave FE to show the effects (commented out here).

I'd definitely try lots of approaches to see if they provide comparable results to a basic reghdfe model.

Code:

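clear all

* Simulated data: 5 waves, 10,000 units per wave
set obs 5

g wave = _n

* Wave-level component of each regressor
forv i = 1/5 {
    g x`i' = rnormal()
}

expand 10000

bys wave: g id = _n
xtset id wave

* Add unit-level noise to each regressor
forv i = 1/5 {
    replace x`i' = x`i' + rnormal()/10
}

* True coefficients are all 1; the commented-out terms add unit and wave effects
g y1 = 1 + 1*x1 + 1*x2 + 1*x3 + 1*x4 + 1*x5 + rnormal() //+ 0.1*id + wave*runiform(1,4)/10
g y2 = 1 + 1*x1 + 1*x2 + 1*x3 + 1*x4 + 1*x5 + rnormal() //+ 0.1*id + wave*runiform(1,4)/10
g y3 = 1 + 1*x1 + 1*x2 + 1*x3 + 1*x4 + 1*x5 + rnormal() //+ 0.1*id + wave*runiform(1,4)/10
g y4 = 1 + 1*x1 + 1*x2 + 1*x3 + 1*x4 + 1*x5 + rnormal() //+ 0.1*id + wave*runiform(1,4)/10

* Baseline: one regression per outcome with wave and unit FE
* (reghdfe is community-contributed: ssc install reghdfe)
forv i = 1/4 {
    reghdfe y`i' x1 x2 x3 x4 x5, absorb(wave id)
}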

sureg (y1 x1 x2 x3 x4 x5 ) (y2 x1 x2 x3 x4 x5 ) (y3 x1 x2 x3 x4 x5 ) (y4 x1 x2 x3 x4 x5 )
sureg (y1 x1 x2 x3 x4 x5 i.wave) (y2 x1 x2 x3 x4 x5 i.wave) (y3 x1 x2 x3 x4 x5 i.wave) (y4 x1 x2 x3 x4 x5 i.wave)

sureg (y1 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5))     ///
      (y2 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5))     ///
      (y3 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5))     ///
      (y4 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5))
** I use 0.2 since there are no missing observations, but you'd need to account for variations in sample size for each wave
forv i = 1/4 {
    lincom [y`i']1.wave#c.x1*0.2 + [y`i']2.wave#c.x1*0.2 + [y`i']3.wave#c.x1*0.2 + [y`i']4.wave#c.x1*0.2 + [y`i']5.wave#c.x1*0.2
}
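* Sketch, not part of the original post: weight each wave by its share of the
* estimation sample instead of a fixed 0.2 (the locals s1-s5 are illustrative names).
forv w = 1/5 {
    quietly count if wave == `w' & e(sample)
    local s`w' = r(N)/e(N)
}
forv i = 1/4 {
    lincom [y`i']1.wave#c.x1*`s1' + [y`i']2.wave#c.x1*`s2' + [y`i']3.wave#c.x1*`s3' + [y`i']4.wave#c.x1*`s4' + [y`i']5.wave#c.x1*`s5'
}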

sureg (y1 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave)     ///
      (y2 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave)     ///
      (y3 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave)     ///
      (y4 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave)
forv i = 1/4 {
    lincom [y`i']1.wave#c.x1*0.2 + [y`i']2.wave#c.x1*0.2 + [y`i']3.wave#c.x1*0.2 + [y`i']4.wave#c.x1*0.2 + [y`i']5.wave#c.x1*0.2
}

local c = 1
forv y = 1/4 {
    forv x = 1/5 {
        forv w = 2/5 {
            constraint `c' [y`y']1.wave#c.x`x' = [y`y']`w'.wave#c.x`x'
            local c = `c'+1
        }
    }
}
di `c' - 1   // number of constraints defined: 4 outcomes x 5 regressors x 4 waves = 80

* Constraints 1-80 are the ones defined in the loop above
sureg (y1 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5))     ///
    (y2 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5))     ///
    (y3 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5)) ///
    (y4 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5)) ///
    , constraints(1-80)

sureg (y1 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave)     ///
    (y2 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave)     ///
    (y3 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave) ///
    (y4 i.wave#(c.x1 c.x2 c.x3 c.x4 c.x5) i.wave) ///
    , constraints(1-80)

George Ford

Join Date: Aug 2014

Posts: 3187
#6

19 Oct 2023, 10:47

In their JEP paper (Religion and Economy, Journal of Economic Perspectives, Volume 20, Number 2, Spring 2006, pages 49-72), they say "The coefficients shown in each cell come from joint estimation that pools all of the data from different surveys at different points in time."

Huseyin Unal

Join Date: Dec 2019

Posts: 5
#7

20 Oct 2023, 14:04

I appreciate such a detailed answer. It changed my perspective. My focus was on their other paper (Religion and Political Economy in an International Panel, Journal for the Scientific Study of Religion (2006) 45(2):149–175). But the analysis is the same, I think.