Multivariate regression with perfectly correlated residuals

Valeria Azarova

Join Date: Nov 2016

Posts: 2
#1

15 Nov 2016, 02:54

I am asking for help with a statistical/econometric problem that I cannot figure out:
I have a classic multivariate regression problem, i.e. the dependent variables are stored in a matrix Y of dimension n x p. The p observations in each row come from the same respondent i, so we have to expect correlated residuals. The explanatory variables are stored in X and, as is standard in the multivariate regression model, each of the p observations of respondent i shares the same X. Say Yi holds the income of a person at 5 points in time, to be explained only by gender and education; then Xi holds male and college as the explanatory variables for all of that person's Yi. As output of the regression one gets vectors β1,...,β5, one vector β for each of the 5 points in time.
In my specific problem some of the Y have been constructed in such a way that the correlation of their errors is 1. For these I would not need 5 vectors β1,...,β5: because of the perfect correlation, a reduced form would already contain all the information, like β1=βα1, β2=βα2, ..., where each α is a scalar and not a vector.
I understand that the multivariate regression, i.e. estimating 5 vectors β1,...,β5, should give unbiased but inefficient results (true?). Nevertheless, does anyone have an idea how to estimate the model above without the redundant β-vectors?
Also, running sureg does not work properly (error: Covariance matrix of errors is singular), so as an alternative I used mvreg.
Is there a better solution?
Thanks for any hint!
Tags: correlated residuals, multivariate regression, mvreg, sureg
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

16 Nov 2016, 08:20

I cannot speak to the specifics of your estimation but, as you noted, when the errors are perfectly correlated you probably have a redundant equation. That is, knowing the values of all but one of the dependent variables lets you perfectly predict the last one. You can always move to SEM and model these directly. But the usual issue when this occurs is not a statistical one but a model specification one. I'd step back and think about why these come out this way.
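
To make the redundancy concrete, here is a short derivation under an assumption that the thread does not state explicitly: that the k-th dependent variable is an exact scalar multiple of the first, y_k = α_k y_1 with α_k ≠ 0, and that every equation uses the same X. The symbols \hat\beta_k and \hat e_k (OLS coefficients and residuals of equation k) are introduced here for illustration only.

```latex
% Assumption (not from the thread): y_k = \alpha_k y_1 exactly, \alpha_k \neq 0,
% and all equations share the same regressor matrix X.
\begin{align*}
\hat\beta_k &= (X'X)^{-1}X'y_k = (X'X)^{-1}X'(\alpha_k y_1) = \alpha_k \hat\beta_1, \\
\hat e_k    &= y_k - X\hat\beta_k = \alpha_k\,(y_1 - X\hat\beta_1) = \alpha_k \hat e_1, \\
\operatorname{corr}(\hat e_k, \hat e_1) &= \operatorname{sign}(\alpha_k) = \pm 1 .
\end{align*}
```

If that assumption holds, the coefficient vector of a redundant equation is just the first one rescaled, which is the β_k = βα_k structure from the original question. Estimating it separately adds no information, and the residual covariance matrix is singular by construction. If X contains a constant, the same argument goes through for y_k = a_k + α_k y_1, with only the intercept shifted by a_k.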

Please read the FAQ on asking questions.
Valeria Azarova

Join Date: Nov 2016

Posts: 2
#3

17 Nov 2016, 04:44

Dear Phil,
Thanks a lot for your answer!
I am quite new to research, so I apologize in advance if my follow-up question is inappropriate.
But maybe you can help me: the idea of the research was to take the dependent variable 'y', which we observe in the data and which consists of 2 parameters ('a' and 'b'), and regress it with ordinary OLS on several independent variables. That is our basic scenario. Next we want to create new scenarios and see how the effects of the independent variables change if we change the proportions of 'a' and 'b' in the dependent variable. So I generate new variables 'y1' etc. by keeping the sample sum of the dependent variable 'y' the same and changing the proportions of the parameters 'a' and 'b'. After that we calculate the difference between each new variable and the original one, and these differences are my new dependent variables 's1', 's2', etc., which I regress on the independent variables from the original scenario.

That is the correlation matrix for new scenarios that I get:

. cor s1 s2 s3 s4 s5 s6 s7
(obs=396)

             |       s1       s2       s3       s4       s5       s6       s7
-------------+---------------------------------------------------------------
          s1 |   1.0000
          s2 |   1.0000   1.0000
          s3 |   1.0000   1.0000   1.0000
          s4 |   1.0000   1.0000   1.0000   1.0000
          s5 |  -1.0000  -1.0000  -1.0000  -1.0000   1.0000
          s6 |   0.6121   0.6121   0.6121   0.6121  -0.6121   1.0000
          s7 |   0.5693   0.5693   0.5693   0.5693  -0.5693   0.9986   1.0000
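
The matrix above suggests that s1 to s5 may be exact linear functions of one another (correlations of 1.0000 or -1.0000 at four decimals). A minimal Stata sketch of how one might check that, and then estimate only a non-redundant subset, follows. The regressor names x1 and x2 are hypothetical placeholders, since the thread does not name the independent variables.

```stata
* Sketch only. x1 and x2 are hypothetical regressor names; the thread
* does not name the independent variables.

* Is s2 an exact linear function of s1? An R-squared of 1 says yes,
* and the slope is the scale factor between the two scenarios.
regress s2 s1
display "R-squared of s2 on s1: " e(r2)
display "Scale factor (slope):  " _b[s1]

* One representative of the s1-s5 block, estimated jointly with s6 and s7.
mvreg s1 s6 s7 = x1 x2
```

If the R-squared is exactly 1, the slope coefficients for s2 in a regression on the same regressors should equal the scale factor times those for s1, so the remaining scenarios in that block could be reported by rescaling rather than re-estimating. Note that s6 and s7 are also very highly correlated (0.9986), though not perfectly.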

So in my case, from a research perspective, is it OK to use mvreg and present the results from all the scenarios, or does that go against some rule of research modelling?
I could not find anything in the literature except the advice to combine all the variables into one for more efficient results, but in my case I explicitly want to see the effect on these different dependent variables.

Maybe there is a more elegant solution to check for these new scenarios?

Thanks in advance for your help!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17708
#4

17 Nov 2016, 06:21

Valeria:
Something similar to the topic you're interested in was addressed at: http://www.statalist.org/forums/foru...ndent-variable

Kind regards,
Carlo
(Stata 19.0)