Why am I experiencing high P-Value and positive correlation on this?

Bob Ross

Join Date: Sep 2018

Posts: 9
#1

29 Sep 2018, 13:22

I'm a bit confused about my dummy variable results. I am running a regression on log GDP. The three dummies are for years with a recession. Recession 2 and 3 are positively correlated with GDP, which doesn't make sense, and my p-values are huge. Can someone give me some thoughts on what my error might be and how to resolve it?
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

29 Sep 2018, 20:46

You shared a snapshot. Please read the FAQ, particularly the topic about sharing data/command/output. There you will also find the recommendation to avoid snapshots.

That being said, the command seems to be mistyped, for there is a space between the factor notation and the first dummy concerning recession.

I am also wondering whether the factor notation couldn't be used to encompass all three dummies at once.

Best regards,

Marcos
Comment
Joseph Coveney

Join Date: Apr 2014

Posts: 4387
#3

29 Sep 2018, 20:55

From what I see with your first recession, your indicator variables might all be coded 0/1, and if so this won't make much difference, but you could try

Code:

xtreg lGDP lmilitaryspending lcapitalformation i.(recession1 recession2 recession3), fe

or

Code:

xtreg lGDP lmilitaryspending lcapitalformation i.recession?, fe

You might also want to look at

Code:

regress lGDP lmilitaryspending lcapitalformation i.(recession1 recession2 recession3 id)
estat vif
Comment
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17694
#4

30 Sep 2018, 05:39

Bob:
as an aside to the previous helpful replies, at first glance you are probably experiencing a quasi-extreme multicollinearity issue with your data.
Like Marcos and Joseph, I would also recommend that you stop creating categorical variables and/or interactions by hand, and that you be aware of the so-called dummy variable trap (https://en.wikipedia.org/wiki/Dummy_..._(statistics)): using -fvvarlist- notation will improve your coding and eliminate the risk of falling into the dummy trap.
Finally, you are seemingly dealing with a T>N panel dataset; if that is the case, take a look at -xtgls-.
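
For a T>N panel, a call to -xtgls- might look like the sketch below. It runs on a small simulated panel (5 hypothetical countries, 20 years), not on Bob's data. The variable names are borrowed from the thread, while the panel dimensions, the data-generating values, and the choice of panels(heteroskedastic) with corr(ar1) are assumptions made only for illustration.

```stata
* Simulated T>N panel: 5 hypothetical countries, 1995-2014 (not Bob's data)
clear
set seed 42
set obs 5
generate id = _n
expand 20
bysort id: generate Year = 1994 + _n
xtset id Year

* Single 0/1 recession indicator, as in Bob's own -gen- command
generate byte recession = inlist(Year, 2001, 2008, 2009)

* Assumed regressors and outcome; the error spread differs by country
generate lmilitaryspending = rnormal()
generate lcapitalformation = rnormal()
generate noise = rnormal(0, 0.02*id)
generate lGDP = 10 + 0.03*(Year - 1995) + 0.1*lmilitaryspending + 0.2*lcapitalformation - 0.05*recession + noise

* Feasible GLS with panel-specific variances and a common AR(1) error
xtgls lGDP lmilitaryspending lcapitalformation i.recession c.Year, panels(heteroskedastic) corr(ar1)
```

Whether heteroskedastic panels and AR(1) errors are the right structure depends on the actual data; panels(correlated) is another option when the panel is balanced and T is large relative to N.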

Kind regards,
Carlo
(Stata 19.0)
Comment
Bob Ross

Join Date: Sep 2018

Posts: 9
#5

30 Sep 2018, 07:26

Originally posted by Marcos Almeida:

I am also wondering whether the factor notation couldn't be used to encompass all three dummies at once.

Hi Marcos,

Thanks for the reply. I did try that as well: I ran gen recession = (Year==2001 | Year==2008 | Year==2009) to create a single dummy variable. The coefficient was -.002782, which is in the right direction. However, the p-value was still 0.953, so it is statistically insignificant.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35528
#6

30 Sep 2018, 07:31

Please note our policy on cross-posting, which is that you are asked to tell us about it. This is spelled out in the FAQ Advice all are asked to read before posting.

I allude to a concurrent thread on Reddit, which has featured some pertinent comments.
Comment
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#7

30 Sep 2018, 08:17

Hi Marcos,

Thanks for the reply. I did try that as well: I ran gen recession = (Year==2001 | Year==2008 | Year==2009) to create a single dummy variable. The coefficient was -.002782, which is in the right direction. However, the p-value was still 0.953, so it is statistically insignificant.

Thanks for the information. My suggestion actually concerned using factor notation with a single categorical variable instead of creating several dummies by hand, which carries the pitfalls Carlo described.
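
A minimal sketch of that suggestion, on a simulated panel rather than Bob's data: one categorical variable recession (0 = no recession, 1 to 3 = the recession years) entered with factor notation, so that Stata builds the indicators and omits the base level itself. The mapping of 2001, 2008 and 2009 to levels 1, 2 and 3, the panel dimensions, and all generated values are assumptions for illustration.

```stata
* Simulated panel: 10 hypothetical countries, 1995-2014 (not Bob's data)
clear
set seed 12345
set obs 10
generate id = _n
expand 20
bysort id: generate Year = 1994 + _n
xtset id Year

* One categorical variable instead of three hand-made dummies
* (which year belongs to which recession is an assumption)
generate byte recession = 0
replace recession = 1 if Year == 2001
replace recession = 2 if Year == 2008
replace recession = 3 if Year == 2009
label define recession_lbl 0 "none" 1 "2001" 2 "2008" 3 "2009"
label values recession recession_lbl

* Assumed regressors and outcome, for illustration only
generate lmilitaryspending = rnormal()
generate lcapitalformation = rnormal()
generate lGDP = 10 + 0.3*lmilitaryspending + 0.5*lcapitalformation - 0.05*(recession > 0) + rnormal(0, 0.1)

* i.recession expands to indicators for levels 1-3; level 0 is the base
xtreg lGDP lmilitaryspending lcapitalformation i.recession, fe
```

Each reported coefficient on recession is then a contrast with the non-recession years, and a different base can be chosen with ib#. if needed.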

Best regards,

Marcos
Comment
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#8

01 Oct 2018, 07:51

I'm not an economist. But my (possibly naive) thought is that you're running a regression on country-year data, without controlling for year. If you had no dummies for recession, your constant would represent the mean log GDP over each country's time series (with military spending and capital formation set to 0). With the recession dummies, you essentially find that the adjusted log GDP in years 2008 and 2009 is not distinguishably different from the mean log GDP across the entire time series. I can't quite explain why, but I suspect if you add year to your regression as a continuous variable, you could see something different.
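
To illustrate the idea, here is a sketch on simulated data (not Bob's data) in which log GDP trends upward and every recession year carries a true negative effect. The panel dimensions, the trend of 0.03 per year, the effect of -0.05, and the mapping of 2001, 2008 and 2009 to levels 1, 2 and 3 are all assumptions. It only shows one mechanism by which late-sample recession indicators can pick up a trend; it does not establish that this is what happens in the original model.

```stata
* Simulated panel: 10 hypothetical countries, 1995-2014 (not Bob's data)
clear
set seed 2018
set obs 10
generate id = _n
expand 20
bysort id: generate Year = 1994 + _n
xtset id Year

* Recession episodes as one categorical variable (mapping is an assumption)
generate byte recession = 0
replace recession = 1 if Year == 2001
replace recession = 2 if Year == 2008
replace recession = 3 if Year == 2009

* Assumed data-generating process: upward trend plus a negative recession effect
generate lmilitaryspending = rnormal()
generate lcapitalformation = rnormal()
generate trend = 0.03*(Year - 1995)
generate lGDP = 10 + 0.5*id + trend - 0.05*(recession > 0) + 0.1*lmilitaryspending + 0.2*lcapitalformation + rnormal(0, 0.02)

* Without a trend, the late recession years are compared with a base
* dominated by earlier, lower-GDP years
xtreg lGDP lmilitaryspending lcapitalformation i.recession, fe
estimates store no_trend

* With year entered as a continuous variable
xtreg lGDP lmilitaryspending lcapitalformation i.recession c.Year, fe
estimates store with_trend

* Side-by-side comparison of the recession coefficients
estimates table no_trend with_trend, keep(1.recession 2.recession 3.recession Year) b(%9.4f) se
```

By construction, the 2008 and 2009 indicators should come out positive in the first model and negative in the second, because only the second model separates the trend from the recession effect.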

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.
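
A short sketch of how dataex is typically called, using the auto dataset shipped with Stata so that it runs anywhere. The commented line shows how the same call might look with the variable names mentioned in this thread, which is an assumption about how the actual dataset is laid out.

```stata
* Load a dataset that ships with Stata
sysuse auto, clear

* Print a copy-and-paste data example for three variables, first 5 rows
dataex make price mpg in 1/5

* Hypothetical equivalent for the data in this thread:
* dataex id Year lGDP lmilitaryspending lcapitalformation recession1 recession2 recession3 in 1/20
```

The output can be pasted into a post between code delimiters, and anyone can run it to recreate the example data.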

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Comment
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#9

02 Oct 2018, 12:37

I can imagine a variety of problems with this model. If the data are not in constant dollars, then you may simply be showing that inflation matters. Alternatively, it is possible that high GDP facilitates military spending instead of the opposite. I'd also worry about the recession variables: they are probably generated from the change in GDP or something very close to it, so including them on the right-hand side may be problematic in a model explaining GDP.
Comment
