Multiple missing values in Panel Data

Muhammad Yousaf Malik

Join Date: Nov 2022

Posts: 7
#1

16 Nov 2022, 09:19

Hi,
I am trying to run -xtreg- on panel data. I have attached the summary table, which shows that the dependent variable has 616 missing values. The dependent variable is Chinese outward foreign direct investment (OFDI), and it is missing for multiple countries in multiple years. In the dataset these appear as missing values, but in reality they mean that China invested only in particular countries in specific years. I have run -xtreg- on both the original data and the imputed data. For the imputation I followed the method in the article https://phantran.net/multiple-imputa...lues-in-stata/.

The imputed results are consistent with the original results, but with the imputed data most variables become significant. I would be grateful if you could advise which results I should use in my paper (imputed or original). Also, can we run the Granger causality test on the imputed dataset, since the original dataset is unbalanced?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17700
#2

16 Nov 2022, 10:26

Muhammad:
before worrying about missing data, I would check via -xttest0- whether your results show evidence of a panel-wise effect.

Kind regards,
Carlo
(Stata 19.0)
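
For readers following along, here is a minimal sketch of the check suggested above. It uses Stata's example dataset nlswork and the variables ln_wage, age and tenure purely as stand-ins (they are assumptions for illustration, not the OFDI data discussed in this thread). -xttest0- is run after a random-effects -xtreg- and reports the Breusch-Pagan LM test, whose null is that the variance of the panel-level effect is zero.

```stata
* Illustrative only: nlswork is a Stata example dataset, not the poster's data
webuse nlswork, clear
xtset idcode year

* Random-effects model; -xttest0- has to follow -xtreg, re-
xtreg ln_wage age tenure, re

* Breusch-Pagan LM test. H0: Var(u) = 0, i.e. no panel-wise effect
xttest0
```

Rejecting the null points towards keeping the -xt- estimator; failing to reject it points towards pooled OLS.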
Muhammad Yousaf Malik

Join Date: Nov 2022

Posts: 7
#3

16 Nov 2022, 10:49

Hi,
Please find the xttest0 attached.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17700
#4

16 Nov 2022, 11:18

Muhammad:
as the -xttest0- outcome rejects the null, your data show evidence of a panel-wise effect.
That said, I would consider the option to stick with your original unbalanced panel dataset, as missingness is explained by the fact that "...China only invested in particular countries in specific years".

Kind regards,
Carlo
(Stata 19.0)
Muhammad Yousaf Malik

Join Date: Nov 2022

Posts: 7
#5

16 Nov 2022, 13:40

Hi Carlo,
Thanks for your prompt replies. I would be grateful if you could also suggest a way to carry out the Granger causality test on the original dataset. According to some replies in another thread, converting unbalanced panel data into balanced panel data significantly reduces the number of observations. I am hoping for your reply.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17700
#6

16 Nov 2022, 16:59

Muhammad:
1) see -help vargranger-;
2) I'm afraid I do not follow your statement about the reduction in the number of observations due to the conversion of an unbalanced panel into its balanced counterpart.

Kind regards,
Carlo
(Stata 19.0)
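
As a rough sketch of what -help vargranger- describes: -vargranger- is a postestimation command that is run after -var- (or -svar-) and reports Wald tests of Granger causality for each equation. The example below uses Stata's example dataset lutkepohl2, which is a single quarterly time series and not a panel, so it only illustrates the mechanics. How to carry this over to an unbalanced panel is not settled in this thread.

```stata
* Illustrative only: lutkepohl2 is one time series, not panel data
webuse lutkepohl2, clear
tsset qtr

* Fit a VAR with two lags on a restricted sample
var dln_inv dln_inc dln_consump if qtr <= tq(1978q4), lags(1/2)

* Granger causality Wald tests for each equation of the VAR
vargranger
```

Each row of the output tests whether the lags of the excluded variable are jointly zero in the given equation.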
Muhammad Yousaf Malik

Join Date: Nov 2022

Posts: 7
#7

17 Nov 2022, 12:39

Hi Carlo,
Thanks for the reply. From the original dataset (OECD + non-OECD), I take a sample of OECD countries. After running -xtreg-, I also run -xttest0-, where the null is not rejected, which suggests using a pooled regression. I carry out the pooled regression for the OECD countries' panel data following the method shown in this article https://www.projectguru.in/pooled-pa...ression-stata/.

In one of your older posts about pooled regression for panel data, you commented that 'pooled OLS seldom outperforms -xtreg- when you deal with a panel dataset.' (https://statalist.org/forums/forum/g...nel-regression). I have attached the results for both -xtreg- and the pooled regression, and both show similar findings. Which result should I present in the paper? Also, regarding your statement 'pooled OLS seldom outperforms -xtreg- when you deal with a panel dataset.', is there a reference paper that I can cite in the methodology section to explain why we still used -xtreg- even though the LM test pointed to a pooled regression? I am hoping for your reply.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17700
#8

18 Nov 2022, 01:10

Muhammad:
1) I meant that starting off with a pooled OLS when dealing with a panel dataset should not be the researcher's first choice as, if there's evidence that a panel-wise effect exists, -xt- commands do it better;
2) conversely, in your case you are forced to switch to a pooled OLS as your dataset does not support the evidence of a panel-wise effect.
As an aside, it seems that you're using a -wide- format in your -regress-. It's much better to -reshape- and adopt the -long- layout.

Kind regards,
Carlo
(Stata 19.0)
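
A minimal sketch of the -reshape- step mentioned above, using a made-up toy dataset. The country codes, the variable stub ofdi, the years and all values are hypothetical and only illustrate the layout change from -wide- (one column per year) to -long- (one row per country-year).

```stata
* Toy wide-format data: all names and values are hypothetical
clear
input str3 country ofdi2019 ofdi2020 ofdi2021
"AUS" 1.2 .   1.9
"DEU" 0.8 1.1 .
"FRA" .   0.5 0.7
end

* Wide to long: one row per country-year, year taken from the variable suffix
reshape long ofdi, i(country) j(year)

* -xtset- needs a numeric panel identifier, so encode the string variable
encode country, generate(id)
xtset id year
list country year ofdi, sepby(country)
```

Once the data are in -long- layout and -xtset-, both -regress- and -xtreg- can be run on the same variables without referring to year-specific columns.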