Correlation with Panel Data

statauseroperations

Join Date: Oct 2014

Posts: 5
#1


14 Oct 2014, 01:04

Hi, I'm new to Stata and to working with panel data. I want to report a descriptive statistics table that includes the correlations between variables. I'm not sure how to estimate the correlation between two variables in a panel data set. I think it must be different from the "corr" command that is used for cross-sectional data.

Can anybody help me?

Thank you!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#2

14 Oct 2014, 01:20

statauseroperations (please, as per FAQ, notice the preference for full real names in this forum):
do you mean correlation between variables or between coefficients?
If the latter is what you're after:

Code:

estat vce, corr

after panel data regression will do the trick.
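For illustration only, here is a minimal sketch of that sequence. The dataset (nlswork, loaded with -webuse-) and the regressors are my assumptions for the example, not part of the original question:

```stata
* Sketch under assumptions: nlswork example data, arbitrary regressors
webuse nlswork, clear
xtset idcode year

* Any panel regression will do; fixed effects is used here as an example
xtreg ln_wage tenure age, fe

* Correlation matrix of the estimated coefficients (not of the variables)
estat vce, corr
```

Note that this reports how the coefficient estimates covary, which is a different thing from the correlation between the variables themselves.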

Kind regards,
Carlo

statauseroperations

Join Date: Oct 2014

Posts: 5
#3

14 Oct 2014, 09:03

Hi Carlo,

I meant correlation between variables. Thank you!
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#4

14 Oct 2014, 11:08

statauseroperations (please, as per FAQ, notice the preference for full real names in this forum):
admittedly, I have never come across this issue before. Hence, please take what follows as a tentative answer:

Code:

collapse (sum) var1 var2, by(panel_id)
corr var1 var2

If your data call for a non-parametric measure (i.e., a rank correlation coefficient), you can take a look at -help spearman- or -help ktau-.

Kind regards,
Carlo

statauseroperations

Join Date: Oct 2014

Posts: 5
#5

14 Oct 2014, 13:19

Thank you very much, Carlo!

Here is the question: imagine we have a panel data set that contains annual assets (var1) and annual performance (var2). We simply need to find the correlation between var1 and var2. If we collapse these variables, I think we'll lose information, because collapsing converts our panel data into a cross-section.

I also briefly read about Spearman's rank correlation coefficient. It seems to be fine for cases where there are no repeated data values, e.g., a cross-sectional format; is that right?

Best regards,
P.S. I understand your comment regarding my full real name in this forum.
Nick Cox

Join Date: Mar 2014

Posts: 35698
#6

14 Oct 2014, 13:28

As you have been asked repeatedly to change your identifier, please do so. Some of us care about this as good practice, and won't support members who differ.

That said, nothing stops you calculating a correlation for all the values of two variables in a dataset, including several different panels. It's hard to know how to interpret that unless you keep track also of whether the panels are similar or different in their correlation properties. Even pooling panels with similar correlation can be messy if the panels differ in their means.

Precisely nothing makes Spearman correlation either more or less appropriate for panel data than Pearson correlation.
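
As a rough sketch of what "keeping track" could look like: compute the pooled correlation, then one correlation per panel, and look at how the per-panel values are spread. The dataset (nlswork), the choice of variables, and the cutoff of 5 observations per panel are assumptions made for illustration only:

```stata
* Sketch under assumptions: nlswork example data; idcode identifies panels
webuse nlswork, clear

* Pooled correlation over all panels and years
corr ln_wage tenure

* One correlation per panel; the data in memory are replaced by the results
statsby rho=r(rho) n=r(N), by(idcode) clear: corr ln_wage tenure

* Distribution of the per-panel correlations (5 is an arbitrary minimum)
summarize rho if n >= 5, detail
```

If the per-panel correlations are widely scattered, or differ in sign, a single pooled figure will be hard to interpret.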
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17709
#7

15 Oct 2014, 00:17

statauseroperations (please, as per FAQ and after three kind reminders that you say you understood, it's time to re-register with your full real name. Just click on the Contact us button at the bottom of the page and follow the instructions):
after some practice with a Stata dataset, it seems that if you're interested in two variables only, -collapse- is probably not needed:

Code:

use "http://www.stata-press.com/data/r13/nlswork.dta", clear
corr ln_wage union
display r(rho)^2
xtreg ln_wage union, re
* same result with the fixed-effect specification
display e(r2_o)

As you will see -display r(rho)^2- and -display e(r2_o)- give the same result.

Kind regards,
Carlo

ben earnhart

Join Date: May 2014

Posts: 1027
#8

15 Oct 2014, 01:21

Basic idea is that when running regression models, covariances among the same variable/case across time/space will mess up estimates. But simple bi-variate correlations are just the total covariance, standardized. So for this purpose, it doesn't matter how much is due to time/space issues, and how much is due to substantive issues. It only gets chopped up into separate pieces with more complex models.
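
To make "total covariance, standardized" concrete, here is a small sketch that rebuilds the correlation from the covariance and the two variances. The dataset (nlswork) and the variables are assumptions chosen for illustration:

```stata
* Sketch under assumptions: nlswork example data, arbitrary variable pair
webuse nlswork, clear

* Covariance and variances over all observations, ignoring panel structure
corr ln_wage tenure, covariance
display r(cov_12) / sqrt(r(Var_1) * r(Var_2))

* The correlation reported directly; it should match the ratio above
corr ln_wage tenure
display r(rho)
```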
Changhyeon Song

Join Date: Dec 2017

Posts: 5
#9

22 Jan 2018, 05:40

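You can use the -xtsum- command.

It reports the mean, SD, min, max and number of observations.

- The mean is the same in all three rows (xtsum reports it once, in the overall row).

- The standard deviation differs among overall/between/within. The suggestion in #4 (Carlo) corresponds to the 'between' calculation.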
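
A minimal sketch of this, assuming the nlswork example data (the dataset and variables are my choice, not the original poster's). The second part uses -collapse (mean)-, a variant of the -collapse (sum)- shown in #4, to get a 'between' correlation from panel means:

```stata
* Sketch under assumptions: nlswork example data, arbitrary variable pair
webuse nlswork, clear
xtset idcode year

* Overall, between and within decomposition of each variable
xtsum ln_wage tenure

* 'Between' correlation: correlate the panel means (one row per panel)
preserve
collapse (mean) ln_wage tenure, by(idcode)
corr ln_wage tenure
restore
```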
Zues

Join Date: May 2014

Posts: 12
#10

31 Oct 2021, 10:36

Generally (in most cases) what you want is to study the within-panel variation of the variables over time. This is why you use a fixed-effects estimator. If this is the case, then you would want a within-panel correlation coefficient.

To do this, you just demean the variables within panels to put them all on the same footing.
There are two ways to do this. The first is with coding:

1) Suppose you have three variables x1, x2, x3 and the panel variable is region. Then type:

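* bysort sorts by region first; plain "by" fails if the data are not sorted
bysort region: egen m1 = mean(x1)
bysort region: egen m2 = mean(x2)
bysort region: egen m3 = mean(x3)
* deviations from the panel means
gen dx1 = x1 - m1
gen dx2 = x2 - m2
gen dx3 = x3 - m3
reg dx1 dx2 dx3
corr dx1 dx2 dx3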

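2) You can simply use the xtdata command (the data must be xtset first), but save your dataset under a different name beforehand, as the command replaces the data in memory with the transformed variables. Then run corr on the result:
xtdata x1 x2 x3, fe
corr x1 x2 x3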
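
For what it's worth, here is a sketch that runs both routes on one dataset so the two within-panel correlations can be compared. The dataset (nlswork), the variables, and the new variable names are assumptions for illustration. Incomplete observations are dropped first so that both routes work from the same sample:

```stata
* Sketch under assumptions: nlswork example data; names m_w, d_w etc. are made up
webuse nlswork, clear
xtset idcode year
keep if !missing(ln_wage, tenure)

* Route 1: demean by hand within each panel
bysort idcode: egen m_w = mean(ln_wage)
bysort idcode: egen m_t = mean(tenure)
generate d_w = ln_wage - m_w
generate d_t = tenure - m_t
corr d_w d_t

* Route 2: let xtdata do the within transformation, then restore the data
preserve
xtdata ln_wage tenure, fe clear
corr ln_wage tenure
restore
```

The two correlations should agree. Using -preserve- and -restore- is an alternative to saving the dataset under a different name before calling xtdata.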