quasi-panel made out of cross-sectional data

Kasia Kiki

Join Date: Nov 2021

Posts: 5
#1

04 Nov 2021, 16:49

Hi all,
Does anybody know how to create quasi-panel data out of cross-sectional data? I have surveys for several years, but the respondents are not the same people from year to year, so I cannot simply merge the data. I think I can create a quasi-panel by aggregating the data at the regional level. I would appreciate some hints on how to do this (my data are a few separate surveys spanning about 6 years, with the same variables but different individuals).
Thanks!
Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#2

04 Nov 2021, 17:46

Hey Kasia Kiki. What do you mean by quasi-panel? I've never heard this term before. Also, can you show an example of your current data so I have something to work with?
Fei Wang

Join Date: Oct 2021

Posts: 726
#3

04 Nov 2021, 18:19

Kasia, what you describe is what I would call a "synthetic panel". First, use -append- to combine the datasets (make sure the same variables have the same names across datasets). Then use -collapse- to generate the synthetic panel from the individual data. Some example code is below.

Code:

* Combine datasets
use data1, clear
append using data2 data3 data4...

* From individual pooled cross-sectional data to synthetic panel
collapse (mean) varlist (rawsum) weight [pw=weight], by(regionid year)

In the collapse step, "varlist" is the list of variables that you'd like to aggregate at the region level, and "weight" is the original individual sampling weight. If individuals are equally weighted, you can generate weight = 1 before collapsing. The synthetic panel needs a weight variable that records the total individual weight within each synthetic group.
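To make the collapse step concrete, here is a minimal sketch on made-up data. The variable "income" and all of the values are hypothetical, chosen only so the weighted means are easy to check by hand. Note that -clear- drops whatever data are in memory.

```stata
* Toy pooled cross-section: two regions, two survey years (made-up values).
* "clear" drops the data currently in memory.
clear
input regionid year income weight
1 2019 100 1
1 2019 200 3
2 2019 300 1
2 2019 500 1
1 2020 150 2
1 2020 250 2
2 2020 400 1
2 2020 600 3
end

* Weighted mean of income, plus the raw (unweighted) sum of the weights,
* for each region-year cell
collapse (mean) income (rawsum) weight [pw=weight], by(regionid year)

* One row per region-year remains, so the data can be declared a panel
xtset regionid year
list, sepby(regionid)
```

For region 1 in 2019 the weighted mean is (100*1 + 200*3)/4 = 175, and the collapsed weight is 4, the total weight of the individuals in that cell.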

Last edited by Fei Wang; 04 Nov 2021, 18:24.
Kasia Kiki

Join Date: Nov 2021

Posts: 5
#4

05 Nov 2021, 09:15

Thank you a lot! It worked. However, I do not know if the "mean" approach is correct, because my dependent variable is categorical, so I am not sure it makes sense here. I think another approach is needed: after collapsing to the geographical level, the dependent variable took values such as 0.54 and 0.45, which doesn't make sense.

Last edited by Kasia Kiki; 05 Nov 2021, 09:48.
Fei Wang

Join Date: Oct 2021

Posts: 726
#5

05 Nov 2021, 09:51

Originally posted by Kasia Kiki

Thank you a lot! It worked. However, I do not know if the "mean" approach is correct, because my dependent variable is categorical, so I am not sure it makes sense here.

Right, averaging a categorical variable may not make sense (except for things like happiness indicators). I would suggest that, in the original data, you generate dummies for the categorical variable, like this:

Code:

tabulate var, generate(var_)

While collapsing, calculate the mean of each dummy, and you will get the fraction of individuals in each category within a region-year. That's something meaningful for further analysis.
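As a rough illustration of the dummy approach, the sketch below uses made-up data with a hypothetical three-category variable "status" and equal weights of 1. It assumes -tabulate- with the generate() option, which creates one 0/1 indicator per category, numbered in the sorted order of the category values.

```stata
* Toy data: one survey year, two regions, a 3-category outcome (made-up).
* "clear" drops the data currently in memory.
clear
input regionid year status weight
1 2019 1 1
1 2019 2 1
1 2019 2 1
1 2019 3 1
2 2019 1 1
2 2019 1 1
2 2019 3 1
2 2019 3 1
end

* One 0/1 indicator per category: status_1, status_2, status_3
tabulate status, generate(status_)

* The mean of each indicator is the share of that category in the cell
collapse (mean) status_1 status_2 status_3 (rawsum) weight [pw=weight], by(regionid year)

list
```

In region 1 the shares come out as 0.25, 0.5 and 0.25; in region 2 they are 0.5, 0 and 0.5. Within each region-year the shares sum to 1, so one category usually has to be left out if all of them are used as regressors.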