For my research I need to pool two datasets for the same country from two different years.
I'm planning to use svy.
But I have an issue with identifying the PSUs. The strata are fine: the number and location of the regions are the same in both datasets. The PSUs, however, differ in both number and identity. In the dataset for year 1 there are, say, 600 PSUs, while in the dataset for year 2 there are 800 PSUs.
How should I define the PSU in the appended dataset? I've read about creating a super-stratum with
egen super_strata = group(year region residence_of_region)
but I don't think this is necessary in my case, since the regions are the same.
For the PSU it would be
egen psu = group(year cluster)
But this gives the combined number of PSUs from both years (600 + 800 = 1,400 in the example above).
I don't fully understand which path I should follow, or whether it is correct to end up with the combined number of PSUs.
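To make the question concrete, here is a rough sketch of the setup I have in mind. The file names survey_year1 and survey_year2 and the weight variable wt are placeholders I made up for illustration; year, region and cluster are the variables mentioned above. This is only what I am considering, not something I know to be the correct design.

```stata
* Placeholder file names; replace with the actual datasets
use survey_year1, clear
append using survey_year2

* Cluster codes are only meaningful within a year, so combine them with year
* to give every PSU in the pooled file its own identifier
egen psu = group(year cluster)

* group() numbers the groups 1, 2, ..., so the maximum is the number of PSUs
summarize psu

* Declare the design with region as the stratum (wt is a placeholder weight)
svyset psu [pweight = wt], strata(region)
svydescribe
```

With the numbers from my example, summarize should report a maximum of 1,400 for psu, which is the combined count I am unsure about.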