  • quasi-panel made out of cross-sectional data

    Hi all,
    Does anybody know how to create a quasi-panel dataset out of cross-sectional data? I have surveys from several years, but the respondents are not the same people across years, so I cannot simply merge the data. I could build a quasi-panel by aggregating the data at the regional level, and I would like some hints on how to do that. My data consist of separate surveys from about 6 years, with the same variables but different individuals.
    Thanks!

  • #2
    Hey Kasia Kiki. What do you mean by quasi-panel? I've never heard this term before. Also, can you show an example of your current data so I have something to work with?


    • #3
      Kasia, what you describe is what I would call a "synthetic panel". First, use -append- to combine the datasets (make sure the same variables have the same names across datasets). Then use -collapse- to generate the synthetic panel from the individual-level data. Some example code is below.

      Code:
      * Combine datasets
          use data1, clear
          append using data2 data3 data4...
      
      * From individual pooled cross-sectional data to synthetic panel
          collapse (mean) varlist (rawsum) weight [pw=weight], by(regionid year)
      In the collapse command, "varlist" stands for the variables that you would like to aggregate at the region level, and "weight" stands for the original individual sampling weights. If individuals are equally weighted, you can generate weight = 1 before collapsing. The synthetic panel needs this weight variable because it records the total individual weight within each synthetic group (here, each region-year).
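To make the two steps concrete, here is a minimal sketch on made-up data. The variable income, the tempfile name wave2015, and all values are hypothetical placeholders rather than anything from the actual surveys; only regionid, year, and weight follow the naming used above.

```stata
* Hypothetical toy data: two survey waves with different respondents
clear
input regionid year income weight
1 2015 100 1
1 2015 200 3
2 2015 300 1
end
tempfile wave2015
save `wave2015'

clear
input regionid year income weight
1 2016 150 2
2 2016 250 1
2 2016 350 1
end

* Stack the two waves into one pooled cross-section
append using `wave2015'

* Aggregate to one row per region-year: weighted mean of income,
* plus the unweighted sum of the weights in each cell
collapse (mean) income (rawsum) weight [pw=weight], by(regionid year)

* Each region-year is now a unique row, so it can be declared as a panel
xtset regionid year
list, sepby(regionid)
```

In this toy example, region 1 in 2015 gets a weighted mean income of (100*1 + 200*3)/4 = 175 and a total weight of 4. The collapsed weight could then serve as an analytic weight in later regressions, although whether that is appropriate depends on the model.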
      Last edited by Fei Wang; 04 Nov 2021, 18:24.


      • #4
        Thank you a lot! It worked. However, I do not know if the "mean" approach is correct, because my dependent variable is categorical. After collapsing at the geographical level, the dependent variable took values such as 0.54 and 0.45, which do not make sense for a categorical variable, so I think a different approach is needed.
        Last edited by Kasia Kiki; 05 Nov 2021, 09:48.


        • #5
          Originally posted by Kasia Kiki
          Thank you a lot! It worked. However, I do not know if the "mean" approach is correct, because my dependent variable is categorical.
          Right, averaging a categorical variable may not make sense (except for ordered scales such as happiness indicators). So I would suggest that, in the original data, you generate a dummy for each category of the variable, like

          Code:
          * gen() creates one dummy per category: var1, var2, ...
          tab var, gen(var)
          When collapsing, calculate the mean of each dummy, and you will get the fraction of individuals in each category within a region-year. That is a meaningful quantity for further analysis.
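As a rough illustration of the dummy-then-collapse idea, the sketch below uses made-up data. The variable status, the stub status_, and all values are hypothetical; regionid, year, and weight follow the naming used earlier in the thread.

```stata
* Hypothetical toy data: a three-category outcome for five respondents
clear
input regionid year status weight
1 2015 1 1
1 2015 2 1
1 2015 2 1
1 2015 3 1
2 2015 1 1
end

* One 0/1 indicator per category: status_1, status_2, status_3
tabulate status, generate(status_)

* The weighted mean of each indicator is the share of the region-year
* that falls in that category
collapse (mean) status_* (rawsum) weight [pw=weight], by(regionid year)
list
```

Here region 1 ends up with shares 0.25, 0.5, and 0.25 across the three categories. The shares in each region-year sum to 1, so one category usually has to be left out as the reference if they are all used as regressors.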
