  • Can multiple cross sectional data be treated as panel data?

    I have GSS data from multiple waves (i.e., years) for the same regions in the US.

    When I look at the dataset, the ID numbers restart from 1 in every year, even though different people are interviewed each time, so I want to renumber the IDs from 1 in the first year of the data I have, increasing, to the last (so that they do not restart in the next year). How do I go about doing this? Is there a way of coding that?

    Also if the same questions are asked for the same regions but to different people, over time, can this be treated as panel data?

  • #2
    I want to renumber the IDs from 1 in the first year of the data I have, increasing, to the last.
    Code:
    sort year id
    egen id_uniq = group(year id)
    Note that if you just pool the data it doesn't really matter whether individuals have a unique id or not.
    Also if the same questions are asked for the same regions but to different people, over time, can this be treated as panel data?
    No, this is not panel data in the strict sense, rather repeated cross-sectional data. This doesn't mean that you can't look at changes over time. You can control for time fixed effects, analyze how things evolved over time for the whole sample, or sub-samples (men, women, regions) etc. What you cannot do is to analyze change at an individual level.
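
    As a rough sketch of what controlling for time fixed effects on the pooled data could look like (the variable names outcome, female and year are hypothetical placeholders, not actual GSS variable names):

    ```stata
    * Hypothetical variable names: outcome, female, year
    * Pooled regression on the stacked cross sections with year dummies
    regress outcome i.female i.year, vce(robust)

    * Allow the difference between men and women to vary by year
    regress outcome i.female##i.year, vce(robust)
    ```

    Neither regression uses the individual id, which is why a unique id is not needed for pooled analyses of this kind.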


    • #3
      Thank you, Wouter Wakker! This was very helpful


      • #4
        In addition to Wouter's helpful comments: You can turn pooled cross sections into a pseudo-panel by aggregating the individual data. Often this is done by region. Then the panel data set is at the regional level.
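
        A minimal sketch of that aggregation, assuming a numeric region identifier and hypothetical variables outcome and educ (if region is a string, it would first need something like encode region, generate(region_id)):

        ```stata
        * Hypothetical variable names: region (numeric), year, outcome, educ
        * Keep the cell size so that thinly populated cells can be spotted
        gen byte one = 1
        collapse (mean) outcome educ (sum) n_obs = one, by(region year)

        * One row per region-year, so the data can now be declared a panel
        xtset region year
        xtreg outcome educ i.year, fe
        ```

        The unit of analysis is now the region, so the estimates describe changes in regional averages, not changes for individuals.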


        • #5
          Dear all,
          I am a PhD student trying to understand the nature of female labor force participation in India. I am using World Values Survey data (waves 2 to 6). I have pooled all the cross-sectional data into one dataset, but Stata does not accept it as panel data. When I run 'tsset', it returns errors such as 'repeated time values in sample' and 'repeated time values within panel'. Can anybody help me?


          • #6
            Is your data panel data or repeated cross-sectional data? They should not be treated in the same way. If you have repeated cross sections there is no need to xtset or tsset your data in the first place.

            If you do have panel data (observations for the same individuals/firms/whatever at different times), Stata is telling you exactly what the problem is. You have multiple observations for the same time period within panels, but xtset and tsset require that you have only one observation per time point within panels. You have to find out why this is the case. Maybe you have duplicate observations which have to be dropped, or maybe your time variable is not set up in the right way.
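
            If the data really are panel data, one possible way to track down the repeated time values is sketched below. The variable names id and year are assumptions; substitute your own panel and time identifiers.

            ```stata
            * Hypothetical variable names: id (panel identifier), year (time variable)
            duplicates report id year
            duplicates tag id year, generate(dup)
            list id year if dup > 0, sepby(id)

            * Only if the flagged rows are identical on all variables:
            duplicates drop

            * Errors out unless id and year jointly identify each observation
            isid id year
            xtset id year
            ```

            If the flagged rows differ on other variables, dropping them is not appropriate, and the definition of the identifier or the time variable needs another look.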

            This problem comes up quite often on Statalist, so there is already a wealth of information on the forum. This is also a good place to start:
            https://www.stata.com/support/faqs/d...d-time-values/

            Otherwise, a data example (using dataex), would help us to tell you what the problem is.


            • #7
              Prafulla: Wouter's opening question is the key one. The World Values Survey is a series of repeated cross-section surveys for a large number of countries. It is not a survey in which the same respondents are tracked over time and reinterviewed. (The WVS uses the term "wave" to label the different survey rounds; perhaps that is what is confusing you.) Hence -xtset- and -tsset- are not applicable. (There is a literature on pseudo-panel data, derived from consistently-defined cohorts in repeated cross-sections, but that is something quite different.)
