
  • How to make pseudo panel data from time series of independent cross section data?

    Dear all, I am trying to work out how to construct pseudo-panel data from a time series of independent cross sections. I have three waves of micro data from the Multiple Indicator Cluster Survey (MICS). For the MPI (multidimensional poverty index), the data cover three dimensions (education, health and living standard) with ten indicators (years of schooling, child school attendance, child mortality, nutrition, electricity, improved sanitation, safe drinking water, flooring, cooking fuel and assets). Could you kindly help me construct pseudo-panel data for these variables?
    I shall be thankful to you.

  • #2

    A panel is nothing more than pooled cross sections. You should first check whether individuals in the first wave are present in the second and third wave. Inevitably, you lose cases even in real panels (referred to as "attrition"). Secondly, you should confirm that the variables you are interested in are in all 3 waves. Once you confirm that there is a sizable number of the same individuals in all 3 waves and the variables are OK, you pool the data (put all observations into 1 dataset). Assuming that the cross sections relate to consecutive calendar years, you call observations relating to the first wave "year 1", the second, "year 2", etc.
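The pooling and overlap check described in this reply can be sketched roughly as follows. This is only an illustration in plain Python, not Stata syntax: the toy records, the identifiers, and the function names `pool_waves` and `ids_in_every_wave` are all hypothetical, and it assumes that identifiers are comparable across waves, which may well not hold for independent cross sections such as MICS.

```python
# Hypothetical toy data: three waves, each a list of (person_id, years_of_schooling).
# Assumption: person_id means the same thing in every wave.
wave1 = [(101, 5), (102, 8), (103, 0)]
wave2 = [(101, 6), (104, 9), (105, 3)]
wave3 = [(101, 6), (106, 2), (102, 8)]


def pool_waves(waves):
    """Stack the cross sections into one list, tagging each row with its wave number."""
    pooled = []
    for wave_number, rows in enumerate(waves, start=1):
        for person_id, schooling in rows:
            pooled.append({"id": person_id, "wave": wave_number, "schooling": schooling})
    return pooled


def ids_in_every_wave(waves):
    """Return the set of identifiers that appear in all waves."""
    id_sets = [{person_id for person_id, _ in rows} for rows in waves]
    return set.intersection(*id_sets)


pooled = pool_waves([wave1, wave2, wave3])
common_ids = ids_in_every_wave([wave1, wave2, wave3])
print(len(pooled), sorted(common_ids))
```

If the set of common identifiers is empty or very small, the pooled file is a set of repeated cross sections rather than a panel, and a cohort-based construction is the usual alternative.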





    • #3
      Tabish: I recommend that you read some of the specialist literature on pseudo-panel data, as it provides a lot of information about how to proceed. (Google for this.) Moreover, it's difficult to answer your question in any detail, because you have omitted crucial information such as (a) how many years apart the MICS surveys are, and especially (b) what your unit of analysis is intended to be (child? parent?). The standard approach to forming pseudo-panel data involves first forming groups of observations, each characterised by, e.g., a combination of age or age group and sex (exogenous variables), and then using the cell means as the observations in the pseudo-panel. Suppose you have MICS surveys one year apart: then men who are aged 25 in the first survey year will be aged 26 in the second wave and 27 in the third. Note that cell size raises important issues when doing this.
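The cohort construction described in this reply can be sketched as below. It is a minimal illustration in plain Python, not Stata syntax, and everything in it is an assumption made for the example: the toy records, the choice of birth year and sex as the grouping variables, years of schooling as the outcome, and the function name `cohort_cell_means`. Birth year is computed as survey year minus age, so a man aged 25 in the first wave and one aged 26 a year later fall into the same cohort.

```python
from collections import defaultdict

# Hypothetical toy records: (survey_year, age, sex, years_of_schooling).
records = [
    (2000, 25, "m", 6), (2000, 25, "m", 8), (2000, 30, "f", 4),
    (2001, 26, "m", 7), (2001, 26, "m", 9), (2001, 31, "f", 5),
    (2002, 27, "m", 8), (2002, 32, "f", 5), (2002, 32, "f", 7),
]


def cohort_cell_means(rows):
    """Average the outcome within each (birth year, sex, survey year) cell."""
    cells = defaultdict(list)
    for year, age, sex, schooling in rows:
        birth_year = year - age  # constant over time, so it identifies the cohort
        cells[(birth_year, sex, year)].append(schooling)
    # Keep the cell size next to the mean: small cells give noisy means.
    return {
        key: {"mean": sum(values) / len(values), "n": len(values)}
        for key, values in sorted(cells.items())
    }


pseudo_panel = cohort_cell_means(records)
for (birth_year, sex, year), cell in pseudo_panel.items():
    print(birth_year, sex, year, cell["mean"], cell["n"])
```

Each (birth year, sex) pair then plays the role of the panel unit and the survey year plays the role of time. Keeping the cell size alongside each mean makes it easy to drop or down-weight thin cells.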



      • #4
        Yes, you are quite right, Stephen. In this case you are obviously not following the same individuals over time, so you need to identify individuals who share some common characteristic (e.g., age) and arrange them into cohorts, after which the cohort averages can be treated as the observations in the pseudo panel.



        • #5
          Hi everybody, I have a data set covering 17 years of an annual survey, which gives more than 1 million observations. I've read Deaton and Verbeek, but I can't manage to run a pseudo-panel in Stata.

          My second problem: I want to regress duration of unemployment on distance from home to the Central Business District, but I can't understand how to do this with a pseudo-panel, which works with cohort averages. How can I interpret an average distance?

          Thank You.



          • #6
            Pseudo panels for that purpose are tricky. I would be really reluctant to answer the question you're posing without real panel data, and here's why. There's a huge amount of heterogeneity in earnings ability (human capital, and anything correlated with earnings) that your pseudo-panel cells will not capture. When you regress unemployment duration on distance, heterogeneous treatment effects will emerge and likely bias the estimates.

            Panel data are ideal for this because they allow person fixed effects, but a cross section should be reasonable; my impression is that it is better to err on that side than to risk the bias from the pseudo-panel approach. If you're trying to study local labor markets, Alexander Mas has a good handbook chapter.
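One way to see both the heterogeneity concern and what an "average distance" means is the textbook cohort-mean formulation associated with Deaton's pseudo-panel approach. The notation below is an assumption introduced for illustration and does not come from this thread: $y_{it}$ is unemployment duration, $x_{it}$ is distance to the Central Business District, $\alpha_i$ is unobserved individual heterogeneity, and bars denote averages over the sampled members of cohort $c$ in year $t$.

```latex
\begin{align}
y_{it} &= \beta x_{it} + \alpha_i + u_{it} \\
\bar{y}_{ct} &= \beta \bar{x}_{ct} + \bar{\alpha}_{ct} + \bar{u}_{ct}
\end{align}
```

In the second equation, $\beta$ is identified only from how average duration moves with average distance across cohorts and years; all variation within a cell is averaged away. Because a different sample is drawn each year, $\bar{\alpha}_{ct}$ is not constant over $t$, so it cannot simply be absorbed by cohort fixed effects unless cells are large.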



            • #7
              Thank you, Makridis. I think I'll proceed with the cross section. I have a few instrumental variables that can help me diminish this bias, and working with a pseudo panel should reduce it. I wanted to use the pseudo panel because I have incomplete spells of unemployment duration, and I found that I could handle them better by working with cohorts (age*period).

              PS: Which handbook are you talking about? I've searched a bit but I didn't find it.



              • #8
                My fault about the handbook -- I thought of the wrong author; it's actually Enrico Moretti: http://eml.berkeley.edu/~moretti/handbook.pdf

                My understanding of pseudo panels is that they are only useful for tracking trends, not for doing serious estimation. Instrumenting in pseudo panels is especially dangerous, because correlation between the IV and the heterogeneity within the cells can bias estimates in unknown directions. I see why you want to use it, though. Why not the Panel Study of Income Dynamics? You could obtain county-level identifiers for the PSID panel and match them with your distance data.



                • #9
                  Using PSID data would be amazing if I weren't Brazilian! I'll take a look at some papers tomorrow (it's 1:40am here) and think about something. Thanks a lot.



                  • #10
                    How do I make pseudo-panel data? What are the commands to make it? Kindly share them.
