
  • How to deal with pooled cross-section data?

    Dear all,

    I wish to examine life satisfaction at the individual level for a specific country.

    The data I will use is a pooled cross-section from the Gallup World Poll covering the years 2006-2020. The poll uses subjective measures of life satisfaction. I will be using the “Life Today” value, which ranges from 0 to 10.

    The demographic breakdowns I have are Aggregate, Gender, Age, and Marital Status, but each one comes with its own separate Life Today value.

    The problem with my data is that the number of observations (the individuals who took the survey, i.e. the N size) differs from year to year. It also varies across demographics.

    My question is:

    1- I ran regress LifeToday and noticed that the number of observations is 15 (the number of years), but I want Stata to know that the N size is the actual number of observations. How? I ran svyset and nothing changed.


    I am new to Stata and econometrics, so please excuse my ignorance (:


    Here is an example of my data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str5 Year float LifeToday int Nsize
    "2020" 6.6 1038
    "2019" 6.6 1041
    "2018" 6.4 1001
    "2017" 6.3  988
    "2016" 6.5  987
    "2015" 6.3 1008
    "2014" 6.3 2008
    "2013" 6.5  990
    "2012" 6.5 2111
    "2011" 6.7 2016
    "2010" 6.3 2030
    "2009" 6.1 2044
    "2008" 6.8 1145
    "2007" 7.3  983
    "2006" 7.1  994
    end



  • #2
    Rawan Alrawaf, the unit of analysis in the dataset you provided using dataex is the year. Do you have a dataset where each row is an individual?



    • #3
      Tom Scott, no, unfortunately I don't have access to the respondent-level dataset.
      Last edited by Rawan Alrawaf; 24 Nov 2020, 20:37.



      • #4
        Well, then you're stuck with analyzing change in LifeToday at the aggregated sample level. You can't do anything with data that you don't have.
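
        A minimal sketch of what an aggregate-level analysis could look like, assuming the year-level data from post #1 is in memory. Using Nsize as an analytic weight is an assumption about what fits here: aweights are meant for rows that are means, with the weight being the number of respondents behind each mean. Stata will still report 15 observations, because there are only 15 yearly means to work with.

        ```stata
        * Assumes the dataex listing from post #1 has been run
        destring Year, replace                     // Year is str5 in the listing; make it numeric

        * Linear time trend in the yearly means, weighting each mean
        * by the number of respondents behind it
        regress LifeToday Year [aweight = Nsize]
        ```

        Frequency weights (fweight) would push the reported N up to the sum of Nsize, but that would treat every respondent as having answered exactly the yearly mean, so the standard errors would likely be far too small.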



        • #5
          Thank you for your response.

          When I ran reg LifeToday at the aggregate sample level, I got F(0,14) = 0.
          Does this mean that I did something wrong, or is it just because I have a small number of observations?
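
          On the F(0,14) = 0 output: a likely explanation, assuming the command was typed with no regressors, is that regress LifeToday fits a constant-only model. With no slopes to test, the model has 0 degrees of freedom, and 15 observations minus 1 constant leaves 14 residual degrees of freedom. A sketch, using the data from post #1:

          ```stata
          * Constant-only model: _cons is just the mean of the 15 yearly values
          regress LifeToday
          summarize LifeToday            // the mean here matches _cons above

          * Adding a regressor gives the F test something to test
          destring Year, replace         // Year is str5 in the dataex listing
          regress LifeToday Year         // F now has (1, 13) degrees of freedom
          ```

          So the output does not by itself indicate an error; it reflects a model with nothing on the right-hand side.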



          • #6
            You said that your data are cross-sectional, but it seems you are working only with a time series. Cross-section means that you work with different units (for example, different countries, or individuals within a country) at a single point or period in time. If you have information about different countries AND at different points in time, then you have cross-section time-series data (panel data).

            Also, it is important to define the unit of analysis. If you are interested in the individual level but only have data aggregated at the country level, it is impossible (or at least very limited without strong assumptions) to say something about the individual level.
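
            A short sketch of how the two data structures would be declared in Stata, assuming the data from post #1. The panel lines are commented out because the variable name country is hypothetical; the posted data have no unit identifier.

            ```stata
            destring Year, replace         // Year is str5 in the dataex listing
            tsset Year                     // one series observed yearly: time-series data

            * With several units observed over time (hypothetical string variable country):
            * encode country, generate(country_id)
            * xtset country_id Year
            ```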
            Last edited by Karel Novak; 25 Nov 2020, 00:55.
