
  • Advice on regression: is panel data needed?

    Hello,

    I am running multiple linear regressions to see whether there is a relationship between a country's happiness index (y) and environmental performance indicators (xj).

    The second stage of my analysis is to see whether the change in an environmental performance indicator predicts happiness better than the absolute value of the indicator itself. The rationale is that seeing your environment improve or deteriorate can affect wellbeing and hope for a better future.

    I am not sure whether I need to use a panel dataset or not. For the environmental performance indices, the dataset reports the change over the last 10 years as a single value, so it is not panel data. Panel data on the indices for the last 10 years can, however, be obtained.

    For the happiness indices I only have data from 2015 onwards. The series does go back further, but I cannot seem to access the earlier years. So, can I use the 2022 happiness indicator in my regression, or do I need to use a panel dataset of happiness alongside a panel of environmental performance? Alternatively, I could take the 2022 score minus the 2015 score to get the change over the last seven years.

    It might be useful to know that the World Happiness Report, where my data come from, states that happiness is highly robust over time and was not significantly affected by Covid, which is why I am tempted to use 2022.


    Thanks in advance for any further questions or advice!

    Best,
    Nicole

  • #2
    Nicole:
    welcome to this forum.
    Let's try to make things simpler:
    1) if the same sample of units is repeatedly measured at equally spaced time intervals (say, once per year along a 10-year time horizon) on the same set of variables, you have a panel dataset;
    2) if different units are repeatedly measured at equally spaced time intervals on the same set of variables, you have a survey or a repeated cross-sectional study.
    Kind regards,
    Carlo
    (Stata 19.0)
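
    A minimal sketch of how one might check which of the two cases applies in Stata, assuming the data are in long format with one row per country-year. The variable names (country, country_id, year) are hypothetical and not taken from the thread.

    ```stata
    * Assumption: -country- is a string variable and -year- is numeric
    encode country, generate(country_id)   // numeric panel identifier built from the string name
    xtset country_id year                  // declare the panel; errors if a country-year pair is repeated
    xtdescribe                             // shows which countries are observed in which years
    ```

    If -xtset- runs without error and -xtdescribe- shows the same countries recurring across years, the data have the structure described in 1).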

    • #3
      Thanks Carlo! In that case, the Environmental Performance Indicators are panel, but the happiness indices are not. They are based on the Gallup surveys, which sample different people from each country every year.

      Does this mean I am okay to run a regression using the 2022 happiness index as my Y and the 10-year change in the environmental performance indicators as my X?

      • #4
        Originally posted by Carlo Lazzaro
        Nicole:
        welcome to this forum.
        Let's try to make things simpler:
        1) if the same sample of units is repeatedly measured at equally spaced time intervals (say, once per year along a 10-year time horizon) on the same set of variables, you have a panel dataset;
        2) if different units are repeatedly measured at equally spaced time intervals on the same set of variables, you have a survey or a repeated cross-sectional study.

        Actually, on reflection I think happiness data is panel. It comes from a sample, but units are the same - the countries are consistently measured and reported. So, I think I need to construct panel data that looks a bit like this (example only):
        Country    Year  Happiness score  Deforestation  Air pollution
        Australia  2015  7                6              3
        Australia  2016  6                8              0
        Australia  2017  9                7              0
        Colombia   2015  8                9              4
        Colombia   2016  8                5              7

        Then I would run multiple linear regressions with different controls, e.g. GDP, country FE, time FE, etc., being sure to control for baseline EPI by including the 2022 overall environmental performance index (across all indicators) for each country. Does this sound about right?

        Nicole

        • #5
          Nicole:
          if -country- is your -panelid-, you cannot include it among the predictors, whereas you can include -i.year-.
          I'd start off with -xtreg, fe- (as your regressand seems continuous) and then compare the -fe- results with the -re- results.
          As usual, the -fe- estimator wipes out time-invariant variables.
          I'm not clear about the role of -2022 overall environment- if you do not have data on the same regressor for the previous years.
          Kind regards,
          Carlo
          (Stata 19.0)
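
          The workflow suggested above could be sketched roughly as follows. This is only an illustration: the variable names (happiness, deforestation, air_pollution, gdp, country_id) are hypothetical, the data are assumed to be already -xtset- on country_id and year, and the Hausman test is one common way to compare the two estimators rather than something prescribed in this thread.

          ```stata
          * Assumption: data are in long format and already declared with -xtset country_id year-
          xtreg happiness deforestation air_pollution gdp i.year, fe   // within (fixed-effects) estimator
          estimates store fe
          xtreg happiness deforestation air_pollution gdp i.year, re   // random-effects estimator
          estimates store re
          hausman fe re   // tests whether the fe and re coefficients differ systematically
          ```

          Note that country dummies are not entered explicitly: the -fe- option already absorbs them. A regressor that takes a single value per country, such as a 2022-only index, would be dropped from the -fe- model as time-invariant.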
