
  • Interpretation: Difference-in-differences on repeated cross-section data with an ordinal outcome variable

    Hey everyone, I'm currently working on a project analyzing the health effects of the financial crisis. I'm using repeated cross sections from the European Social Survey (ESS), which is conducted every second year. The data are one edition from 2006 and one from 2010, i.e. before and "after" the crisis. The dependent variable is subjective health, an ordinal variable coded 1 to 5 in the following order: "very good", "good", "fair", "bad" and "very bad". Control variables are gender, years of education (harmonized across the European countries), age of respondent, and whether the respondent was born in the country. I'm using Germany and Austria as the control group, and Spain, Ireland and Greece as the treatment group. The ESS dataset includes different weights to adjust for differences in population size etc.

    My main focus is to compare the subjective health of respondents in the treatment group with that of respondents in the control group, and to compare the subjective health of different socioeconomic groups across the treated and control groups.


    A description of the variables used follows:

    [Attached screenshot: Skjermbilde 2019-01-29 kl. 12.21.10.png]




    So far I've run a difference in difference analysis using the following approach:

    1: Generated a dummy variable for the treatment vs. control group and a dummy for the period before vs. after the crisis
    Code:
    gen treated = (cntry == "ES") | (cntry == "IE") | (cntry == "GR")
    gen ess_3_5 = 0 if essround == 3
    replace ess_3_5 = 1 if essround == 5
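    A note on the step above: as written, treated equals 0 for every country other than ES, IE and GR, so the control group is only Germany and Austria if the data have already been restricted to these five countries. A minimal sketch of that restriction and a quick check, assuming cntry holds the two-letter ESS country codes (DE and AT for Germany and Austria are my assumption here):

```stata
* Keep only the five countries used in the analysis (country codes assumed)
keep if inlist(cntry, "DE", "AT", "ES", "IE", "GR")

* Check that the dummies line up with country and survey round
tabulate cntry treated, missing
tabulate essround ess_3_5, missing
```

    The second tabulate should show ess_3_5 as missing for any round other than 3 and 5, so those observations drop out of the estimation.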
    2: Diff in diff

    Code:
    diff health, t(treated) p(ess_3_5) cov ( dweight pweight brncntr gndr agea edulvla)
    Yielding the following Stata output:

    [Attached screenshot: Skjermbilde 2019-01-29 kl. 12.34.52.png]
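    As far as I understand, diff fits an OLS regression with a treated-by-period interaction, so the call above should correspond to something like the sketch below. This is my assumption about what the command does, not output I am reporting; it reuses the same covariate list as the diff call:

```stata
* Sketch: diff-in-diff written as an OLS regression with an interaction term.
* Same outcome and covariate list as in the diff call above.
regress health i.treated##i.ess_3_5 dweight pweight brncntr gndr agea edulvla

* The diff-in-diff estimate is the coefficient on the interaction term
lincom 1.treated#1.ess_3_5
```

    Because health is coded 1 = "very good" to 5 = "very bad", a positive interaction coefficient here would mean worse self-reported health in the treated countries after the crisis, relative to the control countries.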


    I have also run a second specification with recoded variables, collapsing the health variable into two categories and grouping the age of respondent into several age brackets:

    Code:
    gen health_binary=0 if (health==4) | (health==5)
    replace health_binary=1 if (health==1)| (health==2) | (health==3)
    Code:
    gen agea_grouped=0 if agea<18
    replace agea_grouped=1 if agea>=18 & agea<=24
    replace agea_grouped=2 if agea>24 & agea<=34
    replace agea_grouped=3 if agea>34 & agea<=44
    replace agea_grouped=4 if agea>44 & agea<=54
    replace agea_grouped=5 if agea>54 & agea<=64
    replace agea_grouped=6 if agea>64 & agea<=74
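    * Stata treats missing values as larger than any number, so exclude them explicitly
    replace agea_grouped=7 if agea>=75 & !missing(agea)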
    Running the diff command in Stata
    Code:
    diff health_binary, t(treated) p(ess_3_5) cov ( dweight pweight brncntr gndr agea_grouped edulvla)
    with the new variables gave the following Stata output:

    [Attached screenshot: Skjermbilde 2019-01-29 kl. 12.45.57.png]


    So my question is: how should I interpret these results? I have also tried to do some research on combining an ordered probit model with diff-in-diff, but haven't been able to find anything on this forum that fits my case.
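    To make the ordered probit idea concrete, this is roughly what I have in mind. It is only a sketch that I have not validated, assuming the same dummies as above and leaving the weight variables out of the covariate list:

```stata
* Sketch: ordered probit with the same diff-in-diff interaction
oprobit health i.treated##i.ess_3_5 brncntr gndr agea edulvla

* Treated-control gap in Pr(health == 1, "very good") in each period
margins, dydx(treated) at(ess_3_5 = (0 1)) predict(outcome(1))
```

    My understanding is that in a nonlinear model the interaction coefficient is not itself the diff-in-diff effect on the probability scale, which is why the sketch compares the treated-control gap in the predicted probability before and after the crisis instead.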

    This is my first post on this forum, so apologies in advance if it is hard to follow. Any feedback would be appreciated, and if anything is unclear, please ask and I will try to provide additional information.