Interpretation: Difference in difference on repeated cross section data, ordinal outcome variable.

Stian Hansen

Join Date: Jan 2019

Posts: 8
#1

29 Jan 2019, 05:07

Hey everyone, I'm currently working on a project analyzing the health effects of the financial crisis. I'm using repeated cross sections from the European Social Survey (ESS), which is conducted every second year. The data are one edition from 2006 and one from 2010, i.e. before and "after" the crisis. The dependent variable is subjective health, an ordinal variable coded 1 to 5 in the following order: "Very good", "Good", "Fair", "Bad" and "Very bad". Control variables are gender, years of education (harmonized across the European countries), age of respondent, and whether the respondent was born in the country. I'm using Germany and Austria as the control group, and Spain, Ireland and Greece as the treatment group. The ESS dataset includes different weights to account for differences in population size etc.

My main focus is to compare the subjective health of respondents from the treatment group with the control group, and compare the subjective health of different socioeconomic groups from the treated and control group.

A description of variables used follows under:

So far I've run a difference in difference analysis using the following approach:

1: Generated dummy variable for treatment and control group and dummy for periods before and after crisis

Code:

gen treated = (cntry == "ES") | (cntry == "IE") | (cntry == "GR")
gen ess_2 = 0 if essround == 2
replace ess_2 = 1 if essround == 5

2: Diff in diff

Code:

diff health, t(treated) p(ess_3_5) cov ( dweight pweight brncntr gndr agea edulvla)

Yielding the following Stata output:

I have also run a regression where I generated a new variable for health and age of respondent, grouping the health variable into two categories and the age of respondent variable into several age brackets:

Code:

gen health_binary = 0 if (health == 4) | (health == 5)
replace health_binary = 1 if (health == 1) | (health == 2) | (health == 3)

Code:

gen agea_grouped = 0 if agea < 18
replace agea_grouped = 1 if agea >= 18 & agea <= 24
replace agea_grouped = 2 if agea > 24 & agea <= 34
replace agea_grouped = 3 if agea > 34 & agea <= 44
replace agea_grouped = 4 if agea > 44 & agea <= 54
replace agea_grouped = 5 if agea > 54 & agea <= 64
replace agea_grouped = 6 if agea > 64 & agea <= 74
* Stata treats missing as larger than any number, so exclude missing ages here
replace agea_grouped = 7 if agea >= 75 & !missing(agea)

Running the diff command in Stata:

Code:

diff health_binary, t(treated) p(ess_3_5) cov ( dweight pweight brncntr gndr agea_grouped edulvla)

with the new variables gave the following Stata output:

So my questions are as follows: How to interpret these results? I have also tried to do some research on using an ordered probit model with diff in diff, but haven't been able to find anything fitting to my case in this forum.

This is my first post on this forum, so apologies in advance if my posting is hard to understand. Any feedback would be appreciated, and if something was unclear please ask and I will try to provide additional information.
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#2

29 Jan 2019, 10:40

I don't think your results are interpretable. I don't think this approach is adequate. You are dealing with a complex survey design but you have not incorporated that into your analysis. Using the variables dweight and pweight as covariates is not the correct way to account for the survey design. You will need to -svyset- your data in some appropriate way and run your analysis under the -svy:- prefix. I cannot advise you on how to -svyset- this data as I am not familiar with the design of the European Social Survey. It is likely that the documentation that is provided with the data explains the variables that identify strata, primary and higher order sampling units, and sampling weights: you will need to know that in order to figure out how to -svyset- the data. (Or it may be that somebody active on the Forum is already familiar with the ESS and can just explain it to you in this thread.)
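
For illustration only, a -svyset- declaration for a stratified, clustered design has the general shape sketched below. The variable names my_strata, my_psu and my_weight are hypothetical placeholders and the data are simulated; they are not the actual ESS design variables, which have to be taken from the ESS documentation.

```stata
* Simulated data: all variable names and values are hypothetical placeholders
clear
set seed 2019
set obs 1000
gen int my_strata    = ceil(4*runiform())                   // 4 made-up strata
gen int my_psu       = 100*my_strata + ceil(25*runiform())  // PSUs nested in strata
gen double my_weight = 0.5 + runiform()                     // made-up sampling weight

* Declare PSU, sampling weight and strata, then inspect the declared design
svyset my_psu [pweight = my_weight], strata(my_strata)
svydescribe
```

If the design turned out to have sampling weights only, the strata() option would be dropped and _n could stand in for the PSU.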

I do not think the -diff- command works with the -svy:- prefix. So you will not be able to use it with this data unless the design involves no stratification or primary sampling units and just has sampling probability weights (-diff- does allow sampling probability weighting; see -help diff-.)

Once you have figured out your -svyset-, the analysis will just be

Code:

svy: ologit health i.treated##i.ess_3_5 // POSSIBLY SOME COVARIATES
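
A minimal sketch of how the interaction term from such a model could be examined, using simulated data. The variable names mirror the thread, but the weight variable wt, the weights-only -svyset-, and all numbers are assumptions made for illustration. In an ordered logit, the coefficient on 1.treated#1.ess_3_5 is the difference-in-differences estimate on the log-odds scale; with health coded from 1 ("Very good") to 5 ("Very bad"), a positive coefficient indicates a shift toward worse reported health. -margins- then translates the model into predicted probabilities of a chosen outcome category.

```stata
* Simulated data: names follow the thread, values are invented for illustration
clear
set seed 12345
set obs 4000
gen byte treated = runiform() < 0.5
gen byte ess_3_5 = runiform() < 0.5
gen double wt    = 0.5 + runiform()            // hypothetical sampling weight

* Latent index with a built-in interaction (DiD) effect of 0.4, logistic errors
gen double latent = 0.3*treated + 0.2*ess_3_5 + 0.4*treated*ess_3_5 + logit(runiform())
gen byte health   = 1 + (latent > -1) + (latent > 0) + (latent > 1) + (latent > 2)

* Weights-only design, purely as a placeholder for the real -svyset-
svyset _n [pweight = wt]

* The coefficient on 1.treated#1.ess_3_5 is the DiD estimate in log odds
svy: ologit health i.treated##i.ess_3_5

* Predicted probability of health == 1 ("Very good") in each group-by-period cell
margins treated#ess_3_5, predict(outcome(1))
```

Comparing the four predicted probabilities across cells gives a probability-scale view of the same contrast that the interaction coefficient expresses in log odds.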

Note: There is a bit of confusion in your code. You create a variable ess_2 to distinguish the two survey rounds, but then in your -diff- commands you use a different variable, ess_3_5. What's that about? When you get to running -svy: ologit-, use whichever of those is appropriate where I have put ess_3_5.

By the way, in the future, when showing Stata output, don't post a screenshot. Yours happened to come out fine, but they are often unreadable. The best way to show Stata output is to put it between code delimiters, just as you would (did) with code.
Doug Hassanali

Join Date: Sep 2018

Posts: 14
#3

08 Feb 2021, 23:58

Hello Clyde, I have a question about treatment and control groups - DiD using repeated cross sectional data.
Suppose no control group exists in the pre-treatment period for an age-specific cohort (due to lack of data). Is it possible to construct a control group made up of individuals of the same age from the same region but in a different time period, i.e. post-treatment (5 years after treatment)? Treatment = exposure to conflict, so the control group will not have been exposed, and maybe there are no spillover effects. Any suggested robustness checks?
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#4

09 Feb 2021, 09:14

It is possible, but it doesn't sound like a good idea to me. The use of a group that is post-treatment as pre-treatment controls relies on three very strong assumptions: the treatment effect has completely worn off after 5 years, there is no secular trend in the outcome, and there are no cohort effects on the outcome. If you have independent evidence (i.e. evidence not from the data in your study) that these are all true, then I suppose you could make the case. Personally, I'd be skeptical.
Doug Hassanali

Join Date: Sep 2018

Posts: 14
#5

09 Feb 2021, 17:07

Thanks for the feedback. Other than DiD, what other approach/methodology do you think would work best in this case?
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#6

09 Feb 2021, 17:11

I can't answer that without more information about your question and the context. What are the key variables you are trying to relate? What is the setting and population? What is the treatment in question? It would be particularly helpful to also know why there is this gap in the availability of controls.
Doug Hassanali

Join Date: Sep 2018

Posts: 14
#7

11 Feb 2021, 01:13

Hello, thanks for the feedback. I am examining the short-term and long-term impacts of exposure to violence on human capital formation. Variables are mental health, labour and educational attainment. The setting is regional, in areas prone to violence.
Treatment is exposure to violence (distance), so I have affected and non-affected individuals. The violent conflict in question spans over 20 years with varying degrees of intensity, and due to lack of data it is not possible to observe a pre-treatment period. I can only track geolocated individuals from 2000 to 2006 (end), but conflict episodes start much earlier, in 1989.
Your advice is much appreciated, Thanks.
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#8

11 Feb 2021, 11:20

Well, that sounds like a pretty difficult situation to deal with. I don't have any good ideas. You might get more helpful advice from other people who study conflict. If you are working in an institutional setting, perhaps colleagues in your department can help you. Or perhaps there are forums like this one that focus on your discipline instead of on Stata/statistics. Or you might do a literature search on this topic and see what others faced with a similar predicament have done, or even contact authors of previous studies for advice.

Sorry I can't be more helpful on this.
Doug Hassanali

Join Date: Sep 2018

Posts: 14
#9

11 Feb 2021, 20:56

Thanks for the feedback
