Hi all,
I am currently working on replicating the following article in my own data: Currie, J., Duque, V., & Garfinkel, I. (2015). The Great Recession and mothers' health. The Economic Journal, 125(588) (which can be found here: http://onlinelibrary.wiley.com/doi/1...12239/abstract).
The Currie et al. paper uses the following methodology:
They estimate the effect of the local area unemployment rate on mothers' health using two logistic models: one that pools data from years 5 and 9 and controls for a rich set of covariates plus year and state fixed effects, and a second that accounts for time-invariant mother fixed effects.
The following equation describes the first model:
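The equation itself did not come through in the post. Going by the symbol definitions that follow, it presumably has a form like the one below, read as the index of the logistic model. The intercept b_0, the coefficient label b_2 and the state subscript s on UR are assumptions for illustration, not taken from the paper.

```latex
Y_{it} = b_0 + b_1\,UR_{st} + X_i\,b_2 + a_s + a_t + e_{it}
```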

Where Yit denotes mother i’s health outcome measured at time t, UR is the average unemployment rate in the baseline state over the year preceding the date of interview at time t, X is a matrix of mother characteristics measured at baseline, and as and at are vectors of dummies for baseline state and year respectively.
The baseline state dummies control for any time-invariant state-level factors that are correlated with both state economic conditions and women’s health. The year dummies absorb year-specific factors that could affect both the economy and mothers’ health; e is the disturbance term. Standard errors in all models are clustered at the baseline state level to account for within-state correlation in the observations. The coefficient of interest is b1. Later in the article, a second logistic model controls for mother-specific fixed effects.
I replicate this pooled analysis in my own similar data, which records mothers' health, local area unemployment for the mother's electoral division, and several relevant control variables, as below:
My pooled model:
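The equation did not come through here either. Assembled from the terms defined in the next paragraph, it would look roughly like this; the ordering of the terms is an assumption.

```latex
Y_{it} = a_i + b_1\,(\text{Unemployment Rate})_{i,t} + x'_{it}\,b + a_x + a_y + e_{it}
```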

Where Yit refers to the health outcome or behaviour at time t for individual i, ai is an individual-specific parameter representing the effect of unobserved individual characteristics, (Unemployment Rate)i,t is the local area unemployment rate with coefficient b1, b is the vector of regression coefficients on the observed covariates, and eit is an independent error term. x'it is a matrix of individual characteristics included as controls, and ax and ay are vectors of dummies for local area (electoral division) and year. These dummies capture any time-specific or local-area-specific factors that could affect both local area unemployment and the individual's health.
Standard errors are clustered at the local area level (i.e. at the individual's electoral division) to account for within-area correlation in the observations. The coefficient of interest is b1. Changes in health outcomes for individual i at time t are matched with changes in local area unemployment rates.
As in the Currie et al. analysis, my data come from a questionnaire administered to the same individuals every five years, over three waves of data collection. Unlike the Currie et al. piece, where local area refers to American states, mine refers to electoral divisions, which divide respondents' places of residence into areas similar to villages.
My interest is how health changes as local area unemployment changes.
The data can be described in Stata as follows:
Because Currie et al. include dummies for the individual's local area, I generate a numeric electoral division variable from the string variable, so that electoral division can enter the model as local area dummies.
The data starts life in wide format, so I reshape it from wide to long.
Following this setup, the actual regression is as below:
It is worth noting that I include “if gender==0” because I only want to consider women in my analysis.
I have a few issues with the logit approach above, however, and I think these will be clear from the output of this regression, which I attach in my next comment. Unfortunately, I had to post separate comments because I exceeded the word limit.
Code:
                 storage  display  value
variable name    type     format   label                        variable label
-----------------------------------------------------------------------------------------------------------------------
binbmi_overwe~y  float    %14.0g   yr10_bin_bmi_overweight      Are you overweight based on your BMI (Binary)
psum_unempl~e_y  float    %36.0g   labpsum_unemployedgwave_y10  Is local are unemployment greater in this wave than previous waves (Binary)
own_education_y  byte     %58.0g   y0_own_education             What is your highest level of education (Categorical)
medical_card_y   byte     %9.0g    y10q_medical_card            Do you hold a medical card, a means tested form of government provided health insurance (Binary)
employment_y     byte     %56.0g   y10_employment               What is your employment status (Categorical)
maritalstatus_y  byte     %20.0g   yr10_Marital_Status          What is your marital status (Categorical)
ord_age_y        float    %9.0g    ordered_age_year_10          What is your age group (Categorical)
year             byte     %9.0g                                 What year is it? (Categorical)
elec_div_y       str209   %209s                                 What electoral division do you live in? (Categorical)
Code:
encode elec_div_y, gen(elec_div_y1)
Code:
reshape long binbmi_overweight_y psum_unemployed_total_gwave_y own_education_y medical_card_y employment_y maritalstatus_y ord_age_y year elec_div_y, i(id) j(year)
xtset id year
Code:
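* Pooled logit for women only; i.year and i.elec_div_y1 are the year and electoral division dummies
logit binbmi_overweight_y i.psum_unemployed_total_gwave_y i.own_education_y i.medical_card_y i.employment_y i.maritalstatus_y i.ord_age_y i.year i.elec_div_y1 if gender==0, cluster(elec_div_y)
estimates store pooled1
* logit stores a pseudo R-squared in e(r2_p); it does not store e(r2) or e(r2_a)
estimates table pooled1, star stats(N r2_p)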
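For comparison, the second model described in Currie et al., the one with mother fixed effects, might be sketched in the same data with a conditional (fixed-effects) logit. This is only a sketch: it assumes the long-format data and the variable names from the pooled logit, with id as the panel identifier, and the stored-estimates name fe1 is hypothetical. It is not a replication of the paper's specification.

```stata
* Sketch only: conditional (fixed-effects) logit with individual fixed effects.
* Assumes the data are already in long format with id and year as panel/time variables.
xtset id year

* Same outcome and covariates as the pooled logit, without the area dummies.
* Covariates that never change within an individual are omitted by the estimator.
xtlogit binbmi_overweight_y i.psum_unemployed_total_gwave_y i.own_education_y i.medical_card_y i.employment_y i.maritalstatus_y i.ord_age_y i.year if gender==0, fe

* Hypothetical name for the stored results
estimates store fe1
```

Note that a conditional logit only uses individuals whose outcome changes across waves, so the estimation sample will generally be smaller than in the pooled model.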