Instrumented difference-in-difference (DID)

Mahdi Jafari

Join Date: Jul 2024

Posts: 15
#1


07 Jul 2024, 10:23

Hi,
I want to estimate the price elasticity of gas consumption using an instrumented DID. Prices change over time, differ across regions, and are set by the authorities in each region.
Therefore I use Region as an instrument that induces variation in the price of gas. But there is a concern that Region affects gas consumption through variables other than price, such as temperature (violating the exclusion restriction).
Does including such variables in the regression, e.g. controlling for temperature, address the exclusion restriction?
In other words, does the following 2SLS regression estimate UNBIASED coefficients? (Assuming the only channel other than price through which Region might affect gas consumption is temperature.)

Code:

ivreg2 GasConsumption i.year i.ZipCode Temp (Price = i.Region i.Region#i.year), robust cluster(ZipCode)
Ali Bahrami Sani

Join Date: Jul 2024

Posts: 22
#2

07 Jul 2024, 21:17


Hi Mahdi,

I think these two notes will be helpful:
de Chaisemartin, C. (2010). A note on instrumented difference in differences. Unpublished manuscript, University of Warwick.

Hudson, S., Hull, P., and Liebersohn, J. (2017). Interpreting instrumented difference-in-differences. Metrics Note, Sept.
Mahdi Jafari

Join Date: Jul 2024

Posts: 15
#3

08 Jul 2024, 02:15

Originally posted by Ali Bahrami Sani:

de Chaisemartin, C. (2010). A note on instrumented difference in differences. Unpublished Manuscript, University of Warwick.

Thank you, I'll look into it. But can you provide a link to the first paper? I suppose it has been published under the title "Fuzzy difference-in-differences".
Ali Bahrami Sani

Join Date: Jul 2024

Posts: 22
#4

09 Jul 2024, 03:36

Originally posted by Mahdi Jafari:

Thank you, I'll look into it. But can you provide a link to the first paper? I suppose it has been published under the title "Fuzzy difference-in-differences".

A_note_on_instrumented_difference_in_differences (2).pdf - Google Drive
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2151
#5

09 Jul 2024, 06:16

I'm curious as to why you're calling this "difference-in-differences." This looks like a gasoline demand function using zip code-level panel data. And you're arguing that region can be used as IVs. Including zip code FEs means that region cannot be used alone as IVs because those dummies are defined at a coarser level than zip code, so they'll get swept away by i.zipcode. Mechanically, your method works because you've included i.region#i.year as IVs, but I suspect many will not buy this identification strategy.

In any case, there are no before-after "treatment" periods, are there? DiD, to me, implies that there's an intervention where a treatment is zero prior to some intervention. I don't see that here, so I'm not sure calling it Instrumented DiD is helpful.
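
A toy illustration of the "swept away" point above, as a minimal Python sketch. The zip codes, regions, and years are made up (they are not the poster's data), and `within_demean` is a hypothetical helper that mimics what absorbing zip code fixed effects does: it subtracts each zip code's mean. A region dummy is constant within a zip code, so nothing is left after demeaning, while a region-by-year dummy still varies.

```python
from collections import defaultdict

# Hypothetical toy panel: (zip_code, region, year).
# Assumption: each zip code lies in exactly one region.
rows = [
    (2233, "A", 2019), (2233, "A", 2020),
    (2123, "A", 2019), (2123, "A", 2020),
    (3345, "B", 2019), (3345, "B", 2020),
]

def within_demean(values, groups):
    """Subtract each group's mean from its members (the fixed-effects 'within' transform)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

zips = [z for z, _, _ in rows]

# Dummy for region B, and dummy for region B in year 2020.
region_b = [1.0 if r == "B" else 0.0 for _, r, _ in rows]
region_b_x_2020 = [1.0 if (r == "B" and y == 2020) else 0.0 for _, r, y in rows]

# The region dummy has no within-zip variation left; the interaction does.
print("region B, demeaned within zip:       ", within_demean(region_b, zips))
print("region B x 2020, demeaned within zip:", within_demean(region_b_x_2020, zips))
```

This only shows the mechanics of why i.region drops out and i.region#i.year survives. It says nothing about whether the interaction is a credible instrument.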
Mahdi Jafari

Join Date: Jul 2024

Posts: 15
#6

18 Jul 2024, 02:22

Originally posted by Jeff Wooldridge:

Including zip code FEs means that region cannot be used alone as IVs

Thank you very much. Yes, you are right about the collinearity between region and zip code; I realized it just after I posted.
Anyway, let me explain the context and why I call it a DID.
My dataset is billing information on residential natural gas (not gasoline) consumption for each household (each gas meter, to be specific), spanning several years. The start and end dates, the amount of gas consumed, and the total cost of each bill (in each period) are available.

Unique House ID   From Date    To Date      Consumption (cm)   Total Cost ($)   Zip Code
101020            01/01/2020   30/01/2020   100                20               2233
101020            01/02/2020   30/02/2020   70                 15               2233
101021            01/01/2020   30/01/2020   80                 17               2123

I want to estimate the price elasticity of demand, but because residential natural gas has an "Increasing Block Rate" (IBR) structure, I cannot simply regress gas consumption on average price: there is reverse causality by construction, since the marginal price increases as consumption goes up. Therefore I decided to use an instrument to isolate variation in average price that is not correlated with consumption. That is why I use Region: the authorities set the price scheme (IBR) for each region (there is no market for gas, so the price is determined by the government rather than by demand and supply). Variation in price induced by the authorities in each region is correlated with the average price for each household (relevance) and supposedly is not correlated with household-level consumption (exclusion). The main concern is that the exclusion restriction is violated, because the authorities do not set the price structure randomly but take region characteristics into account. For instance, they set lower marginal prices for colder areas. Could I address this problem by controlling for as many of the related channels as possible?
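
To make the "reverse causality by construction" point concrete, here is a minimal Python sketch under an assumed two-block tariff. The block limit (100 cubic meters) and the per-unit prices (0.10 and 0.20) are hypothetical and are not taken from the actual price schedule. With the schedule held fixed, average price (cost divided by consumption) rises mechanically once consumption crosses into the higher block.

```python
# Hypothetical increasing-block tariff: (upper limit of block in cubic meters, price per cubic meter).
# These numbers are illustrative assumptions only.
BLOCKS = [(100.0, 0.10), (float("inf"), 0.20)]

def bill(consumption):
    """Total cost of `consumption` cubic meters under the block tariff."""
    cost, lower = 0.0, 0.0
    for upper, price in BLOCKS:
        if consumption > lower:
            # Charge only the units that fall inside this block.
            cost += (min(consumption, upper) - lower) * price
        lower = upper
    return cost

def average_price(consumption):
    """Average price as defined in the post: Cost / Consumption."""
    return bill(consumption) / consumption

# Same tariff for everyone, yet average price depends on how much is consumed.
for q in (50, 100, 150, 200):
    print(q, round(average_price(q), 4))
```

So a regression of consumption on average price picks up this mechanical link even when the tariff never changes, which is why an outside source of price variation is needed.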
So far, the regression should look like this :

Code:

* [average] Price = Cost / Consumption
ivreg2 Consumption i.Year i.Month Temperature (Price = i.Region) , robust

But why DID? The price schedule for each region is constant until a certain year and increases after that, but the increase is not the same for all regions: some regions increased moderately and some increased more. Although there is no binary treatment, there is a continuous treatment (say, the price in regions 1 and 2 increased by 10% and in regions 3 and 4 by 20%). Therefore I tried to leverage variation in price both over time and across regions. In this case I can run a more rigorous regression (control for Zipcode and omit Region):

Code:

ivreg2 Consumption i.Year i.Month i.Zipcode Temperature (Price = i.Region#i.Year) , robust cluster(Zipcode)

Or an even more stringent one (replace Zipcode with ID):

Code:

ivreg2 Consumption i.Year i.Month i.ID Temperature (Price = i.Region#i.Year) , robust cluster(ID)
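
Written out, the household fixed-effects specification above corresponds roughly to the two equations below. This is a sketch in notation that is not from the thread: $C_{it}$ is consumption, $P_{it}$ average price, and $T_{it}$ temperature for household $i$ in billing period $t$; $r(i)$ is the household's region; $y(t)$ and $m(t)$ are the year and month of period $t$.

```latex
\begin{align}
  % Second stage: demand equation with household, year, and month effects
  C_{it} &= \beta P_{it} + \gamma T_{it} + \alpha_i + \lambda_{y(t)} + \mu_{m(t)} + \varepsilon_{it} \\
  % First stage: price explained by region-by-year dummies plus the same controls
  P_{it} &= \sum_{r}\sum_{y} \pi_{ry}\,\mathbf{1}\{r(i) = r,\; y(t) = y\}
            + \delta T_{it} + a_i + l_{y(t)} + k_{m(t)} + v_{it}
\end{align}
```

Under this reading, the exclusion restriction is that the region-by-year dummies are uncorrelated with $\varepsilon_{it}$ once temperature and the fixed effects are controlled for. In words, region-specific changes over time may affect consumption only through price.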

I hope this clarifies the design and identification strategy. I would highly appreciate any comments.
Thanks.

Last edited by Mahdi Jafari; 18 Jul 2024, 02:29.
Mahdi Jafari

Join Date: Jul 2024

Posts: 15
#7

22 Jul 2024, 01:31

Dear Nick Cox, Clyde Schechter, William Lisowski, Maarten Buis and others,
Could you please help me with this?

Last edited by Mahdi Jafari; 22 Jul 2024, 01:36.
Mahdi Jafari

Join Date: Jul 2024

Posts: 15
#8

11 Aug 2024, 00:02

Originally posted by Mahdi Jafari:

Dear Nick Cox, Clyde Schechter, William Lisowski, Maarten Buis and others,
Could you please help me with this?

One reason people post hundreds of new threads for repetitive questions is that old ones never get answered, while a new post always gets a reply within a minute!
So moderators, don't ever complain about duplicate posts!