Hello, I want to estimate residential natural gas demand using household-level data by combining a geographical instrument with a difference-in-differences (DID) approach.
My dataset contains billing information on residential natural gas consumption for each household (each gas meter, to be specific), spanning several years. For each bill (billing period), the start and end dates, the amount of gas consumed, and the total cost are available.
I want to estimate the price elasticity of demand, but because residential natural gas is priced under an "Increasing Block Rate" (IBR) structure, I cannot simply regress gas consumption on average price to identify it: there is reverse causality by construction, since the marginal price (and with it the average price) rises as consumption goes up. Therefore I decided to use an instrument to isolate variation in average price that is not driven by the household's own consumption. I use Region as the instrument because the authorities set the price scheme (IBR) for each region (there is no market for gas, so the price is determined by the government rather than by demand and supply). Variation in prices across regions induced by the authorities is correlated with the average price each household faces (relevance) and supposedly affects household-level consumption only through price (exclusion).
The main concern is that the exclusion restriction is violated, because the authorities do not set the price structure randomly; they take regional characteristics into account. For instance, they set lower marginal prices for colder areas. I was wondering whether I could address this problem by controlling for as many of the related channels as possible.
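To make the reverse-causality point concrete, here is a minimal sketch with a hypothetical two-block tariff. The 50 cm threshold and the 0.10 and 0.30 rates are made up for illustration and are not the actual schedule; only the variable names Consumption, Cost and Price follow the regressions in this post.

```stata
* Hypothetical two-block tariff, for illustration only (not the actual schedule):
* the first 50 cm cost 0.10 per cm, every cm above 50 costs 0.30
clear
set obs 5
generate Consumption = 25 * _n

* Total bill under the two blocks
generate Cost = 0.10 * min(Consumption, 50) + 0.30 * max(Consumption - 50, 0)

* Average price per unit, defined as in the regressions (Cost / Consumption)
generate Price = Cost / Consumption

list Consumption Cost Price
```

In this toy schedule the average price stays at 0.10 up to 50 cm and then climbs to 0.22 at 125 cm, even though the tariff itself never changed. That mechanical link between consumption and average price is what the instrument is meant to remove.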
So far, the regression looks like this:
But why DID? The price schedule for each region is constant until a certain year and increases after that, but the increase is not the same for all regions: some regions raised prices moderately and others by more. Although there is no binary treatment, there is a continuous one (say, the price in regions 1 and 2 increased by 10% and in regions 3 and 4 by 20%). I therefore try to leverage variation in price both over time and across regions. In this case I can run a more rigorous regression (controlling for Zipcode and omitting Region):
Or an even more stringent one (replacing Zipcode with ID):
I hope this clarifies the design and identification strategy. I would highly appreciate any comments.
Thanks.
Sample of the billing data:
Unique House ID | From Date  | To Date    | Consumption (cm) | Total Cost ($) | Zip Code
101020          | 01/01/2020 | 30/01/2020 | 100              | 20             | 2233
101020          | 01/02/2020 | 30/02/2020 | 70               | 15             | 2233
101021          | 01/01/2020 | 30/01/2020 | 80               | 17             | 2123
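For reference, the variables used in the regressions could be built from the bills roughly as follows. This is only a sketch: it assumes the dates are stored as strings in DD/MM/YYYY form, and the names FromDate, ToDate, from_date and to_date are hypothetical. Assigning each bill to the year and month of its start date is also an assumption.

```stata
* Re-create the three sample bills shown above
clear
input long ID str10 FromDate str10 ToDate Consumption Cost Zipcode
101020 "01/01/2020" "30/01/2020" 100 20 2233
101020 "01/02/2020" "30/02/2020" 70 15 2233
101021 "01/01/2020" "30/01/2020" 80 17 2123
end

* Convert the date strings (day/month/year) to Stata daily dates
generate from_date = date(FromDate, "DMY")
generate to_date   = date(ToDate, "DMY")
format from_date to_date %td

* Calendar controls, taken from the start of the billing period (an assumption)
generate Year  = year(from_date)
generate Month = month(from_date)

* Average price per unit on each bill; missing when Consumption is zero
generate Price = Cost / Consumption

* Count bills whose dates could not be parsed
count if missing(from_date) | missing(to_date)
```

Impossible dates such as 30/02/2020 in the sample come out as missing, so the final count is a quick way to spot them before estimation.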
Code:
* Specification 1: Region as the instrument for average price
* Price is the average price per bill: Cost / Consumption
ivreg2 Consumption i.Year i.Month Temperature (Price = i.Region), robust
Code:
* Specification 2: Zipcode fixed effects, Region-by-Year interactions as instruments
ivreg2 Consumption i.Year i.Month i.Zipcode Temperature (Price = i.Region#i.Year), robust cluster(Zipcode)
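In equation form, the Zipcode specification above could be written roughly as the following two-stage system. The notation is only a sketch and is not taken from any reference: i indexes households, t bills, r regions, and z(i), y(t), m(t) denote the zip code, year and month of a bill. It assumes each zip code lies in a single region, so that the Region main effects are absorbed by the Zipcode fixed effects.

```latex
\begin{align}
\text{Price}_{it} &= \sum_{r}\sum_{y} \pi_{ry}\,
    \mathbf{1}[\text{Region}_i = r]\,\mathbf{1}[\text{Year}_t = y]
    + \alpha_{z(i)} + \gamma_{y(t)} + \mu_{m(t)}
    + \delta\,\text{Temperature}_{it} + u_{it} \\
\text{Consumption}_{it} &= \beta\,\widehat{\text{Price}}_{it}
    + a_{z(i)} + g_{y(t)} + h_{m(t)}
    + \theta\,\text{Temperature}_{it} + \varepsilon_{it}
\end{align}
```

The first line is the first stage, in which the Region-by-Year interactions shift the average price. The second line is the structural equation. With both variables in levels, \beta is a price slope, so an elasticity would have to be computed from it, for example at the sample means.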
Code:
* Specification 3: household (ID) fixed effects, Region-by-Year interactions as instruments
ivreg2 Consumption i.Year i.Month i.ID Temperature (Price = i.Region#i.Year), robust cluster(ID)