I am trying to estimate, after controlling for industry, country, and year, the effect that internet usage rates have had on exports, and I want to understand how this effect differs according to how technology-intensive the industry is. In particular, I want to control for country-year, industry-year, and country-industry fixed effects.
I have export data for every country over 5 years, broken down by industry (99 industries), and for each industry I also have a corresponding R&D intensity variable (1-4). I also have data on the percentage of internet users by country for each year.
Again, I am trying to control for fixed country, time, and industry effects, so my first specification was:
Y_cti = (Intensity_i * IT_ct) + FE_ct + FE_it + FE_ci
reghdfe log_exports i.intensity#c.internet_users, absorb(y_c c_i y_i)
where Intensity_i is a set of dummies for R&D intensity (levels 1-4) for each industry,
IT_ct is the internet usage rate for each country and year,
and the FEs are the fixed effects (country-time, industry-time, country-industry).
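A minimal sketch of how the variables in that command might be constructed, assuming the column names shown in the data (year, country_code, industry_code, exports_usd) and assuming that y_c, c_i and y_i are plain group identifiers for each pair of dimensions. This is an illustration, not necessarily the exact code that was run.

```stata
* Sketch only: variable names are taken from the data excerpt;
* the way y_c, c_i and y_i are built here is an assumption.
gen log_exports = ln(exports_usd)   // missing where exports_usd <= 0

* One identifier per pair of dimensions, for the three sets of fixed effects
egen y_c = group(year country_code)            // country-year
egen c_i = group(country_code industry_code)   // country-industry
egen y_i = group(year industry_code)           // industry-year

reghdfe log_exports i.intensity#c.internet_users, absorb(y_c c_i y_i)
```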
The specification above asked too much of the data, so my professor said I could control for just country, year, and industry fixed effects, so long as I included IT_ct again in the regression:
Y_cti = (Intensity_i * IT_ct) + IT_ct + FE_c + FE_t + FE_i
reghdfe log_exports i.intensity#c.internet_users internet_users, absorb(year country_code industry_code)
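One way to see how the two specifications differ with respect to internet_users is to check that it takes a single value within each country-year cell. If it does, it is a linear combination of the country-year dummies, so it cannot be estimated separately once country-year effects are absorbed, whereas separate country and year effects do not span it. A minimal sketch, assuming the variable names from the data:

```stata
* Sketch only: variable names assumed from the data excerpt.
* Sort each country-year cell by internet_users and compare its first and
* last value; the assert passes silently if the variable is constant per cell.
bysort country_code year (internet_users): assert internet_users[1] == internet_users[_N]
```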
So my questions are
a) Can anyone explain why it makes sense to add internet_users back into the regression? To me it doesn't make sense why I have to re-add it...
b) I am using reghdfe, but with one of the samples I run (only using data on developed countries), I get that the dummy for intensity 1 was omitted because of collinearity, and for another regression (only using data on developed countries) intensity dummy 2 was omitted. Is it because these omitted variables are collinear with the internet_users variable? Or is this just Stata arbitrarily using one of them as the dropped dummy against which the other dummies are compared?
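To make question b) concrete: each industry belongs to exactly one intensity class, so the four interaction terms add up to internet_users itself. With internet_users also in the model, or absorbed by country-year effects, only three of the four slopes can be identified and one has to be dropped as the reference. The sketch below checks that identity and then sets the reference class by hand. The names sum_int and net_x_int2 to net_x_int4 are hypothetical, and using class 1 as the base is an arbitrary choice for illustration.

```stata
* Sketch only: the new variable names are hypothetical.
* 1) Check that the four intensity-by-internet terms add up to internet_users.
gen double sum_int = 0
forvalues k = 1/4 {
    replace sum_int = sum_int + (intensity == `k') * internet_users
}
assert sum_int == internet_users if !missing(intensity, internet_users)

* 2) Choose the omitted class explicitly (here class 1) rather than
*    leaving the choice to the collinearity check.
forvalues k = 2/4 {
    gen double net_x_int`k' = (intensity == `k') * internet_users
}
reghdfe log_exports internet_users net_x_int2 net_x_int3 net_x_int4, ///
    absorb(year country_code industry_code)
```

With this parameterization the coefficient on internet_users is the slope for class 1, and the other three coefficients are differences relative to it.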
Below is an image of my regression results.

Thank you in advance!
An excerpt of the data:
year | country_code | industry_code | intensity | exports_usd | internet_users |
1998 | 4 | 19 | 2 | 209823 | .15 |
1998 | 4 | 20 | 4 | 23423 | .15 |
1998 | 4 | 21 | 3 | 988474 | .15 |
1998 | 4 | 22 | 2 | 3344 | .15 |
1998 | 4 | 23 | 1 | 134523 | .15 |
1998 | 8 | 19 | 2 | 46578435 | .22 |
1998 | 8 | 20 | 4 | 555675 | .22 |
1998 | 8 | 21 | 3 | 3837 | .22 |
1998 | 8 | 22 | 2 | 863522 | .22 |
1998 | 8 | 23 | 1 | 43355 | .22 |
2002 | 4 | 19 | 2 | 435246 | .18 |
2002 | 4 | 20 | 4 | 445554 | .18 |