Dear Statalist users,
I have tried multiple methods to achieve the following:
- I have a dataset in which I need to check whether each firm has at least one counterpart from the other group within each year, industry, and decile. Firms without a counterpart have to be dropped.
- In my dataset, I already created an indicator variable called group, coded 1 for one group and 2 for the other group.
- I then computed the minimum and maximum of group by year, industry, and decile with the following code:
bysort year industry decile: egen max = max(group)
bysort year industry decile: egen min = min(group)
- I then generated a dummy that shows whether min and max are equal. If they are equal, the year-industry-decile cell contains only one group, so its firms have no counterpart and should be dropped.
generate check = (max==min)
- As this would drop only firm-year observations, and not the whole firm, I made a second dummy that is meant to equal 1 for the whole firm if any of its firm-year observations has check equal to 1.
bys firm (year), sort: gen secondcheck = check[1] !=check[_N]
drop if secondcheck==1
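The stated intent of this last step (flag every observation of a firm if any of its firm-years has check equal to 1) could also be sketched with egen's max() function. Comparing check[1] with check[_N] looks only at a firm's first and last year, so a firm whose only flagged observation lies in a middle year, or whose first and last years are both flagged, would not be caught. This is a minimal sketch, not a definitive fix; the variable name anycheck is hypothetical.

```stata
* Flag the whole firm if any of its firm-year observations has check == 1
* (anycheck is a hypothetical name; check is 0/1 as generated above)
bysort firm: egen anycheck = max(check)

* Inspect how many observations would be removed before dropping anything
tabulate anycheck

* Uncomment once the counts look plausible
* drop if anycheck == 1
```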
Another method that I tried is the following:
egen test=group(year industry decile), label
sort test
by test: egen max=max(group)
by test: egen min=min(group)
by test: generate check = (max==min)
And then, as above, to drop the whole firm instead of just the firm-year observation:
bys firm (year), sort: gen secondcheck = check[1] !=check[_N]
drop if secondcheck==1
The two methods give me slightly different results, and I am not sure how to check which one is correct.
Could any of you give your opinion on this?
Thank you so much in advance!
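One way to see where the two methods diverge might be to keep both flags side by side under different names and cross-tabulate them on the full data, before any drop. This is only a sketch: the names max1, min1, check1, max2, min2, and check2 are hypothetical. One known difference is that egen's group() function sets test to missing for any observation with a missing value in year, industry, or decile (unless its missing option is specified), so by test: pools all such observations into a single group, whereas bysort year industry decile: treats each combination, including missing values, as its own group.

```stata
* Method 1: cells defined directly by the three variables
bysort year industry decile: egen max1 = max(group)
bysort year industry decile: egen min1 = min(group)
generate check1 = (max1 == min1)

* Method 2: cells defined through egen group()
egen test = group(year industry decile), label
bysort test: egen max2 = max(group)
bysort test: egen min2 = min(group)
generate check2 = (max2 == min2)

* Compare the two flags and look at the observations where they disagree
tabulate check1 check2, missing
count if missing(year, industry, decile)
list firm year industry decile group check1 check2 if check1 != check2
```

If the disagreements all fall on observations with a missing year, industry, or decile, the difference comes from how missing values are grouped rather than from the min/max logic itself.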