Hello,
For my thesis I'm working on a cross-country study in which the number of observations differs from country to country. My model is the following:
Y = a + b*X + c*Z + d*I + e
(panel data)
where Y is an individual-level (company) variable, X is a country-level variable, Z is a set of country-level control variables, and I is a set of individual-level control variables.
In my sample, one of the 57 countries accounts for 50% of the observations. I think this will be a problem because the country with the most observations will carry more weight in the estimation, making the countries with few observations less important. The results could be driven by that one country.
I have looked for a solution and found a paper in a high-impact journal (Journal of Finance) that faces the same issue. The authors use weighted regression, weighting the individual-level observations by the inverse of the number of observations from each country. I then found the weighting options in the Stata help (pweight, aweight, fweight) and some related discussions.
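To make the idea concrete, my understanding is that the weighting in that paper would amount to something like the sketch below. The variable names (y, x, z, ctrl, country) are placeholders for my own variables, and using pweight here is only my guess, not something I have confirmed.

```stata
* Sketch only: y, x, z, ctrl and country are placeholder variable names.
* Within each country, _N is the number of observations from that country,
* so every observation gets weight 1/N_country and the weights in each
* country sum to 1.
bysort country: generate double w = 1/_N

* Weighted regression; pweight is an assumption on my part.
regress y x z ctrl [pweight = w]
```

With weights built this way, the country holding 50% of the sample would contribute the same total weight as any of the other 56 countries.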
However, I still do not understand the technical differences between these weighting options, and the discussions I found did not answer my question. Therefore, I would like to ask which weight option is appropriate for my case. I am a novice Stata user, and I hope someone can kindly guide me on this.
Thank you in advance
Leo