I know this is a rather simple question, but I could not find a solution to it. Suppose I have a panel dataset with X firms located in Y countries, where X > Y. The dataset contains a variable "GDP", which gives the GDP of the firm's home country in year t. I want to compute the median of GDP across countries in each year. For example, if Y = 5 and the GDP realizations in year t are {5, 10, 15, 20, 25}, then I want Stata to compute the median as 15. The obvious candidate is egen median=median(GDP), by(year), but that takes the median over firm observations, so countries hosting more firms get more weight. E.g., if I have 100 firms in the country with GDP = 25 and one firm in each of the other countries, the computed median will equal 25, while I want Stata to report 15 in this case as well.
One solution is to use "duplicates drop country year, force", compute the median with the egen command, save the result under a different file name, and merge it back into the original file. But I'm sure there is a much easier way to do this, and I hope someone has a good idea.
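One possible shortcut that avoids the drop, save and merge round trip is to tag a single observation per country-year and take the median over the tagged observations only. This is a minimal sketch, not a definitive answer: it assumes GDP is constant within each country-year and that the identifiers are named country and year (as in the duplicates drop command above). The toy data, the helper variable nfirms and the names tag and median_gdp are made up for illustration.

```stata
* Toy data (hypothetical): 5 countries in one year, with 100 firms in the GDP = 25 country
clear
input country GDP nfirms
1  5   1
2 10   1
3 15   1
4 20   1
5 25 100
end
expand nfirms            // one row per firm: 104 observations in total
gen year = 2000

* Flag exactly one observation in each country-year cell
egen tag = tag(country year)

* GDP is set to missing for untagged rows, so each country counts once;
* egen then assigns the resulting median to every observation in that year
egen median_gdp = median(cond(tag, GDP, .)), by(year)
```

In this toy example the firm-weighted egen median would be 25, while median_gdp is 15 for every firm. If GDP could differ across firms within a country-year, the tagged observation would be an arbitrary pick, so it would be worth checking that assumption first.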