Dear Statalists,
I am looking for general advice on how to approach the following problem.
I have a dyadic data set of country pairs, as in the data example below. var1 and var2 contain information for country1 and country2, respectively. However, the data are missing for some countries. Fortunately, I have a number of standardized distance measures between all countries (which is why the data are presented in dyad form here). Think of geographic distance, or of distance based on a vector of characteristics that the countries share or do not share.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 pair str2(country1 country2) float(var1 var2 dist1 dist2 dist3)
"0102" "01" "02" 3 .  .75  -.3   0
"0103" "01" "03" 3 4  .2    .8   1
"0104" "01" "04" 3 7  .3    .4  .5
"0203" "02" "03" . 4 -.23  -.9  .3
"0204" "02" "04" . 7 -.2   -.2 -.4
"0304" "03" "04" 4 7  .5  -.12  .5
end
In the end, I would like to interpolate the missing values (here missing for country "02") based on the existing data for var1 and var2 for the other countries ("01", "03", "04") weighted by the distance in each country pair ("0102", "0203" etc.). Ideally, the interpolation would make use of all three distance measures, assigning equal weights to each of them.
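One way to write down the target, purely as a sketch: the symbols and the weighting function $f$ below are assumptions for illustration and are not fixed by the description above. For a country $i$ with a missing value, average the three distance measures for each pair, turn the average into a weight, and take the weighted mean of the observed values of the other countries:

```latex
\bar d_{ij} = \frac{1}{3}\sum_{k=1}^{3} d^{(k)}_{ij},
\qquad
w_{ij} = f\!\left(\bar d_{ij}\right),
\qquad
\hat x_i = \frac{\sum_{j \in O} w_{ij}\, x_j}{\sum_{j \in O} w_{ij}}
```

Here $d^{(k)}_{ij}$ is dist1, dist2 or dist3 for the pair formed by countries $i$ and $j$, $x_j$ is the observed value for country $j$ (var1 when $j$ appears as country1, var2 when it appears as country2), and $O$ is the set of countries with an observed value. $f$ would need to be positive and decreasing, so that closer countries count more. Because the standardized distances can be negative, a plain inverse such as $1/d$ would not work as is; $f(d) = \exp(-d)$ is one hypothetical choice. For country "02" in the example, the averaged distances to "01", "03" and "04" are 0.15, about -0.277 and about -0.267.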
Does anyone here have experience implementing this kind of pairwise interpolation along more than one dimension? Are there any prewritten commands that could help me? As I understand it, mipolate would not be adequate for interpolation along multiple dimensions and/or in pairs. Please correct me if I am wrong.
As always, thank you very much for your insights.
Best,
Milan