I am having trouble applying a ratio estimator of population totals with more than one auxiliary variable in Stata 13.1.
Using one auxiliary variable:
My data contains a sample of observations drawn from the population. An auxiliary variable xi, correlated with yi, is obtained for each unit in the sample. Now I apply the ratio estimator to obtain Y, the population total of yi, where the population total X of the xi (auxiliary variable) is known:
Y^ = (y/x) * X, where y and x are the sample totals of yi and xi.
In Stata, I use the survey data commands and define the pweights as X/x. Note that these weights are not sampling weights (the inverse of the inclusion probability), as they would be in the Horvitz-Thompson estimator. That way, I can easily compute population totals with the ratio estimator:
Code:
svyset [pweight = weight]
svy: total y
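For completeness, here is a minimal sketch of how the constant weight X/x could be built before calling svyset. The scalar name Xtot and its value 50000 are hypothetical placeholders for the known population total X; x and y are assumed to be the variable names in the sample data.

```stata
* Sketch only: Xtot is a made-up value standing in for the known total X
scalar Xtot = 50000

* Sample total of the auxiliary variable is left in r(sum)
quietly summarize x

* Every observation gets the same weight X/x
generate double weight = scalar(Xtot) / r(sum)

svyset [pweight = weight]
svy: total y
```

Because every observation carries the same weight, the point estimate reported by svy: total y equals (y/x) * X.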
Using two auxiliary variables:
My problem is how to use the ratio estimator when there are two auxiliary variables, xi and zi, with known population totals X and Z respectively.
Olkin suggests a multivariate ratio estimator (Olkin (1958): Multivariate ratio estimation for finite populations, Biometrika, 45, 154-165, weblink: http://biomet.oxfordjournals.org/con...2/154.full.pdf)
Y^ = W1 * (y/x) * X + W2 * (y/z) * Z,
where W1 and W2 are weights to be determined to maximize the precision of Y^ subject to W1+W2=1.
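To spell out what "maximize the precision" means here, the constrained minimization can be written down as a sketch. The symbols a_{11}, a_{22} and a_{12} are notation introduced for this illustration (the variances and the covariance of the two single-variable ratio estimators); in practice they are unknown and would have to be estimated from the sample.

```latex
% Notation assumed for this sketch:
%   \hat{Y}_1 = (y/x) X,  \hat{Y}_2 = (y/z) Z
%   a_{11} = Var(\hat{Y}_1),  a_{22} = Var(\hat{Y}_2),  a_{12} = Cov(\hat{Y}_1, \hat{Y}_2)
\begin{align*}
\hat{Y} &= W_1 \hat{Y}_1 + W_2 \hat{Y}_2, \qquad W_1 + W_2 = 1, \\
\operatorname{Var}(\hat{Y}) &\approx W_1^2 a_{11} + 2 W_1 W_2 a_{12} + W_2^2 a_{22}, \\
W_1^{*} &= \frac{a_{22} - a_{12}}{a_{11} + a_{22} - 2 a_{12}}, \qquad
W_2^{*} = \frac{a_{11} - a_{12}}{a_{11} + a_{22} - 2 a_{12}}, \\
\operatorname{Var}(\hat{Y})_{\min} &\approx \frac{a_{11} a_{22} - a_{12}^2}{a_{11} + a_{22} - 2 a_{12}}.
\end{align*}
```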
Is there any way to do multivariate ratio estimation in Stata? As far as I can tell, Stata does not allow more than one weight variable in a survey setting.
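As a rough illustration of the point estimate only, the combination can be computed by hand with scalars rather than through svyset. In this sketch Xtot, Ztot and the value of W1 are hypothetical placeholders (W1 = 0.5 is not the optimal weight), and no standard error is produced.

```stata
* Sketch only: made-up values standing in for the known totals X and Z
scalar Xtot = 50000
scalar Ztot = 80000

* Sample totals of y, x and z
quietly summarize y
scalar ysum = r(sum)
quietly summarize x
scalar xsum = r(sum)
quietly summarize z
scalar zsum = r(sum)

* The two single-variable ratio estimates of the total of y
scalar Yx = (scalar(ysum) / scalar(xsum)) * scalar(Xtot)
scalar Yz = (scalar(ysum) / scalar(zsum)) * scalar(Ztot)

* Placeholder weight; W2 is implied by W1 + W2 = 1
scalar W1 = 0.5
scalar Yhat = scalar(W1) * scalar(Yx) + (1 - scalar(W1)) * scalar(Yz)

display "Combined ratio estimate: " scalar(Yhat)
```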
Another option is the calibration estimators proposed by Deville and Särndal (Calibration estimators in survey sampling, Journal of the American Statistical Association 87 (1992), 376–382, weblink: http://www.stat.unipg.it/~giovanna/d...le_sarndal.pdf), which incorporate auxiliary data. This approach is based on the Horvitz-Thompson estimator and therefore uses sampling weights defined as the inverse of the inclusion probability. The idea is to find weights w1 and w2 that are close to the sampling weights according to a distance function, such that the weighted sample totals match the known population totals exactly. However, calibration estimation does not seem really appropriate for my case, since the weights I use are not sampling weights but rather the ratio X/x (or Z/z). What do you think?
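To make the calibration idea concrete, here is a sketch of the linear (chi-square distance) case. The notation is assumed for this illustration: d_i are the sampling weights, q_i are known positive constants, the vector x_i stacks the auxiliary values (x_i, z_i) and t stacks the known totals (X, Z). The last line is the single-variable special case with q_i = 1/x_i, where the calibration weights appear to reduce to the ratio form X/x when all d_i are equal.

```latex
% Notation assumed for this sketch:
%   d_i : sampling weights,  q_i > 0 : known constants
%   \mathbf{x}_i = (x_i, z_i)^\top,  \mathbf{t} = (X, Z)^\top
\begin{align*}
&\min_{w_i} \sum_{i \in s} \frac{(w_i - d_i)^2}{d_i q_i}
\quad \text{subject to} \quad \sum_{i \in s} w_i \mathbf{x}_i = \mathbf{t}, \\
&w_i = d_i \left( 1 + q_i \mathbf{x}_i^\top \boldsymbol{\lambda} \right), \qquad
\boldsymbol{\lambda} = \Bigl( \sum_{i \in s} d_i q_i \mathbf{x}_i \mathbf{x}_i^\top \Bigr)^{-1}
\Bigl( \mathbf{t} - \sum_{i \in s} d_i \mathbf{x}_i \Bigr), \\
&\text{one auxiliary variable, } q_i = 1/x_i: \quad
w_i = d_i \, \frac{X}{\sum_{j \in s} d_j x_j}
\;\Longrightarrow\;
\hat{Y} = \frac{\sum_{i \in s} d_i y_i}{\sum_{i \in s} d_i x_i} \, X .
\end{align*}
```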
