  • Ideas for ratio estimation (X and Y from different surveys)?

    We are trying to estimate the rate of event X per hour exposed to certain activities (e.g. commuting) by combining statistics from two multi-year cross-sectional surveys, one for the denominator and the other for the numerator. However we don’t really know how to do this and are reaching out for any ideas.

    The first question is whether it is possible to estimate MEAN(X/Y) and VAR(X/Y), where X indicates cases and Y indicates hours of exposure. The typical ways to estimate a ratio (e.g. the delta method) may not work here, because X and Y are from two different surveys. We noticed that an old thread (http://www.stata.com/statalist/archi.../msg00723.html) suggests that it is theoretically possible to estimate MEAN(X/Y) even though X and Y are from different surveys. Would doing so be exceedingly complicated (e.g. re-weighting X, re-weighting Y, etc.)?

    So we came up with plan B: what about just estimating MEAN(X)/MEAN(Y), and hence VAR(MEAN(X)/MEAN(Y))? Admittedly, this does not give the average of individual rates, but simply the ratio of total events to total exposure hours, a method commonly used in epidemiological studies to estimate risk at the population level.

    The main question, then, is how to calculate VAR(MEAN(X)/MEAN(Y)) under plan B. Also, is estimating MEAN(X)/MEAN(Y) as straightforward as we thought, i.e. just substitute the sample mean for MEAN(X), and do the same for MEAN(Y)?

    Thank you kindly

    Tin-chi Lin

    Research Scientist
    Center for Injury Epidemiology
    Liberty Mutual Research Institute for Safety
    71 Frankland Road
    Hopkinton, MA 01748

  • #2
    Is this basically the same as the issues raised in this thread? http://www.statalist.org/forums/foru...erent-datasets
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam


    • #3
      Yes indeed, although back then our concern was primarily MEAN(Y/X). So the solution that Steve Samuels provided in that thread (thank you!), which is about MEAN(Y)/MEAN(X), did not capture our full attention until recently, when we realized that MEAN(Y)/MEAN(X) may be another reasonable statistic to study. Thank you for pointing this out.


      • #4


        Your target parameter is \(R = T_y/T_x\), where the T's refer to totals in the population. The question is how well the ratio of sample totals estimates this ratio. This question is well studied in the sampling literature; see, e.g., Lohr (2009) and Cochran (1977). The answer: the ratio of sample totals is biased as an estimate of the ratio of population totals, but if the X total is estimated with sufficient accuracy, the bias is small compared to sampling error. The formula I gave is the standard delta method approximation to the standard error of the log ratio, under the assumption of zero correlation between the estimates of Y and X, which is true here because the surveys were independent.
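
        As an illustration only (the formula from the earlier thread is not reproduced here), the textbook delta-method form for two independent estimates is \(\widehat{\mathrm{Var}}(\log \hat{R}) \approx \mathrm{SE}(\hat{T}_y)^2/\hat{T}_y^2 + \mathrm{SE}(\hat{T}_x)^2/\hat{T}_x^2\), with no covariance term. A minimal Python sketch of that calculation follows. It assumes each survey's estimate and design-based standard error have already been obtained separately (for example from `svy: total` in Stata); the function name and all numbers are hypothetical.

```python
import math


def ratio_with_log_ci(est_y, se_y, est_x, se_x, z=1.96):
    """Ratio of two independent survey estimates, with a delta-method
    standard error and a confidence interval built on the log scale.

    est_y, se_y: numerator estimate and its standard error (survey 1)
    est_x, se_x: denominator estimate and its standard error (survey 2)
    Assumes the two estimates are independent (zero covariance).
    """
    if est_y <= 0 or est_x <= 0:
        raise ValueError("both estimates must be positive to take logs")

    ratio = est_y / est_x

    # Squared relative standard errors add because the covariance is zero.
    se_log = math.sqrt((se_y / est_y) ** 2 + (se_x / est_x) ** 2)

    # Approximate standard error on the original (ratio) scale.
    se_ratio = ratio * se_log

    # Interval computed on the log scale, then back-transformed.
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, se_log, se_ratio, (lower, upper)


# Hypothetical inputs: 1200 events (SE 90) and 4 million exposure hours (SE 200000).
ratio, se_log, se_ratio, ci = ratio_with_log_ci(1200.0, 90.0, 4.0e6, 2.0e5)
print(f"rate per hour: {ratio:.6g}")
print(f"SE of log ratio: {se_log:.4f}")
print(f"95% CI: ({ci[0]:.6g}, {ci[1]:.6g})")
```

        Building the interval on the log scale keeps the lower bound positive, which suits a rate. The same arithmetic applies to a ratio of means, since only the relative standard errors enter.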

        References:

        Cochran, William G. 1977. Sampling Techniques. 3rd ed. New York: Wiley.
        Lohr, Sharon L. 2009. Sampling: Design and Analysis. Boston, MA: Cengage Brooks/Cole.
        Steve Samuels
        Statistical Consulting
        [email protected]

        Stata 14.2
