  • Survey domain analysis in regression

    There are plenty of examples of domain analysis for things like frequencies or means, but fewer for regression. A few questions:

    1) Is it necessary to conduct domain analyses for variables that are part of the sample design? For example, say a sample was stratified on gender but not region, and I want to test mean differences on some variable by gender and region. Could I get away with a domain analysis on gender alone?

    2) Most regression examples deal with domain variables that are not covariates. Say I want to run the following model: y=gender+region where gender is part of the sample design and region is not. Which covariates, if any, should be included as domain variables?

    3) Let's say I have the following model: y=a+b+c where I'll assume a, b and c are all domain variables. How can I create an appropriate domain analysis? The only clue I could find was here: http://www.mwsug.org/proceedings/2012/SA/MWSUG-2012-SA07.pdf. On p. 14 there is a technique that includes creating a new weight variable. So, I could create a new weight:

    wt_new=wt_old*(a+.000000000001)*(b+.000000000001)*(c+.000000000001)

    and include it in my svyset.

    Any thoughts?

  • #2
    Bill, you are misusing the phrase "domain variables". A "domain" (or "subpopulation" in Stata jargon) is a subset of the population that you want to analyze without reference to other population members. I've never seen a study that stratified on gender but not region, so I'll switch their roles in my answer (stratify on region, but not gender). So: an analysis of males only or of females only requires a subpopulation prefix. A joint regression with both does not.

    1. If the sample was stratified on a variable (region), but not on gender, then you need only specify gender as a subpopulation.
    2. You need not specify gender as a subdomain in the model, as the model is not restricted to one gender.
    3. What exactly is the domain, in the strict sense of subset, being analyzed here? I don't see one, so the trick in the linked example doesn't apply.


    Note that Taylor Lewis's trick works when there is a subset: it gives near-zero weights to people not in the subset. However, the trick requires that observations from all sample members be included in the analysis. This is irksome if the whole sample is huge (think 14 years of NAMCS) and the number of subpopulation members is relatively small. Austin Nichols showed how to augment the subpopulation observations with a small number of observations from outside the subpopulation, one from each PSU and stratum. For details, see his post here.
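
    To make the near-zero weight idea concrete, here is a minimal sketch in plain Python (not Stata code, and not a description of how Stata's svy commands are implemented). It assumes a stratified design with PSUs treated as sampled with replacement, a made-up ten-row sample, and a hypothetical helper named weighted_mean_and_variance. It compares three ways of estimating a domain mean and its linearized variance: zero weights outside the domain with every row kept, the near-zero weight trick, and naively dropping the rows outside the domain.

```python
from collections import defaultdict

# Hypothetical toy sample: (stratum, psu, weight, in_domain, y).
# PSU 2 in stratum 1 contains no domain members at all.
SAMPLE = [
    (1, 1, 10.0, 1, 5.0), (1, 1, 10.0, 1, 7.0),
    (1, 2, 10.0, 0, 3.0), (1, 2, 10.0, 0, 4.0),
    (1, 3, 10.0, 1, 9.0), (1, 3, 10.0, 0, 2.0),
    (2, 4, 20.0, 1, 6.0), (2, 4, 20.0, 0, 1.0),
    (2, 5, 20.0, 1, 8.0), (2, 5, 20.0, 1, 10.0),
]


def weighted_mean_and_variance(rows, eff_weights):
    """Weighted mean of y and its linearized variance.

    Assumes PSUs are sampled with replacement within strata.
    """
    total_w = sum(eff_weights)
    mean = sum(w * r[4] for r, w in zip(rows, eff_weights)) / total_w

    # Sum the linearized scores w * (y - mean) / total_w within each PSU.
    # A PSU whose rows all have zero weight still gets an entry (of 0.0).
    psu_scores = defaultdict(float)
    for r, w in zip(rows, eff_weights):
        psu_scores[(r[0], r[1])] += w * (r[4] - mean) / total_w

    # Collect the PSU totals by stratum.
    by_stratum = defaultdict(list)
    for (stratum, _psu), score in psu_scores.items():
        by_stratum[stratum].append(score)

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for scores in by_stratum.values():
        n = len(scores)
        centre = sum(scores) / n
        variance += n / (n - 1) * sum((s - centre) ** 2 for s in scores)
    return mean, variance


EPS = 1e-12  # same order of magnitude as the constant in the weight formula

# (a) Keep every row; rows outside the domain get weight exactly zero.
subpop = weighted_mean_and_variance(SAMPLE, [r[2] * r[3] for r in SAMPLE])

# (b) Near-zero weight trick: keep every row, weight = w * (indicator + EPS).
trick = weighted_mean_and_variance(
    SAMPLE, [r[2] * (r[3] + EPS) for r in SAMPLE]
)

# (c) Naive approach: drop rows outside the domain before estimating.
domain_rows = [r for r in SAMPLE if r[3] == 1]
naive = weighted_mean_and_variance(domain_rows, [r[2] for r in domain_rows])

print("subpop:", subpop)
print("trick: ", trick)
print("naive: ", naive)
```

    In this toy sample all three approaches give essentially the same point estimate, and (a) and (b) agree on the variance up to rounding. The naive subset (c) gives a different variance, because stratum 1 loses the PSU that has no domain members and the between-PSU calculation then sees two PSUs instead of three. That is why the trick needs every sample member, or at least one observation from each PSU and stratum, to stay in the analysis.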
    Last edited by Steve Samuels; 06 Nov 2015, 15:25.
    Steve Samuels
    Statistical Consulting

    Stata 14.2



    • #3
      Yes, Steve, your definition is the one with which I'm familiar. Someone asked whether the covariates in my regression represented domain variables as well, especially in light of the fact that I might want to run contrasts. I had never thought about that before, but my expertise is not really in survey analysis, so I thought I'd put the question here. The conclusion I'm hearing is that covariates in a regression do not represent domain variables.



      • #4
        I agree with that conclusion, Bill.

        Steve
        Steve Samuels
        Statistical Consulting

        Stata 14.2
