  • Using XT commands with Svy commands

    Hi Everyone,

    I am trying to perform a longitudinal analysis of Medical Expenditure Panel Survey data, and I'm just now finding out that Stata does not support XT models (xtreg, xtpoisson, xtprobit, xtgee, etc.) while using svy commands. Does anyone have any advice for how to analyze these data while taking into account the complex survey design (iweight, psu, strata) in the standard errors?

    Thanks!
    Ryan

  • #2
    The multilevel mixed-effects models all support the svy: prefix.
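
    A minimal sketch of what that can look like, assuming hypothetical variable names (psu, stratum, personid, w_psu, w_person, w_obs, visits, age, female) that are not actual MEPS names. With the me commands, svy expects a weight for each stage, declared through the weight() option of svyset, rather than a single final [pweight]. How to build stage-level weights from a single survey weight is a separate question that this sketch does not settle.

    ```stata
    * Hypothetical variable names throughout; replace with your own.
    * Three stages: PSUs, persons within PSUs, observations within persons.
    svyset psu, strata(stratum) weight(w_psu) || personid, weight(w_person) || _n, weight(w_obs)

    * Random intercepts for PSU and person, with design-based standard errors.
    svy: mepoisson visits age i.female || psu: || personid:
    ```

    The model hierarchy here is written to mirror the svyset stages; that layout is an assumption about the data, not a requirement of MEPS.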



    • #3
      Thanks for the response, Clyde. I'm going to expose my ignorance here, but what are the implications of using a mixed effects model versus just a standard xtreg, fe or an xtreg, re?



      • #4
        The mixed effects models are more or less equivalent to their -xt..., re- counterparts, although they allow more levels and also allow for random slopes. The estimation algorithms are different, so the results are not always exactly the same, but they are close. They don't do anything equivalent to the -xt..., fe- models, however. And they can be computationally intensive, and sometimes it takes a lot of patience to tweak the model into converging.

        I would say that if you are working with MEPS data, fixed effects models would be very cumbersome in any case because the number of respondents is very large. But I think a far bigger problem with this kind of data than -fe- vs -re- is just model specification. The expense variables will have difficult-to-work-with distributions, with a lot of zero inflation, as well as some pretty extreme outliers. Normal theory will not really apply. Using -mepoisson- or -menbreg- might be the way to go. Of course, I don't know what your particular goals are, and I might be off base here, but I just wanted to give you some general advice.



        • #5
          Thanks again, Clyde, for your response. What would be the effect on my estimates if, instead of using svy: mepoisson and adjusting for the PSU and strata and using the weight, I skipped the survey commands and used xtpoisson with only vce(cluster clustvar) and the weights? I guess what I'm asking is, how important is it to include the strata variable when adjusting the models for survey design?



          • #6
            Well, the absolutely most important thing is to use the pweights. If you do unweighted regressions on survey data your coefficients will be biased, potentially very severely so.

            The strata and psu's affect the standard errors, not the coefficient estimates. Using vce(cluster) is getting, in a vague sense, at the effect of primary sampling units, but isn't really the same thing. But in no way does vce(cluster) emulate the effect of strata. There are no general rules that tell you just how far off your standard errors will be if you ignore the strata and psu's in your analysis. The ratio between the variance calculated with full survey adjustments and the variance calculated as if the data were a simple random sample is known as the design effect (DEFF); its square root, the corresponding ratio of standard errors, is usually called DEFT. It can, in principle, be any positive number. And even though I have only limited experience using survey data in my career, I have seen values both much bigger and much less than 1, and also some close to 1. So it's hard to know if the problem will be mild or severe, or even in which direction you might go wrong.
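
            For a concrete look at design effects, here is a small sketch, assuming hypothetical variable names (psu, stratum, pw, totexp). After a svy estimation command such as svy: mean, estat effects reports DEFF (the variance ratio) and DEFT (the ratio of standard errors).

            ```stata
            * Hypothetical variable names; pw is the final sampling weight.
            svyset psu [pweight = pw], strata(stratum)
            svy: mean totexp

            * DEFF: design variance relative to a simple random sample of the same size.
            * DEFT: the corresponding ratio of standard errors.
            estat effects, deff deft
            ```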

            With regard specifically to strata, it is possible to say this. The rationale for doing stratified sampling is often that with the same total sample size, stratified sampling produces smaller standard errors, i.e. more efficient estimates, than simple random sampling. In order for this to actually work out in practice, the strata should differ in the distribution of the outcome variable. In many surveys, the strata are, in fact, selected with this in mind. So ignoring them could lead to substantial overestimates of the standard errors. But if you are analyzing an outcome variable that is different from the one the survey designers were planning for, and if that outcome variable has the same distribution across the strata, then there could be less harm done.

            But, again, at the end of the day, it's hard to know this other than by doing the analysis both ways and seeing how much difference it made, which probably isn't what you want to do. I would advise against ignoring any aspects of the survey design here. Sometimes there is no alternative--the data set doesn't provide the information, or there simply is no software that will do what is needed. But in your situation, the necessary data and commands are at your disposal.
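
            If you do want to run it both ways, a rough sketch follows, assuming hypothetical variable names (psu, stratum, pw, visits, age, female). Plain poisson is swapped in for the xt and me commands only to keep the comparison simple; it is not the model discussed above.

            ```stata
            * Hypothetical variable names; pw is the final sampling weight.
            svyset psu [pweight = pw], strata(stratum)
            svy: poisson visits age i.female
            estimates store full_design

            * Same weights, clustering on PSU, but strata ignored.
            poisson visits age i.female [pweight = pw], vce(cluster psu)
            estimates store cluster_only

            * Same weights give the same coefficients; compare the standard errors.
            estimates table full_design cluster_only, b(%9.4f) se(%9.4f)
            ```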



            • #7
              To get proper standard errors for subpopulations, you'll need to svyset. Then, for many svy estimation commands, you'll add a subpop() option. For other survey commands that operate on groups, e.g. using over(), Stata will supply the proper subpopulation adjustment. In any case, using if to subset an analysis can produce incorrect standard errors, usually too large. That said, to properly weight the me commands, you need to specify weights at each level. See p. 59 of the SVY manual and p. 80 of the ME manual.
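
              A short sketch of the two subpopulation routes, assuming hypothetical variable names (psu, stratum, pw, totexp) and a 0/1 indicator female:

              ```stata
              * Hypothetical variable names; female is a 0/1 indicator.
              svyset psu [pweight = pw], strata(stratum)

              * Subpopulation estimate: the full design is kept for variance estimation.
              svy, subpop(female): mean totexp

              * Group-wise estimates; svy handles the subpopulation adjustment.
              svy: mean totexp, over(female)
              ```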
              Steve Samuels
              Statistical Consulting

              Stata 14.2

