  • Two-level Logistic Regression with Complex Survey Design - Query


    I am relatively new to Stata (and Multilevel Modelling more broadly), and would be grateful for some support. My question is regarding running a 2-level logistic regression.

    More specifically, I am trying to run a 2-level logistic regression that takes the Complex Survey Design into account, but I am not sure my Stata code is correct. I have chosen -melogit- because I cannot use -svyset- with -xtlogit-. Please note, I am using Stata 15.1.

    My data are 2-level in that they reflect multiple observations over time (2009-2018) nested within an individual (identified by the pidp variable in the code below). My dependent variable, empstatus, is coded as 1=employed and 0=unemployed. My dataset comprises approximately 160,000 observations.

    I have gone through a variety of material, including the Stata Manual and Rabe-Hesketh, S., & Skrondal, A. (2006). Multilevel modelling of complex survey data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169(4), 805-827, but I haven't always fully understood them, particularly the discussion around needing to have weights at both levels.

    I have also looked at previous questions raised, but haven't managed to find advice on a specific way to get Stata to run the model I want. I have found advice on applying weights in a multilevel logit model, for example through the following code:
    melogit empstatus i.gender age i.race [pweight=h_indinui_lw] || pidp: , allbase
    However, unless I am mistaken, such code doesn't also take into account clustering and stratification (identified through my psu and strata variables in the code below), which is what I would like to do.

    Of all the code I have played around with, the code below makes the most sense to me (although, as I said, I am not certain it is doing what I want).

    Step 1: Communicate the Complex Survey Design

    svyset, clear
    svyset psu, strata(strata) weight(h_indinui_lw) singleunit(scaled)
    Note: If I were using the -xtreg- command I would normally run the -svyset- command as follows:

    svyset, clear
    svyset psu [pweight = h_indinui_lw], strata(strata) singleunit(scaled)
    However, that generates an error when I then run the -melogit- command, so I specified -svyset- differently in this case.
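
    As a sanity check on Step 1, the registered design can be redisplayed before fitting any model. This is only a minimal sketch, assuming the -svyset- call from Step 1 has just been run on the data in memory; it shows what Stata has stored, not whether the design is the right one for a multilevel model.

```stata
* Redisplay the survey settings currently attached to the data
* (weight, PSU, strata and single-unit handling)
svyset

* Summarise the strata and the number of PSUs within each stratum
svydescribe
```

    If the weight, PSU and strata variables listed in the output are not the ones intended, the -svyset- call is the place to look.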

    Step 2: Run the 2-level logistic regression taking into account Complex Survey Design

    svy: melogit empstatus i.gender age i.race || pidp: , or allbase

    The command runs with no error in Stata 15.1, but I'm not sure if the output reflects what I actually think I am running. Therefore, I would be very grateful if you could advise:

    (i) if the code above (Step 1 and Step 2 combined) is indeed asking Stata to run a 2-level logit regression taking into account Complex Survey Design (i.e. clustering, weights, and stratification all at the same time)

    (ii) if there is a book or article you can point me to that goes through multilevel logistic regression with a specific focus on how to apply Complex Survey Design using Stata.

    Many thanks in advance for your help,

    Samir Sweida-Metwally
    Last edited by Samir Sweida-Metwally; 23 Aug 2019, 10:17.