I'm trying to fit a mixed effects logistic regression model. My data set contains information about approximately 8,000 care providers and about 1.2 million patients in a multiple membership model.
The standard way to do this in Stata is with crossed random effects. But a fully crossed random effects model would involve almost 10 billion patient-provider dyads--and, unsurprisingly, when I try to fit this model, Mata complains that the operating system will not give it the memory needed for a matrix that large. (My installed RAM is 64 GB.) In my data, however, of these 10 billion potential dyads in a fully crossed model, only about 4.8 million actually occur, as most of the patients are seen by only a few of the doctors.
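To make the scale concrete, here is a rough back-of-envelope sketch of the arithmetic behind those figures. The counts are the approximate ones quoted above. The storage figure assumes, purely for illustration, one 8-byte double per dyad; that is an assumption about a hypothetical dense layout, not a description of how Stata or Mata actually allocates memory.

```python
# Back-of-envelope arithmetic using the approximate counts from the post.
n_providers = 8_000
n_patients = 1_200_000
n_observed_dyads = 4_800_000

# Every patient paired with every provider (the fully crossed case).
potential_dyads = n_providers * n_patients

# Share of the potential dyads that actually occur in the data.
fill_fraction = n_observed_dyads / potential_dyads

# Average number of providers seen per patient.
providers_per_patient = n_observed_dyads / n_patients

# ASSUMPTION (illustrative only): one 8-byte double stored per dyad.
BYTES_PER_DOUBLE = 8
dense_gb = potential_dyads * BYTES_PER_DOUBLE / 1e9
observed_gb = n_observed_dyads * BYTES_PER_DOUBLE / 1e9

print(f"potential dyads:        {potential_dyads:,}")
print(f"fraction observed:      {fill_fraction:.4%}")
print(f"providers per patient:  {providers_per_patient:.1f}")
print(f"dense storage (GB):     {dense_gb:.1f}")
print(f"observed-only (GB):     {observed_gb:.4f}")
```

Under that assumption, even a single value per potential dyad would exceed 64 GB, while the observed dyads alone would need a small fraction of a gigabyte.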
Is there any way to fit this model in Stata? Is there perhaps a user-written command that would tackle the calculation using only the amount of memory needed for the instantiated dyads? Would -meqrlogit- do this? Or is there some way to decompose the data set into chunks, estimate separately, and then somehow recombine the results?
My best solution so far is to use a single random effect for the dyad. But, implicitly, this reflects a patient-provider interaction which, given the particular nature of the outcome being studied, I do not expect to exist. Since I don't really need separate estimates of provider and patient variance components to answer my research question, this does no harm. But if I really needed separate provider and patient variance component estimates, does anybody know of a way to do it?