How to I simulate a random intercept and slope mixed poisson model

Andrew Xavier

Join Date: Jul 2018

Posts: 57
#1

How to I simulate a random intercept and slope mixed poisson model

13 Jul 2021, 07:53

Hi,

Please could you tell me how do I simulate a random coefficient mixed poisson model?

How do I simulate the model below:

Code:

webuse epilepsy, clear mepoisson seizures i.treat c.visit || subject: visit, irr

Is the following type of code correct in simulated a mixed random effects model poisson? [i don't think so]

Code:

clear set obs 200 gen id=_n expand 4 bysort id: gen visit=_n gen treat = rbinomial(1,0.5) gen re = sqrt(.80)*rnormal() sum re bysort id: replace re = re[1] gen rc = sqrt(.50)*rnormal() sum rc bysort id: replace rc = rc[1] gen xb=ln(6)+ln(.70)*visit+ln(.90)*treat + (re + rc*visit) gen mu = exp(xb) gen y = rpoisson(mu) mepoisson y visit treat || id: visit, irr

Many thanks

A

Last edited by Andrew Xavier; 13 Jul 2021, 08:04.
Tags: count, mepoisson, mixed, poisson, simulate
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

13 Jul 2021, 13:26

Well, it is a valid simulation, but what it simulates is a rather bizarre data generating process that I don't think you intended. Moreover, this will take a really long time to converge, if it ever does.

The reason it probably will never converge, and what makes it a bizarre process is that treatment randomly gets turned on and off at each visit. That's going to make estimating the effect of visit, especially when finely ramified as a random slope, very difficult.

I wonder if you didn't intend to have a two arm model, wherein each patient is either treated at all visits, or never treated. That could be simulated by moving the -gen treat = ...- statement up before the -expand 4- command. Or perhaps you have in mind a cross-over design where one arm gets treated in visits 1 and 2, and the other gets treated in visits 3 and 4. That would be fairly simple to code, this time after -expand 4-. (Left as exercise for the reader.)
Comment
Andrew Xavier

Join Date: Jul 2018

Posts: 57
#3

13 Jul 2021, 13:59

Dear Clyde,

You are right! I did not realise the problem in my code: I meant treatment to be constant within each patient

Re: this explains why the model does not converge. However, I am also realising that even when I fix the treatment by generating it before expand command, the mepoisson still struggles with convergence. Do you know why?

Again, many thanks!

Andrew
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#4

13 Jul 2021, 14:13

Well, I tried running it on my setup and, although it took a fairly large number of iterations, it did converge, and the estimates produced were approximately equal to the parameters you specified in your code.

Random slope models can be difficult to estimate. Particularly with only 4 observations per id, you don't have a a lot of information about the visit slope, and even less for random slope estimation.

By the way one more thing that is a "best practice" for simulations. In most situations, you want your results to be reproducible when you rerun your code. For that to happen, you have to set the seed for the random number generator before you use it the first time. I've formed the habit of doing it right before the -set obs- command. That way I don't forget, or accidentally put it after a use of the random number generator. It doesn't matter what seed you pick. You just want to record it in the code so that each time you run the code, the random number generator starts out in the same place and re-generates the same sequence of random numbers.
Comment
Andrew Xavier

Join Date: Jul 2018

Posts: 57
#5

13 Jul 2021, 14:14

Dear Clyde,

Many thanks for your suggestion. I will add seed to my simulations. And yes, the problem of visit is a problem and it makes sense why it is difficult to converge.

Please could you show me how I simulate a mixed random intercept and slope negative binomial model?

Many thanks!

Andrew

Last edited by Andrew Xavier; 13 Jul 2021, 14:17.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#6

13 Jul 2021, 14:51

It's not all that different. The complication arises in parameterizing the negative binomial distribution function that replaces the poisson distribution. So, you have already calculated mu. You also need to choose a dispersion parameter, r, which determines just how far from Poisson the particular negative binomial is. So you have to calculate:

Code:

local r = some_non_negative_real_number_you_choose local p = `r'/(mu+`r') gen y = rnbinomial(`r', `p')
Comment

Andrew Xavier

Join Date: Jul 2018
Posts: 57

14 Jul 2021, 07:09

Many thanks,

I implemented your code and searched about this online

I found this paper online and tried to adapt the code.

Hilbe, Joseph M. "Creating synthetic discrete-response regression models." The Stata Journal 10.1 (2010): 104-124.

I tried 3 different ways to generate the negative binomial but I do not know the difference in terms of which one to use (and differs from the code you recommended). Please could you clarify the different between them?

Code:

set seed 4334
clear
set obs 500

gen id=_n

gen treat = rbinomial(1,0.5)

expand 4

bysort id: gen visit=_n

gen re = sqrt(2)*rnormal()
sum re
bysort id: replace re = re[1]

gen rc = sqrt(0.15)*rnormal()
sum rc
bysort id: replace rc = rc[1]

gen xb=ln(3)+ln(0.95)*visit+ln(0.85)*treat + (re + rc*visit)

generate exb = exp(xb)

generate delta3 = .2
gen p= delta3/(exb + delta3)
gen nby3 = rnbinomial(delta3, p)

menbreg nby3 visit i.treat || id: visit, irr difficult

display 1/exp(_b[/lnalpha])

//==========================

generate delta = .2
generate idelta = 1/delta
generate xg = rgamma(idelta, delta)
generate xbg = exb*xg
generate nby = rpoisson(xbg)

menbreg nby visit i.treat || id: visit, irr difficult

//==========================

generate delta2 = .2
generate idelta2 = (1/delta2)*exb
generate xg2 = rgamma(idelta2, 1/idelta2)
generate xbg2 = exb*xg2
generate nby2 = rpoisson(xbg2)

menbreg nby2 visit i.treat || id: visit, irr difficult

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#8

14 Jul 2021, 09:33

The first of the three methods you show is exactly the same as mine, just in different notation. Substitute delta3 for `r', and exb for mu, and `p' for p, and you will see they are the same. (The code you show uses variables instead of local macros, but that makes no differences because these "variables" are actually constants." It's a harmless waste of storage to use variables to hold constants.

The second two that you show are different from mine, but are the same as each other, again using different notation, but they end up calculating exactly the same things. However, I am only vaguely familiar with this approach--I may have once used it a long time ago. If you plan to do a simulation with some gargantuan data set, then the second two methods will be slower because they require two random samples instead of one. But it would have to be a truly enormous data set for the performance difference to really be noticeable.
Comment
Andrew Xavier

Join Date: Jul 2018

Posts: 57
#9

14 Jul 2021, 10:26

Yes, the method that generated nby3 is the one you recommended.
The methods that generated nby and nby2 are not the same because in one we have

Code:

generate exb = exp(xb) generate idelta = 1/delta generate xg = rgamma(idelta, delta)

and in the other we have

Code:

generate exb = exp(xb) generate idelta2 = (1/delta2)*exb generate xg2 = rgamma(idelta2, 1/idelta2)

And they both lead to slightly different estimates Also, I find the code that generates nby seems to be the best in returning very similar results to the model parameters used in the simulation. The approach you recommended (nby3) gives very approximate but not always similar estimates to the initial parameters even when I generate a very large sample size Many thanks
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#10

14 Jul 2021, 11:07

You are right, they are not exactly equivalent. The first of the two is the one that seemed vaguely familiar. The second is not one I have seen or used before, although it does produce similar results.

Very interesting. Thanks for sharing those.
Comment

Announcement