Dear Stata Users,
I am having trouble getting a melogit model to converge. I have a longitudinal sample from the SHARE survey (Survey of Health, Ageing and Retirement in Europe), in which respondents (variable: mergeid) were interviewed multiple times across waves (variable: wave) in different countries (variable: country). I want to assess whether tobacco consumption (binary variable: pres_smoke) changed over time.
Therefore, I want to fit a multilevel model with mergeid at the 2nd level and country at the 3rd level. I am not particularly interested in differences between participants, but because respondents were interviewed multiple times, my observations are not independent, and I have to account for that in the model so that the independence assumption of logistic regression is not violated.
However, even for my simplest model, in which I include only the dependent variable for smoking and the ID variable at the 2nd level, I get an error message saying that the model cannot be improved and that a flat region was encountered. I have already read some great tips in the forum, but so far none of them has worked for my model.
For example, I first fitted a logit model and saved its estimates in a matrix to use as starting values. I also tried adjusting the tolerance value and tried different optimization techniques.
Code:
bysort mergeid: gen mergeid_n = _n == 1
replace mergeid_n = sum(mergeid_n)
xtset mergeid_n wave
logit pres_smoke mergeid_n
matrix b = e(b)
melogit pres_smoke || mergeid_n: , or from(b, skip) tolerance(1e-15) technique(bfgs) /*iter(500)*/
Code:
bysort mergeid: gen mergeid_n = _n == 1
replace mergeid_n = sum(mergeid_n)
xtset mergeid_n wave
logit pres_smoke mergeid_n
matrix b = e(b)
melogit pres_smoke || mergeid_n: , or from(b, skip) tolerance(1e-15) technique(nr 5 bhhh 5) /*iter(500)*/
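I suspect the problem arises because the number of groups is very high. The dataset is very large, with around 130 000 respondents measured multiple times, so there are around 130 000 groups for mergeid at the higher level. This is also why I tried different techniques such as Broyden-Fletcher-Goldfarb-Shanno (bfgs), but that has not helped so far.
The meqrlogit model did not work either: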
Code:
meqrlogit pres_smoke || mergeid_n: ,or from(b, skip) technique(bfgs) difficult /*iter(500)*/
Another reason could be that there is not much variation between the respondents. As I said, based on my hypotheses I am not really interested in the random effect for the respondents, but as I understand it, I need to include the ID variable so that I do not violate the assumption of independence. I know that I could fit other models instead of a multilevel model, but I would like to do a multilevel analysis because I want to include country-level variables as a next step.
I am out of ideas at the moment and would be very grateful for any suggestions from other Stata users. I apologize for not using the dataex command, but the SHARE data are confidential and I am not allowed to share them anywhere.
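To make the target specification concrete, the three-level model I am aiming for might look roughly like the sketch below. It is only an illustration of the intended structure, not a model that has been run: id_num is a hypothetical numeric respondent ID, and it assumes that wave enters as a categorical predictor, that country is stored as a numeric variable, and that respondents are nested within countries.

```stata
* Sketch only, not a model that has been run on the SHARE data.
* Assumption: id_num is a hypothetical numeric respondent ID built from mergeid.
egen long id_num = group(mergeid)

* Assumption: wave enters as a categorical time effect; country is numeric.
* Levels are listed from highest (country) to lowest (respondent).
melogit pres_smoke i.wave || country: || id_num: , or
```

Country-level variables would then be added to the fixed part of the model alongside i.wave.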
Kind regards,
Josefine