Multi-Level model - how to define it in STATA?

Sofia Oliveira

Join Date: Jun 2018

Posts: 8
#1


19 Jun 2018, 16:19

Hello

I am having difficulty figuring out how to run FE and RE models in STATA for my multi-level model.

A bit of background: this is a health care research question. I am looking at variation in patient-reported quality of life before and after treatment (in this case a drug) across hospitals in order to infer hospital performance. Patients take a drug aimed at relieving their specific clinical problem. If the drug does not seem to work for them, doctors switch them to another drug. Patients report their quality of life at baseline and again after 6 months for each drug. Some patients therefore end up taking several drugs and report quality of life measures for each one. So I have episodes (different drugs) clustered in patients, who are clustered in hospitals.

My model is quality_of_life_6months_ijt = controls_ijt + quality_of_life_0months_ijt + hospital effect_it + error term_ijt. I need to use FE or RE to calculate the hospital effect.

I once ran a model of the effect of marriage on mental health on panel data with subscripts it (i being individual_id and t being year), so in STATA I did xtset individual_id year. Now I need something like xtset hospital_id individual_id number_of_drugs, but this does not work. I don't have time as a variable now; quality_of_life_0months and quality_of_life_6months enter the model separately. I don't want to use xtmixed; I need to use xtreg, fe or xtreg, re. Can you please help on how to do this in STATA?

Another possibility is a generalised linear model. Could you also help with how to do this in STATA?

Thank you very much for all the help. Really appreciate it.

Oliveira
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

19 Jun 2018, 16:48

Welcome to the Stata Forum/Statalist,

If you have two or more levels, you need a mixed model. Please be aware that -mixed- is the current command; it replaced -xtmixed-. Also, please write Stata rather than STATA, as the FAQ advises.

Best regards,

Marcos
Sofia Oliveira

Join Date: Jun 2018

Posts: 8
#3

19 Jun 2018, 17:13

Hi Marcos

Thank you very much for your reply and sorry about the capitals. I went through some things to do before posting but I guess I didn't see that note. Based on what you are telling me I have 2 questions:

1- I don't think I can run FE in a mixed model, but I need to run a FE model. How should I do it?

2- What if my model only has 2 levels: hospital_id and individual_id? I don't think I can use xtset hospital_id individual_id anyway, because xtset only accepts panelvar timevar or just panelvar. How do I run a FE model for these 2 levels?

I am new to these models and Econometrics, these questions may be basic but I am struggling and would appreciate any help.

Thank you!
Oliveira
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#4

19 Jun 2018, 18:53

These are somewhat complex models, all the more so with 3 levels and given that you want an FE, xtreg-like approach.

To start, I didn’t understand why an FE-xtreg-like model should be the only way to tackle your 3-level analysis.

Second, even with the ‘right’ command, we may need more than that so as to go further with the analysis and interpretation.

These aspects being underlined, maybe this thread will interest you: https://www.statalist.org/forums/for...ce-using-mixed


Best regards,

Marcos
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#5

20 Jun 2018, 11:10

You might also look at reghdfe (user written) and the multilevel mixed effects documentation and estimators.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#6

20 Jun 2018, 11:41

Sofia:
you seem to have a sample of patients nested (not clustered) within hospitals who receive one or more drugs during the study (I do not see drugs administration as a further nesting level).
From your description, as previous helpful replies underscored, the recommended approach would be -mixed-, which holds a tight relationship with -xtreg, re mle- (hence, a random effect specification seems the way to go).
A panel data regression with a fixed effect specification (-xtreg, fe-) focuses on the differences within the same panel (i.e., each patient in your case) but gets rid of time-invariant predictors (for instance, if patients are hospitalized in, or refer to, the same hospital for the entire study duration, what you call "the hospital effect" will obtain no coefficient in your regression).
Choosing between -xtreg, fe- and -xtreg, re- may be tricky. However, some tests do exist.
If you plan to use default standard errors, you can rely on -hausman- test to support your choice, whereas the user-written programme -xtoverid- (type -search xtoverid- from within Stata and follow the instruction to install it) comes in handy if you invoke cluster/robust standard errors.
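
A minimal sketch of the fixed-versus-random-effects comparison described in this reply, with patients as the panel unit. The data are simulated and every variable name (patient_id, hospital_id, severity, qol6) is a hypothetical stand-in, not taken from the original dataset. The first -xtreg, fe- call illustrates why the hospital indicators get no coefficient when each patient stays in one hospital.

```stata
* Simulated data: 200 hypothetical patients in 10 hospitals, 3 episodes each
clear
set seed 2018
set obs 200
gen patient_id  = _n
gen hospital_id = ceil(_n/20)        // time-invariant within patient
gen a_i         = rnormal(0, 0.1)    // unobserved patient-level effect
expand 3                             // three rows (episodes) per patient
gen severity    = rnormal()          // hypothetical episode-level control
gen qol6        = 0.5 + 0.1*severity + a_i + rnormal(0, 0.1)

* Panel identifier only; no time variable is required for -xtreg-
xtset patient_id

* Hospital dummies are constant within patient, so the FE (within)
* estimator omits them because of collinearity with the patient effects
xtreg qol6 severity i.hospital_id, fe

* FE versus RE with default standard errors, then the Hausman test
xtreg qol6 severity, fe
estimates store fe
xtreg qol6 severity, re
estimates store re
hausman fe re
```

With cluster-robust standard errors -hausman- is not appropriate, which is where the user-written -xtoverid- mentioned above would come in; it is not called here because it has to be installed first.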

Kind regards,
Carlo
(Stata 19.0)
Sofia Oliveira

Join Date: Jun 2018

Posts: 8
#7

22 Jun 2018, 05:04

Hi

Thanks Marcos, Phil and Carlo for your replies, and sorry for the delay.

Marcos I have 3 quick questions if you don't mind.

1- You mentioned "I do not see drugs administration as a further nesting level", but how am I going to account for that then? Because each patient with multiple drugs will have other quality_of_life variables associated to it. For example, quality_of_life_0months_drug1, quality_of_life_6months_drug1, quality_of_life_0months_drug2, quality_of_life_6months_drug2, etc.

2- You are right: I get coefficients = 0 if I run an FE model with hospital dummies, and that makes sense as you explained. However, if I run an OLS regression like this:
reg quality_of_life_6months quality_of_life_0months i.Hospitals $controls, cluster(NumCentro) base
I get a coefficient for each hospital. Can you help me interpret this regression? What am I actually doing here? Could this be a kind of "FE" model?

3- I am following a paper where they have a similar model as mine but a 2 level model i individual and j hospital (not 3-level with the drugs), and then run FE, RE (Maximum likelihood) and RE (Generalised least squares). So, I would like to run these 3 models and then compare and discuss pros and cons.
How can I run a FE for a 2-level model, is it possible?
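
One way to read questions 2 and 3 together, offered as a hedged sketch rather than a recommendation: with hospitals as the grouping unit, OLS with a full set of hospital indicators (least-squares dummy variables) and -xtreg, fe- after -xtset hospital_id- give the same slope estimates. The data below are simulated and the names (hospital_id, qol0, qol6) are hypothetical.

```stata
* Simulated two-level data: 15 hypothetical hospitals, 40 patients each
clear
set seed 42
set obs 15
gen hospital_id = _n
gen h_effect    = rnormal(0, 0.05)   // true hospital effect
expand 40
gen qol0 = runiform()                // baseline quality of life
gen qol6 = 0.1 + 0.7*qol0 + h_effect + rnormal(0, 0.1)

* OLS with hospital dummies: one coefficient per hospital,
* each measured relative to the omitted base hospital
regress qol6 qol0 i.hospital_id
scalar b_lsdv = _b[qol0]

* Fixed effects with the hospital as the panel unit
xtset hospital_id
xtreg qol6 qol0, fe
scalar b_fe = _b[qol0]

* The two slope estimates coincide up to numerical precision
display "LSDV slope = " b_lsdv "   FE slope = " b_fe

* Estimated hospital effects from the FE model
predict h_hat, u
```

The hospital coefficients from -regress- are contrasts with the base hospital, whereas the values from -predict, u- are deviations around the overall constant, so the two sets differ by a shift even though they rank hospitals identically.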

As you all suggested before, I will use "mixed" for RE (Maximum likelihood).

Could you please clarify which command I should use for RE (Generalised least squares) in Stata?

Thank you very much for all the help, I really appreciate it!
Oliveira
Sofia Oliveira

Join Date: Jun 2018

Posts: 8
#8

22 Jun 2018, 05:05

Apologies, I meant Carlo not Marcos, because I am replying directly to your comment. But of course any help is welcome!

Thank you!
Oliveira
Hassen Ali

Join Date: May 2018

Posts: 39
#9

22 Jun 2018, 05:34

Thank you very much all, I have learned a lot from your posts!!
Cheers,Hassen
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#10

22 Jun 2018, 07:38

Sofia:
1) you have repeated waves of health-related quality of life data for each patient: this is, in brief, the meaning of a panel dataset. However, a warning should be kept in mind: in panel data repeated measures are planned at scheduled points in time. If your dataset does not follow this approach, things might be trickier than they appear.
2) -regress- is more similar to -xtreg, re- than to -xtreg, fe- as it focuses on the differences between patients. That said, if your panel units are patients, standard errors should be clustered at the patient level; however, you can include a predictor -i.NumCentro- on the right-hand side of your regression equation;
3a) my gut feeling is to stick with a two-level -mixed- model, which, as I wrote in my previous reply, is similar to -xtreg, re mle-;
3b) I'm not sure I got what you mean by two-level FE model, but, as Phil recommended, take a look at the user-written programme -reghdfe- (type -search reghdfe- from within Stata to spot and install it);
3c) GLS RE panel data regression translated in Stata is -xtreg, re-.
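
The random-effects estimators named in points 3a) and 3c) can be sketched as follows for a two-level setup with hospitals as the grouping unit. This is only an illustration on simulated data; hospital_id, qol0 and qol6 are hypothetical names.

```stata
* Simulated two-level data: 25 hypothetical hospitals, 40 patients each
clear
set seed 7
set obs 25
gen hospital_id = _n
gen h_effect    = rnormal(0, 0.05)
expand 40
gen qol0 = runiform()
gen qol6 = 0.1 + 0.7*qol0 + h_effect + rnormal(0, 0.1)

xtset hospital_id

* RE estimated by generalised least squares
xtreg qol6 qol0, re
predict h_gls, u                 // predicted hospital effects

* RE estimated by maximum likelihood
xtreg qol6 qol0, mle
scalar b_mle = _b[qol0]

* The same random-intercept model fitted with -mixed- (ML by default)
mixed qol6 qol0 || hospital_id:
scalar b_mixed = _b[qol0]
predict h_mixed, reffects        // predicted random intercepts

display "xtreg, mle slope = " b_mle "   mixed slope = " b_mixed
```

A three-level version with patients nested in hospitals would add a second random-effects equation to -mixed-, along the lines of || hospital_id: || patient_id:, which -xtreg- cannot express.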

Kind regards,
Carlo
(Stata 19.0)
Sofia Oliveira

Join Date: Jun 2018

Posts: 8
#11

24 Jun 2018, 11:24

Hi Carlo

When you say "1) you have repeated waves of health-related quality of life data for each patient: this is, in brief, the meaning of a panel dataset. However, a warning should be kept in mind: in panel data repeated measures are planned at scheduled points in time. If your dataset does not follow this approach, things might be trickier than they appear."

- Yes, my dataset follows this approach. For each patient I have quality of life at baseline and at 6 months. If patients had more than 1 drug in their treatment, for the second drug I also have quality of life at baseline and at 6 months, etc. I am struggling to put my data in Stata in a format I can actually work with. For example I have:

observation | IDpatient | drug | wave | quality_life
1 | 12345 | 1 | 0 | 0.3
2 | 12345 | 1 | 6 | 0.4
3 | 12345 | 2 | 0 | 0.5
4 | 12345 | 2 | 6 | 0.7

If I want to start with a simple regression, If I do:
gen quality_life_baseline_drug1=quality_life if wave==0 & drug==1
gen quality_life_6months_drug1=quality_life if wave==6 & drug==1

I then run a regression reg quality_life_baseline_drug1 quality_life_6months_drug1 i.Hospitals $controls, cluster(NumCentro) base, I get error about "no observations" and I understand why (different observations, in this case 1 and 2 for my example will be used).
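
A possible way around the "no observations" problem, sketched on the four example rows above: -reshape wide- puts the baseline and 6-month values of each patient-drug episode on the same row, so both can enter one regression. The new variable names are illustrative only.

```stata
* The four example rows from the table above
clear
input long IDpatient byte drug byte wave double quality_life
12345 1 0 0.3
12345 1 6 0.4
12345 2 0 0.5
12345 2 6 0.7
end

* One row per episode (patient x drug); wave becomes a suffix,
* giving quality_life0 and quality_life6
reshape wide quality_life, i(IDpatient drug) j(wave)

rename quality_life0 quality_life_baseline
rename quality_life6 quality_life_6months
list
```

After this step each episode is a single observation, so a regression of quality_life_6months on quality_life_baseline no longer mixes values stored in different rows.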

Moreover, if I do xtset IDpatient wave I get error saying:
repeated time values within panel r(451); After some research I understand it is because wave is repeated but I need wave to be repeated. I don't really know how to do this, because I need the data to be panel in order to run FE and RE. Could you please help? My main objective is to compare the variation in quality of life among different Hospitals taking into account patient characteristics.
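
For the r(451) error specifically, one hedged option is to treat each patient-drug episode, rather than each patient, as the panel unit, so that wave is unique within panel. The sketch below uses the example rows above; episode is a hypothetical new identifier.

```stata
* The four example rows from the table above
clear
input long IDpatient byte drug byte wave double quality_life
12345 1 0 0.3
12345 1 6 0.4
12345 2 0 0.5
12345 2 6 0.7
end

* One identifier per patient-drug combination
egen episode = group(IDpatient drug)

* wave is now unique within each panel, so -xtset- accepts it
xtset episode wave
```

Whether the episode is the right panel unit depends on the research question; it makes -xtset- work but shifts the within-panel comparison from patients to episodes.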

Thank you very much for all your help, I really appreciate it.
Oliveira
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#12

24 Jun 2018, 11:38

Sofia:
the main issue with your data is that the number of quality of life assessments depends on the number of therapies administered to each patient, instead of being scheduled consistently with the number of waves of data. This data layout differs from an ideal panel.
You may want to consider collapsing the mean value of quality of life per patient at each assessment, no matter the number of administered drugs (see -help collapse-).
This approach will allow you to have a unique measure of quality of life per patient per each wave of data.
In order not to lose all the information related to drug administration, you can consider creating a predictor reporting the number of therapies administered to each patient per each wave of data.
You can skip the reported -xtset- nuisance by -xtset-ting your data with -panelid- only; this will work as long as you do not plan to use time-series operators, such as lags and leads, in your regression.
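
The -collapse- and panel-only -xtset- steps suggested here might look like the sketch below, applied to the four example rows from the earlier post. The variable n_drugs is a hypothetical name for the count of therapies per patient per wave.

```stata
* The four example rows from the earlier post
clear
input long IDpatient byte drug byte wave double quality_life
12345 1 0 0.3
12345 1 6 0.4
12345 2 0 0.5
12345 2 6 0.7
end

* Mean quality of life per patient per wave, plus the number of
* drug records behind each mean
collapse (mean) quality_life (count) n_drugs = drug, by(IDpatient wave)

* Declare the panel identifier only; lags and leads are then unavailable
xtset IDpatient
list
```

For this patient the result is two rows: wave 0 with mean 0.4 and wave 6 with mean 0.55, each based on 2 drug records.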

Kind regards,
Carlo
(Stata 19.0)
Sofia Oliveira

Join Date: Jun 2018

Posts: 8
#13

24 Jun 2018, 12:00

Thank you for the quick reply!

Ok, the approach you described will allow me to have a unique quality of life measure per wave but since these will be 2 different observations (one for baseline quality_life, another for 6months) I will not be able to run:
reg quality_life_baseline_drug1 quality_life_6months_drug1 i.Hospitals $controls, cluster(NumCentro) base, which is the model I want to run.

Based on your comment "You can skip the reported -xtset- nuisance by -xtset-ting your data with -panelid- only; this will work as long as you do not plan to use time-series operators, such as lags and leads, in your regression." I am thinking I could reconstruct my dataset to be like this:
observation | IDpatient | drug1_quality_life_baseline | drug1_quality_life_6months | drug2_quality_life_baseline | drug2_quality_life_6months
and this way be able to have an observation per patient. Do you foresee any troubles with this? I will still need to use mixed and xtreg, re.
A question I have here is if I put controls in my model with this dataset construction, I will have to put patient controls = age_baseline, age_6months, other_diseases_baseline, other_diseases_6months... What do you think?

Thank you very much again!
Oliveira
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#14

24 Jun 2018, 15:38

Sofia:
as per your last post, I fail to get whether your data are in -wide- or -long- format (please note that the latter is highly advisable for running most Stata commands, including -xtreg- and -mixed-).

Kind regards,
Carlo
(Stata 19.0)
Sofia Oliveira

Join Date: Jun 2018

Posts: 8
#15

24 Jun 2018, 16:06

Hi Carlo

Thanks for all your time helping me on this. My data are in long format at the moment, sorry for failing to explain that.

Thank you
Oliveira