Hi all,
I'm working with a parametric hazard model (using streg) to model the acquisition of a driver's license based on 8 years of data. The data are a random sample of all individuals, and everyone is assumed to become at risk at age 18. The sample covers the years between 2003 and 2011, which gives a maximum of 8 rows of data per individual, and there are some 27,000 individuals in the data set.
I'm stset-ing the data using:
stset year, failure(drv_lic==1) origin(year18) enter(entry_year) id(id)
drv_lic = 0 if the individual doesn't have a driver's license and 1 otherwise.
year18 is the year the individual turned 18 and became at risk.
entry_year is the first year the individual is observed in the data.
I've been reviewing the literature on left-truncated data and on left- and right-censored data. In my case I don't have left-censored data, since I know when every individual became at risk (the year they turned 18). Individuals who turned 18 before 2003 (the year my data set starts) are left truncated but are included in my data. I don't have an interval-censoring problem either. However, some individuals leave the data set before they acquire a driver's license (they pass away, migrate to another country, or simply have not acquired the license by the end of the period, 2011), so I do have right-censored data.
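To make the terms concrete, here is the likelihood contribution I understand a model with delayed entry and right censoring should use. The notation is my own assumption, not taken from the Stata documentation: t_i is the analysis time at exit (years since turning 18), t_{0i} is the analysis time at entry (zero for anyone observed from age 18), d_i is 1 if the license was acquired and 0 if the observation is right censored, and f, S and h are the density, survivor and hazard functions of the chosen parametric distribution.

```latex
L_i(\theta)
  = \frac{f(t_i \mid \theta)^{d_i}\, S(t_i \mid \theta)^{1-d_i}}{S(t_{0i} \mid \theta)}
  = h(t_i \mid \theta)^{d_i}\, \frac{S(t_i \mid \theta)}{S(t_{0i} \mid \theta)}
```

Dividing by S(t_{0i}) conditions on the individual still being without a license at entry, which is the left-truncation adjustment I am asking about. For individuals with t_{0i} = 0 the denominator equals 1 and the usual right-censored likelihood remains.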
My questions are:
1- Does using streg (after my stset) adjust the log likelihood for the left-truncation bias?
2- If not, how should I proceed to do so?
3- How is the right-censored data treated by streg? Will I have bias from right censoring when using streg?
Should I use a model other than streg? (I need a parametric model.)
Regards
Sia