Tobit model

Maye Ehab

Join Date: Mar 2017

Posts: 58
#1

08 Jun 2017, 02:59

Dear Statalisters,

I am analysing commuting time and wanted to check whether the data are censored.
So I tabulated the missing commuting times for wage workers:

Code:

tabulate crempstp crtrvtmp if missing(crtrvtmp), missing

I found that missing values make up 1.6 percent of the data and that no observation reports zero commuting time. Those with one minute of commuting time represent 0.47 percent.

Is this enough justification for using a tobit model?

Thank you.
Maye

Last edited by Maye Ehab; 08 Jun 2017, 03:10.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#2

08 Jun 2017, 03:30

Maye:
do you have any sound reason to assume that the missing values are, in fact, zero commuting time?

Kind regards,
Carlo
(Stata 19.0)
Maye Ehab

Join Date: Mar 2017

Posts: 58
#3

08 Jun 2017, 05:09

For most of the missing values, the working district is the same as the residence district.
Would this be enough justification?

On another note, how can I count how often the working and residence districts are the same when commuting time is missing?
I tried the command below, but it does not give me what I need:

Code:

bysort year: tabulate workdistrict residencedistrict if missing(commuting)

Many thanks,
Maye
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#4

08 Jun 2017, 05:15

Maye:
Then you can probably replace the missing values with zero and use -regress-.
One remark about -year-: are you dealing with a cross-sectional or a panel dataset?
You could add a condition to your code:

Code:

bysort year: tabulate workdistrict residencedistrict if missing(commuting) & workdistrict==residencedistrict

If the above does not help, please post an example/excerpt of your data via -dataex- (type -search dataex- from within Stata to install it). Thanks.
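
As a possible alternative to the two-way table, here is a minimal sketch that flags the cases of interest and tabulates the flag by year. It assumes the variable names used in the command above (commuting, workdistrict, residencedistrict, year); same_district is a hypothetical new variable introduced only for illustration.

```stata
* Sketch only: same_district is a hypothetical helper variable.
* It is 1/0 where commuting time is missing and left missing elsewhere.
generate byte same_district = (workdistrict == residencedistrict) if missing(commuting)

* Two missing district codes would compare as equal, so exclude those cases
replace same_district = . if missing(workdistrict, residencedistrict)

* Per year: counts and row percentages of matching vs. non-matching districts
tabulate year same_district, row
```

The row percentages give, for each year, the share of missing commuting times that fall in the same-district group.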

Kind regards,
Carlo
(Stata 19.0)
Maye Ehab

Join Date: Mar 2017

Posts: 58
#5

08 Jun 2017, 05:17

Carlo,
Another question: if we agree to assume that the missing values are actually zeros, I would recode them as zeros, and zero would then be my lower limit for the tobit model. Am I right?
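
In Stata terms, the recoding and lower limit described here could look roughly like the sketch below. It assumes the variable names used earlier in the thread (commuting, workdistrict, residencedistrict); commuting0 and the covariates wage, age, and female are hypothetical placeholders, not variables known to be in the dataset.

```stata
* Sketch only: commuting0, wage, age, and female are hypothetical names.
* Recode into a copy so the original variable is preserved.
generate commuting0 = commuting

* Treat a missing commuting time as zero only when both districts are
* observed and identical
replace commuting0 = 0 if missing(commuting) & workdistrict == residencedistrict & !missing(workdistrict, residencedistrict)

* Tobit with a lower (left-censoring) limit of zero
tobit commuting0 wage age female, ll(0)
```

Observations with commuting0 equal to zero are treated as left-censored by -tobit- when ll(0) is specified.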

Many thanks,
Maye
Maye Ehab

Join Date: Mar 2017

Posts: 58
#6

08 Jun 2017, 05:22

Thank you for your reply.
It is a panel dataset covering two years.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#7

08 Jun 2017, 05:24

Maye:
I would prefer -xtreg- over -xttobit-.
You do not seem to have censored/undetectable data: your sample simply includes a small share of people who do not commute (and so their commuting time is zero).
I'm one of them: my home and my office are located in the same building, and whenever the Italian Institute for Statistics surveys me about commuting time I usually answer 0.5 minutes (that is, the time to walk one floor downstairs from my home to my office).
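
For comparison, the two estimators could be set up along the lines of the sketch below. The panel identifier id and the covariates wage and age are hypothetical names; only commuting and year appear earlier in the thread.

```stata
* Sketch only: id, wage, and age are hypothetical names.
* Declare the panel structure (person identifier and wave)
xtset id year

* Linear panel model; random effects shown, with fe as the alternative
xtreg commuting wage age, re

* Random-effects tobit with left-censoring at zero, for comparison
xttobit commuting wage age, ll(0)
```

Note that -xttobit- fits a random-effects model only, whereas -xtreg- also offers a fixed-effects estimator.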

Kind regards,
Carlo
(Stata 19.0)
Maye Ehab

Join Date: Mar 2017

Posts: 58
#8

08 Jun 2017, 05:44

I counted them using your suggested code. For 58% of them, zero commuting time is plausible because the work and residence districts are the same, while the rest are unexplained missing observations.

Many thanks, Carlo.
Maye
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#9

08 Jun 2017, 06:55

Maye:
the unexplained missing observations may be an issue too, unless their missingness is uninformative.
If the latter is the case, you may want to take a look at -ipolate- or -mi-.
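
A minimal sketch of both routes follows. The identifier id and the covariates wage and age are hypothetical names, and the number of imputations and the seed are arbitrary illustrative choices. With only two waves per person, linear interpolation probably has little to work with, so the -ipolate- line is shown mainly for syntax.

```stata
* Sketch only: id, wage, and age are hypothetical names.
* Linear interpolation of commuting time over year, within person
bysort id: ipolate commuting year, generate(commuting_ip)

* Multiple imputation: declare the data style and the variable to impute
mi set mlong
mi register imputed commuting

* Impute commuting from covariates assumed to be fully observed;
* 20 imputations and the seed are arbitrary choices
mi impute regress commuting wage age, add(20) rseed(12345)

* Fit the analysis model on each imputed dataset and combine the results
mi estimate: regress commuting wage age
```

For a panel analysis, the estimation step would need to reflect the panel structure rather than the plain -regress- used here.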

Kind regards,
Carlo
(Stata 19.0)