  • estimate discrete-time event history models

    Hi. I want to estimate a discrete-time event history model to assess how personal characteristics influence refugees' entry into a language course, following them from arrival for up to 5 years. When a refugee attends a language course, event is coded 1 and time is set to the period in which they attended. The data are in long format, with education as a time-varying variable. After declaring the survival data with "stset time, failure(event) id(id)", I tried to estimate the discrete-time event history model with "logit event c.time i.edu", but I got the error "outcome does not vary. 0 = negative outcome, all other nonmissing values = positive outcome". I have attached a very small sample of my data. I am not allowed to attach the full data.

    id observation period event time edu
    a 1 0 . low
    a 2 0 . low
    a 3 0 . low
    a 4 1 4 high
    a 5 0 . high
    b 1 0 . high
    b 2 0 . high
    b 3 0 . high
    b 4 0 . high
    b 5 0 . high
    c 1 0 . low
    c 2 1 2 low
    c 3 0 . high
    c 4 0 . high
    c 5 0 . high

  • #2
    If you set the variable time to missing wherever failure has not occurred, then logit will drop every observation that has a missing value on time. As a result, your dependent variable event (failure) only takes the value 1 in the estimation sample, which is why Stata reports that the outcome does not vary.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str2 id float(obsperiod event time) str4 edu
    "a" 1 0 . "low"
    "a" 2 0 . "low"
    "a" 3 0 . "low"
    "a" 4 1 4 "high"
    "a" 5 0 . "high"
    "b" 1 0 . "high"
    "b" 2 0 . "high"
    "b" 3 0 . "high"
    "b" 4 0 . "high"
    "b" 5 0 . "high"
    "c" 1 0 . "low"
    "c" 2 1 2 "low"
    "c" 3 0 . "high"
    "c" 4 0 . "high"
    "c" 5 0 . "high"
    end
    
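    * This is the command from the question. Note that edu is a string here,
    * so it would need -encode- before it can be used as i.edu.
    logit event c.time i.edu
    // Stata drops observations that have a missing value for one or more of the variables in the model.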
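
    A minimal sketch of one way around this, assuming the example data above are in memory and that obsperiod counts periods since arrival, so that it can serve as the duration variable. The name period is hypothetical. The idea is only to use a duration variable that is never missing, so that rows with event == 0 stay in the estimation sample:

    ```stata
    * period is a hypothetical duration variable that is defined on every row
    generate period = obsperiod

    * rows that -logit- would drop when time is used as a covariate
    count if missing(time)
    * rows that would be dropped when period is used instead
    count if missing(period)

    * with time as a covariate, only event == 1 rows remain
    tabulate event if !missing(time)
    * with period as a covariate, both outcomes are present
    tabulate event if !missing(period)
    ```

    Rows after the event would still need to be dropped before fitting the model, because a person is no longer at risk once the event has occurred.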



    • #3
      -stset- is irrelevant for fitting discrete time models (the -st- suite is for continuous time modelling). You might like to look at http://www.iser.essex.ac.uk/survival-analysis for a gentle introduction to discrete time modelling (also continuous time modelling). (Materials are free to download.)
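
      As a rough illustration of fitting a discrete-time model without -stset-, the sketch below works directly on person-period data. It assumes the variable names from the example in this thread (id, obsperiod, event, edu); after and edu_n are hypothetical names introduced here.

      ```stata
      * number of events on earlier rows of the same id: positive only after the event
      bysort id (obsperiod): generate byte after = sum(event) - event

      * keep each person only while still at risk
      drop if after > 0

      * edu is a string in the example, so make a numeric version for factor notation
      encode edu, generate(edu_n)

      * discrete-time hazard model: period dummies plus covariates, no -stset- needed
      logit event i.obsperiod i.edu_n
      ```

      On a sample as small as the one posted, -logit- will report perfectly predicted outcomes and drop some period dummies; the sketch is meant for the full data.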



      • #4
        Chen Samulsion Thanks for your answer. Yes, but what is the solution to that?



        • #5
          Stephen Jenkins Thanks for your suggestion.



          • #6
            Your example data set is too small to give an illustrative logit result, especially because several of the time dummies predict the outcome perfectly and are therefore dropped by logit. Note that I used the -fillmissing- command to handle one step quickly below; it needs to be installed from SSC. Although sociologists like to use the logit model for discrete-time survival analysis, scholars from other disciplines may prefer discrete-time proportional hazards models (cloglog, pgmhaz). By the way, Stephen Jenkins is a real expert on survival analysis; he will give you professional advice.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str2 id float(obsperiod event time) str4 edu
            "a" 1 0 . "low" 
            "a" 2 0 . "low" 
            "a" 3 0 . "low" 
            "a" 4 1 4 "high"
            "a" 5 0 . "high"
            "b" 1 0 . "high"
            "b" 2 0 . "high"
            "b" 3 0 . "high"
            "b" 4 0 . "high"
            "b" 5 0 . "high"
            "c" 1 0 . "low" 
            "c" 2 1 2 "low" 
            "c" 3 0 . "high"
            "c" 4 0 . "high"
            "c" 5 0 . "high"
            end
            
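            ssc install fillmissing
            * fill in the period number wherever time is missing
            bysort id (obsperiod): replace time=_n if time==.
            * N = period in which the event occurred, copied to all rows of that id
            bysort id (obsperiod): gen N=_n if event==1
            bysort id: fillmissing N, with(max)
            * drop person-periods after the event (ids with no event keep all rows)
            bysort id: drop if time>N
            * edu is a string, so encode it before using it as a factor variable
            encode edu, gen(edu2)
            logit event i.time i.edu2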
            
            . logit event i.time i.edu2
            
            note: 1.time != 0 predicts failure perfectly
                  1.time dropped and 3 obs not used
            
            note: 3.time != 0 predicts failure perfectly
                  3.time dropped and 2 obs not used
            
            note: 5.time != 0 predicts failure perfectly
                  5.time dropped and 1 obs not used
            
            note: 4.time omitted because of collinearity
            Iteration 0:   log likelihood = -3.3650583  
            Iteration 1:   log likelihood = -2.8518168  
            Iteration 2:   log likelihood = -2.7900415  
            Iteration 3:   log likelihood = -2.7764658  
            Iteration 4:   log likelihood = -2.7733623  
            Iteration 5:   log likelihood = -2.7727177  
            Iteration 6:   log likelihood = -2.7726038  
            Iteration 7:   log likelihood = -2.7725919  
            Iteration 8:   log likelihood = -2.7725894  
            Iteration 9:   log likelihood = -2.7725889  
            Iteration 10:  log likelihood = -2.7725888  
            
            Logistic regression                             Number of obs     =          5
                                                            LR chi2(2)        =       1.18
                                                            Prob > chi2       =     0.5530
            Log likelihood = -2.7725888                     Pseudo R2         =     0.1761
            
            ------------------------------------------------------------------------------
                   event |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                    time |
                      1  |          0  (empty)
                      2  |  -17.69858   6970.075    -0.00   0.998    -13678.79     13643.4
                      3  |          0  (empty)
                      4  |          0  (omitted)
                      5  |          0  (empty)
                         |
                    edu2 |
                    low  |   17.69858   6970.075     0.00   0.998     -13643.4    13678.79
                   _cons |  -.0001774   1.414214    -0.00   1.000    -2.771985     2.77163
            ------------------------------------------------------------------------------



            • #7
              Chen Samulsion Should I run stset again with the new time variable you modified? And shouldn't time be a continuous variable in the logit model, i.e. c.time?
              Last edited by samaneh khaef; 13 Feb 2025, 01:21.



              • #8
                Stephen Jenkins Thanks. I am following individuals from their arrival up until 10 years after. Isn't that continuous time modelling?



                • #9
                  If I understand you correctly, your survival time variable is measured in years, a discrete integer. So, your survival time variable is an 'interval censored' realisation of what might, in terms of the underlying process, occur in continuous time. (Events may actually occur within years, but you can't observe the exact dates of events.) In many circumstances, modelling continuous time survival processes while taking account of the interval censoring of the observed survival times leads to models that are the same as those that treat survival times as intrinsically discrete (not simply observed thus). It's up to you as researcher to decide which approach to take: you have knowledge about the underlying process (when events may occur along the time line) and also about the measurement of survival times (exact dates, interval censoring, etc.). The literature, and the Stata manuals, can tell you more about modelling continuous time processes with interval censored data. (My website is, as I said, only a 'gentle introduction'.)
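
                  To make the interval-censoring point concrete, here is a hedged sketch rather than a recommendation. The complementary log-log model is the discrete-time counterpart of a continuous-time proportional hazards model observed with interval censoring, so it can be fitted alongside the logit model for comparison. The sketch assumes person-period data already trimmed at the event; period and edu_n are hypothetical variable names for the integer duration and a numeric education variable.

                  ```stata
                  * interval-censored proportional hazards: exponentiated coefficients
                  * can be read as hazard ratios for the underlying continuous-time process
                  cloglog event i.period i.edu_n, eform

                  * discrete-time logit: odds ratios for the per-period event probability
                  logit event i.period i.edu_n, or
                  ```

                  When the per-period hazard is small, the two sets of estimates tend to be close, so the choice often rests on which interpretation suits the underlying process.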

