
  • Problem with validity test for survival model with Weibull distribution using generalized gamma model

    Hello,

    I am doing a survival analysis of when a vacant plot becomes developed, i.e., failure = development. I am using a parametric model instead of a Cox proportional hazards model because the proportional hazards assumption is not met. Among the parametric models, I chose the Weibull because it fits better than the exponential according to AIC and BIC (the Gompertz fits slightly better, but I don't know how to assess its validity).

    To test the validity of the Weibull model, I want to fit a generalized gamma model and test the hypothesis that k=0 (a test for the appropriateness of the lognormal) and then the hypothesis that k=1 (a test for the appropriateness of the Weibull), where k is the kappa parameter of the generalized gamma. This is what the streg (Parametric survival models) entry of the Stata manual suggests. However, when I run the generalized gamma model, Stata takes too long to process the command: I have left it running for two hours and nothing has happened. This is strange, because all the other models (other distributions) ran smoothly. The only difference from the other models is that I added the nolog option (see code below), so that may have something to do with it.
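
    For concreteness, the AIC/BIC comparison across distributions described above can be sketched as follows. This is only an illustration of the workflow: it uses Stata's example dataset (loaded with webuse cancer, which needs an internet connection) and that dataset's variables, not my parcel data, and the names m_exp, m_weib, and m_gomp are arbitrary labels chosen for the sketch.

    ```stata
    * Illustration only: Stata's example data, not the parcel data from this post
    webuse cancer, clear
    stset studytime, failure(died)

    * Fit each candidate distribution and store the results
    quietly streg age i.drug, distribution(exponential)
    estimates store m_exp
    quietly streg age i.drug, distribution(weibull)
    estimates store m_weib
    quietly streg age i.drug, distribution(gompertz)
    estimates store m_gomp

    * Side-by-side log likelihood, AIC, and BIC (lower values indicate better fit)
    estimates stats m_exp m_weib m_gomp
    ```

    The stored estimates can be compared in one table this way instead of running estat ic after each model separately.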

    Therefore my questions are:

    1. What could be the reason it is taking so long? Could having added the nolog option be the reason? In cases like this in which Stata takes forever, is there any point in waiting for it to respond?

    2. Is there an alternative (faster) way of testing the validity of the Weibull model?

    3. Is there a way to test the validity of the Gompertz model?



    I'm using Stata 17. Details about the data, code, and output follow.


    First, details about the data:

    Code:
    input float(YEAR_TAXROLL ln_JST_VAL_W_I_P FAILURE ln_HEAT_AR_W Bedrooms Restrooms Stories ln_SQFT_W HH_INCOME DIST_CBD RESDU_3_VAR interact_DIST_UNCERT3) int ZCTA
    2016 11.173612 0 7.305188 2 0 1  9.677214 47536  103287.6  .05121636  5290.015 32220
    2016  11.37649 0 7.021084 3 0 1  9.408371 47536 103184.12  .04664454  4812.976 32220
    2016  10.97931 0 6.927558 3 0 1  9.525151 47536 103415.26 .017120913 1770.5636 32220
    2016  11.49621 0 7.825245 3 0 2  9.525151 47536 103515.03 .027632317   2860.36 32220
    2016 11.246445 0 7.459915 3 0 1  9.525151 47536  103614.8  .02384571  2470.769 32220
    2016  11.26819 0 7.389564 3 0 1  9.525151 47536 103624.84 .006640271   688.097 32220
    2016  10.69538 0 6.579251 1 0 1  9.525151 47536 103525.06  .03760462  3893.021 32220
    2016  11.23799 0 7.266828 3 0 1  9.525151 47536  103425.3  .03764908  3893.867 32220
    2016  11.57281 0  7.53583 3 0 1 10.011175 47536  103841.3  .07288396  7568.366 32220
    2016 10.996836 0 7.313221 2 0 1  9.262268 47536  103538.8  .15010385 15541.572 32220
    
    . summarize ln_JST_VAL_W_I_P FAILURE ln_HEAT_AR_W Bedrooms Restrooms Stories ln_SQFT_W
    > HH_INCOME DIST_CBD RESDU_3_VAR interact_DIST_UNCERT3 ZCTA
    
        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
    ln_JST_VAL~P |  2,635,969     11.6731     .726165   8.597553   16.58054
         FAILURE |  4,639,075    .0035779    .0597082          0          1
    ln_HEAT_AR_W |  3,511,930    7.394051    .3701783    6.52503   8.383662
        Bedrooms |  4,200,561    3.034418    1.363163          0        201
       Restrooms |  4,200,561    .1051492    1.586688          0        269
    -------------+---------------------------------------------------------
         Stories |  4,200,561    1.235384     34.9962          0      41408
       ln_SQFT_W |  4,639,057    9.317953    .9546204    4.60517   13.02817
       HH_INCOME |  4,639,075     51708.1    16154.66      15279      95819
        DIST_CBD |  4,639,075    43541.89    20941.55   86.80797   133622.5
     RESDU_3_VAR |  1,756,326    .0190573    .0393086   4.14e-14   1.832363
    -------------+---------------------------------------------------------
    interact_D~3 |  1,756,326    609.4353    1231.284   1.66e-09   70178.39
            ZCTA |  4,639,075    32226.56    18.96891      32205      32277


    Weibull Model:

    Code:
    . streg ln_HEAT_AR_W Bedrooms Restrooms Stories ln_SQFT_W HH_INCOME DIST_CBD RESDU_3_VA
    > R interact_DIST_UNCERT3 i.ZCTA, dist (weibull)
    
            Failure _d: FAILURE
      Analysis time _t: YEAR_TAXROLL
    
    Fitting constant-only model:
    Iteration 0:   log likelihood = -4096.4745
    Iteration 1:   log likelihood = -3654.7215
    Iteration 2:   log likelihood = -3211.7027
    Iteration 3:   log likelihood = -2765.4254
    Iteration 4:   log likelihood = -2311.6977
    Iteration 5:   log likelihood = -1848.9143
    Iteration 6:   log likelihood =   -1438.38
    Iteration 7:   log likelihood = -1284.9652
    Iteration 8:   log likelihood = -1277.2977
    Iteration 9:   log likelihood = -1277.2794
    Iteration 10:   log likelihood = -1277.2794
    
    Fitting full model:
    Iteration 0:   log likelihood = -1277.2794  
    Iteration 1:   log likelihood =  -948.8248  
    Iteration 2:   log likelihood = -642.79445  
    Iteration 3:   log likelihood = -504.70937  
    Iteration 4:   log likelihood = -499.37099  
    Iteration 5:   log likelihood = -499.15653  
    Iteration 6:   log likelihood = -499.11293  
    Iteration 7:   log likelihood = -499.10213  
    Iteration 8:   log likelihood = -499.09989  
    Iteration 9:   log likelihood = -499.09941  
    Iteration 10:  log likelihood =  -499.0993  
    Iteration 11:  log likelihood = -499.09927  
    
    Weibull PH regression
    
    No. of subjects =  1,756,326                         Number of obs = 1,756,326
    No. of failures =        441
    Time at risk    = 3541672827
                                                         LR chi2(35)   =   1556.36
    Log likelihood = -499.09927                          Prob > chi2   =    0.0000
    
    --------------------------------------------------------------------------------------
                      _t | Haz. ratio   Std. err.      z    P>|z|     [95% conf. interval]
    ---------------------+----------------------------------------------------------------
            ln_HEAT_AR_W |   8.614132   1.565881    11.85   0.000     6.032256    12.30108
                Bedrooms |   1.323708   .0438235     8.47   0.000     1.240543    1.412449
               Restrooms |   .0010985   1.637346    -0.00   0.996            0           .
                 Stories |   1.024752   .1179899     0.21   0.832     .8177327     1.28418
               ln_SQFT_W |   .4915562   .0500188    -6.98   0.000     .4026784    .6000507
               HH_INCOME |   .9998837   8.58e-06   -13.55   0.000     .9998669    .9999005
                DIST_CBD |   .9999556   8.21e-06    -5.41   0.000     .9999395    .9999717
             RESDU_3_VAR |   20.80433   8.271109     7.63   0.000     9.544302    45.34852
    interact_DIST_UNCE~3 |   1.000138   .0000112    12.24   0.000     1.000116     1.00016
                         |
                    ZCTA |
                  32206  |   .0359272   .0187434    -6.38   0.000     .0129225    .0998849
                  32207  |   1.109801   .3035395     0.38   0.703     .6492838    1.896948
                  32208  |    .455503   .1390679    -2.58   0.010     .2503883    .8286449
                  32209  |   .1186568   .0391461    -6.46   0.000     .0621545    .2265233
                  32210  |    1.75124   .4617911     2.12   0.034     1.044454    2.936312
                  32211  |   .5077825   .1761033    -1.95   0.051     .2573201    1.002032
                  32216  |   .8141964   .4028421    -0.42   0.678     .3087293    2.147239
                  32217  |   .5059987   .2037725    -1.69   0.091     .2298048     1.11414
                  32218  |   7.461173   2.960943     5.06   0.000      3.42776    16.24066
                  32219  |   6.19e-06   .0030522    -0.02   0.981            0           .
                  32220  |    33.8469   26.96108     4.42   0.000     7.103724    161.2693
                  32221  |   10.33397   6.930979     3.48   0.000     2.775669    38.47396
                  32222  |   .0000875   .0649352    -0.01   0.990            0           .
                  32223  |   88.55096   53.39508     7.44   0.000        27.16    288.7066
                  32224  |   48.09658   28.90272     6.45   0.000     14.81156    156.1807
                  32225  |   26.39208   13.05314     6.62   0.000      10.0111    69.57696
                  32226  |   535.6981   326.4417    10.31   0.000     162.2624     1768.57
                  32233  |   270.8516   142.8178    10.62   0.000     96.36071    761.3123
                  32244  |    1.76229   1.045261     0.96   0.339     .5510707    5.635695
                  32246  |   8.378658   3.517927     5.06   0.000     3.679447    19.07947
                  32250  |   2057.785   1188.296    13.21   0.000     663.5322    6381.723
                  32254  |   .2503607   .1082887    -3.20   0.001     .1072495    .5844361
                  32256  |   8.877869   5.157508     3.76   0.000     2.843229    27.72079
                  32257  |   9.945448   4.675599     4.89   0.000     3.957798    24.99166
                  32258  |   452.1762   294.3879     9.39   0.000     126.2221    1619.869
                  32277  |   .3610453   .1779629    -2.07   0.039     .1374029    .9486971
                         |
                   _cons |          0          0   -29.08   0.000            0           0
    ---------------------+----------------------------------------------------------------
                   /ln_p |    7.03125   .0344609   204.04   0.000     6.963708    7.098792
    ---------------------+----------------------------------------------------------------
                       p |   1131.444   38.99061                      1057.547    1210.504
                     1/p |   .0008838   .0000305                      .0008261    .0009456
    --------------------------------------------------------------------------------------
    Note: _cons estimates baseline hazard.
    
    . //assess the fit of model with AIC
    . estat ic
    
    Akaike's information criterion and Bayesian information criterion
    
    -----------------------------------------------------------------------------
           Model |          N   ll(null)  ll(model)      df        AIC        BIC
    -------------+---------------------------------------------------------------
               . |  1,756,326  -1277.279  -499.0993      37   1072.199   1530.212
    -----------------------------------------------------------------------------
    Note: BIC uses N = number of observations. See [R] BIC note.
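
    As a sanity check on the table above, the AIC and BIC can be recomputed by hand from the reported log likelihood, degrees of freedom, and number of observations, using AIC = -2*ll + 2*k and BIC = -2*ll + k*ln(N). A minimal sketch, with the three values copied from the estat ic output (the scalar names ll, k, and N are just labels for this example):

    ```stata
    * Values copied from the estat ic table above
    scalar ll = -499.0993    // ll(model)
    scalar k  = 37           // df
    scalar N  = 1756326      // number of observations

    * AIC = -2*ll + 2*k ; BIC = -2*ll + k*ln(N)
    display "AIC = " %9.3f (-2*ll + 2*k)
    display "BIC = " %9.3f (-2*ll + k*ln(N))
    ```

    Up to rounding of the displayed log likelihood, these should agree with the AIC and BIC columns that estat ic reports.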


    And lastly, the generalized gamma model (this is the point where Stata stops working or at least takes too long processing) and Wald test:

    Code:
    streg ln_HEAT_AR_W Bedrooms Restrooms Stories ln_SQFT_W HH_INCOME DIST_CBD RESDU_3_VAR interact_DIST_UNCERT3 i.ZCTA, dist(ggamma) nolog
    test [kappa]_cons = 1

    Again, my questions are:

    1. What could be the reason it is taking so long? Could having added the nolog option be the reason? In cases like this in which Stata takes forever, is there any point in waiting for it to respond?

    2. Is there an alternative (faster) way of testing the validity of the Weibull model?

    3. Is there a way to test the validity of the Gompertz model?



    Thank you in advance!
    Last edited by Pedro Castro; 30 Jul 2023, 04:11.