  • Interpreting threshold effects when the isolated region is extremely small

    I'm running a threshold regression to test whether the effects of my main variables of interest differ depending on the value of a third variable.
    My code and the results (after a long wait) are as follows:

    Code:
    threshold fwd, regionvars(log_other log_dom p_pria_timew) threshvar(p_pria_usew) optthresh(3)
    
                                                     Number of obs    =     45,267
    Number of thresholds =  2                        Max thresholds   =          3
    Threshold variable: p_pria_usew                  BIC              =  2.489e+05
    
    ---------------------------------
    Order     Threshold        SSR
    ---------------------------------
    1          .08642547    1.105e+07
    2          .08755191    1.103e+07
    ---------------------------------
    
    ------------------------------------------------------------------------------
             fwd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    Region1      |
     log_other_t |  -.1276248   .1263034    -1.01   0.312     -.375175    .1199254
       log_dom_t |   .3582348   .0696338     5.14   0.000     .2217552    .4947145
    p_pria_timew |  -1.060827   .0909772   -11.66   0.000    -1.239139   -.8825144
           _cons |   9.529535   .1384599    68.83   0.000     9.258159    9.800912
    -------------+----------------------------------------------------------------
    Region2      |
     log_other_t |  -5.488055   8.183094    -0.67   0.502    -21.52663    10.55052
       log_dom_t |   33.02047   5.947951     5.55   0.000      21.3627    44.67824
    p_pria_timew |  -30.59203   8.530039    -3.59   0.000     -47.3106   -13.87346
           _cons |  -70.70322   15.63087    -4.52   0.000    -101.3392   -40.06729
    -------------+----------------------------------------------------------------
    Region3      |
     log_other_t |  -1.062457   .2102138    -5.05   0.000    -1.474468   -.6504452
       log_dom_t |   1.105572   .1141768     9.68   0.000     .8817893    1.329354
    p_pria_timew |  -1.159293   .1826239    -6.35   0.000    -1.517229   -.8013564
           _cons |    10.2765   .2980277    34.48   0.000      9.69238    10.86063
    ------------------------------------------------------------------------------
    Region 2 is very small: only 13 of the 45,267 observations fall in it.

    Code:
    sum fwd if p_pria_usew >= .08642547  & p_pria_usew <= .08755191
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             fwd |         13    30.61538    47.06297          1        146
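
    As a cross-check on all three region sizes, here is a minimal sketch that tags each observation with its region and tabulates the counts. It assumes the regions are split as (-inf, t1], (t1, t2], (t2, +inf), which is my reading of how threshold defines them and may not be exact. The variable name region is made up for illustration, and the cutoffs are the rounded values displayed in the output, so observations sitting exactly on a boundary could be misclassified.

    ```stata
    * Hypothetical helper variable: region index implied by the two displayed thresholds
    generate byte region = 1 if p_pria_usew <= .08642547
    replace region = 2 if p_pria_usew > .08642547 & p_pria_usew <= .08755191
    * Missing values compare as larger than any number in Stata, so exclude them explicitly
    replace region = 3 if p_pria_usew > .08755191 & !missing(p_pria_usew)
    * Count observations per region (uses all rows in memory, not only the estimation sample)
    tabulate region
    ```

    If the counts do not add up to 45,267, some rows were probably dropped from the estimation sample, for example because of missing values in the regressors.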
    I'm wondering how to make sense of this. To me it looks like a freak accident of the data, because it is hard to believe that the threshold variable has such a massive effect in such a small range. I may be wrong, but it would suggest that a very narrow window of knowledge age (p_pria_usew) has a disproportionate impact.
    If this does reflect an oddity in the data, how do I best account for it when I run my full regression (a negative binomial regression on the same dependent variable)?