I'm running a threshold regression to find out whether the effects of my main variables of interest differ depending on the value of a third variable.
My code and results (after a long wait) are shown below.
Region 2 is very small: only 13 of the 45,267 observations fall in it.
I'm wondering how to make sense of this. To me it looks like a fluke of the data, because it is hard to believe that the threshold variable has such a massive effect in such a small region. I may be wrong, but it would imply that a very narrow window of knowledge age (p_pria_usew) has a disproportionate impact.
If this does reflect an oddity in the data, how do I best account for it when I run my full regression (a negative binomial regression on the same dependent variable)?
Code:
threshold fwd, regionvars(log_other log_dom p_pria_timew) threshvar(p_pria_usew) optthresh(3)

Number of obs        = 45,267
Number of thresholds = 2
Max thresholds       = 3
Threshold variable: p_pria_usew
BIC                  = 2.489e+05

---------------------------------
Order     Threshold           SSR
---------------------------------
    1     .08642547     1.105e+07
    2     .08755191     1.103e+07
---------------------------------

------------------------------------------------------------------------------
         fwd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Region1      |
 log_other_t |  -.1276248   .1263034    -1.01   0.312     -.375175    .1199254
   log_dom_t |   .3582348   .0696338     5.14   0.000     .2217552    .4947145
p_pria_timew |  -1.060827   .0909772   -11.66   0.000    -1.239139   -.8825144
       _cons |   9.529535   .1384599    68.83   0.000     9.258159    9.800912
-------------+----------------------------------------------------------------
Region2      |
 log_other_t |  -5.488055   8.183094    -0.67   0.502    -21.52663    10.55052
   log_dom_t |   33.02047   5.947951     5.55   0.000      21.3627    44.67824
p_pria_timew |  -30.59203   8.530039    -3.59   0.000     -47.3106   -13.87346
       _cons |  -70.70322   15.63087    -4.52   0.000    -101.3392   -40.06729
-------------+----------------------------------------------------------------
Region3      |
 log_other_t |  -1.062457   .2102138    -5.05   0.000    -1.474468   -.6504452
   log_dom_t |   1.105572   .1141768     9.68   0.000     .8817893    1.329354
p_pria_timew |  -1.159293   .1826239    -6.35   0.000    -1.517229   -.8013564
       _cons |    10.2765   .2980277    34.48   0.000      9.69238    10.86063
------------------------------------------------------------------------------
Code:
sum fwd if p_pria_usew >= .08642547 & p_pria_usew <= .08755191

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         fwd |         13    30.61538    47.06297          1        146
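One check I could run is to flag those 13 observations, look at them directly, and see whether the negative binomial estimates move when they are left out. This is only a rough sketch: the indicator name in_region2 and the stored-estimate names are made up for illustration, and the nbreg covariate list is an assumption, since my full model is not shown here.
Code:
* Flag the observations between the two estimated thresholds (in_region2 is a hypothetical name)
gen byte in_region2 = p_pria_usew >= .08642547 & p_pria_usew <= .08755191

* Look at the flagged observations for outliers or data-entry problems
list fwd log_other log_dom p_pria_timew p_pria_usew if in_region2

* Negative binomial model on all observations (covariate list is assumed, not my actual full model)
nbreg fwd log_other log_dom p_pria_timew p_pria_usew
estimates store nb_all

* Same model without the flagged observations
nbreg fwd log_other log_dom p_pria_timew p_pria_usew if !in_region2
estimates store nb_drop

* Compare coefficients and standard errors side by side
estimates table nb_all nb_drop, b se
If the two sets of estimates are close, the 13 observations would not seem to be driving the main results, whatever the threshold model does with them.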