Threshold Regression Model (Panel with Endogenous Threshold)

Pranav Garg

Join Date: Jul 2017

Posts: 41
#1


08 Aug 2017, 09:22

Dear Community,

I am a user of Stata 13.
I am currently trying to estimate the impact of unemployment on the probability of re-election for an unbalanced panel of city-level elections held in 2002, 2008 and 2014. I am interested in conducting both non-parametric graphical analysis and threshold regression. I have consulted the following papers to get an understanding of threshold regression modeling:
1. Structural Threshold Regression (Andros Kourtellos, Thanasis Stengos, Chih Ming Tan, 2011)
2. Sample Splitting and Threshold Estimation (Hansen, 2000)

As I understand, Stata 13 provides drop-down menus, which I have been using for the non-parametric analysis.
However, I am trying to read and learn as much as I can to improve my estimations.
I would greatly appreciate inputs from any academic or researcher who has prior experience with this kind of task, specifically on how best to estimate the threshold value.

My econometric model is as follows:

Y_{city, year(t)} = B*unemployment_{city, year(t)} + A*(unemployment_{city, year(t)} - u_threshold)*1(unemployment_{city, year(t)} > u_threshold)

where 1(.) is the indicator function and u_threshold is the unknown threshold level of unemployment that I want to estimate.
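
To make concrete what estimating u_threshold in the model above involves, here is a minimal sketch of a least-squares grid search over candidate kink points. It is an illustration under stated assumptions, not a recommended estimator: it pools all observations and ignores city fixed effects, it treats the binary outcome as linear, and the grid (5% to 15% in steps of 0.5 percentage points, with unemployment stored as a fraction) is chosen arbitrarily. The variable names reelect_nom and unemp_current are taken from later posts in this thread; kink, best_rss and best_u are hypothetical names.

```stata
* Sketch only: profile the residual sum of squares over candidate kink points.
* Grid limits and step are assumptions; adjust them to the range of the data.
scalar best_rss = .
scalar best_u   = .
forvalues i = 0/20 {
    local u = 0.05 + `i' * 0.005
    capture drop kink
    * Kink term (unemployment - u)*1(unemployment > u) from the model above
    generate double kink = max(unemp_current - `u', 0)
    quietly regress reelect_nom unemp_current kink
    * Numeric missing sorts above every number, so the first pass always updates
    if e(rss) < scalar(best_rss) {
        scalar best_rss = e(rss)
        scalar best_u   = `u'
    }
}
capture drop kink
display "RSS-minimising kink point: " scalar(best_u)
```

The grid point with the smallest residual sum of squares is the point estimate. Deciding whether a kink exists at all needs a separate test, because the threshold is not identified under the null of linearity.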

I shall update the post myself if I find a suitable solution.

Thank you,
Pranav
Nick Cox

Join Date: Mar 2014

Posts: 35711
#2

08 Aug 2017, 10:39

Although the software evidently lets you add a flag to your title, you're thereby saying "this post is especially important", and I have to advise against that unless it's obviously the case.
Pranav Garg

Join Date: Jul 2017

Posts: 41
#3

08 Aug 2017, 10:48

Dear Nick,

I understood it less as a flag claiming my post to be "important" and more as an indicator for this to be a problem that I have rather than a comment. If suggested, I shall keep my queries without any flag in the future.
Thanks for bringing it to my attention.

Best,
Pranav
Nick Cox

Join Date: Mar 2014

Posts: 35711
#4

08 Aug 2017, 10:51

I think I see what you want to do. But almost all threads here start as problems. There are many posts that are announcements of meetings, courses, new programs, even a new version of Stata, etc., but perhaps less than 10% in total. So, "here is a problem" is the default situation.
Pranav Garg

Join Date: Jul 2017

Posts: 41
#5

08 Aug 2017, 10:58

Understood. I shall not flag it. Sir, I do not anticipate my problem to be a major one at all!
I was requesting guidance from anyone who has worked with this kind of threshold regression.

I am independently reading on how to solve my issue myself (consulting codes provided by Bruce E. Hansen), and shall update my solution on my thread when I find one.
River Huang

Join Date: Mar 2016

Posts: 1908
#6

09 Aug 2017, 02:05

Please search xthreg, and see if this is what you want.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)

Pranav Garg

Join Date: Jul 2017
Posts: 41
#7

14 Aug 2017, 06:15

Dear Sir,

I have read your presentation on this topic. It was very helpful in giving me theoretical clarity.
The single-threshold model described by Hansen is indeed what I want. I want to estimate the threshold unemployment level that causes a jump in reelection probability.
However, as I had written earlier, I have a balanced panel with gaps in the time variable (since it is election data with only two points, say 2007 and 2012).
I am unsure if xthreg works on a panel with such gaps.
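
Whether gaps in the time index matter for xthreg is not something this post settles. If one wanted to rule the gaps out as a cause, a minimal sketch would be to re-index the two election years as consecutive periods before declaring the panel. The variable names circ_id and YR_current are the ones used here; period is a hypothetical new variable.

```stata
* Sketch only: map the election years 2007 and 2012 to periods 1 and 2.
* "period" is a hypothetical variable name, not part of the original data.
egen period = group(YR_current)
xtset circ_id period
xtdescribe
```

After this, xtset should no longer report gaps, since the time variable takes consecutive integer values.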

I ran the following code. The error message in xthreg is described below.

Code:

. xtset circ_id YR_current
       panel variable:  circ_id (strongly balanced)
        time variable:  YR_current, 2007 to 2012, but with gaps
                delta:  1 unit

. xtdescribe 

 circ_id:  1, 2, ..., 555                                    n =        522
YR_current:  2007, 2012, ..., 2012                           T =          2
           Delta(YR_current) = 1 unit
           Span(YR_current)  = 6 periods
           (circ_id*YR_current uniquely identifies each observation)

Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                         2       2       2         2         2       2       2

     Freq.  Percent    Cum. |  Pattern*
 ---------------------------+----------
      522    100.00  100.00 |  11
 ---------------------------+----------
      522    100.00         |  XX
 --------------------------------------
 *Each column represents 5 periods.


. xthreg reelect_nom unemp_current unemp_prev, rx(unemp_current) qx(unemp_current) thnum(1) trim(0.05)
Estimating  the  threshold  parameters:   1st ......                 thest():  3200  conformability error
                thestm():     -  function returned error
                 <istmt>:     -  function returned error
r(3200);

It could be a very simple error on my part.
I thank you for your time.

Best,
Pranav


River Huang

Join Date: Mar 2016

Posts: 1908
#8

14 Aug 2017, 18:08

Please drop `unemp_current' from the linear part; it already enters the model through rx().

Code:

xthreg reelect_nom unemp_prev, rx(unemp_current) qx(unemp_current) thnum(1) trim(0.05)

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
Pranav Garg

Join Date: Jul 2017

Posts: 41
#9

15 Aug 2017, 10:31

Thank you! It works well.

Pranav Garg

Join Date: Jul 2017
Posts: 41

#10

20 Aug 2017, 17:01

Dear Prof. Huang,

This is a somewhat delayed response to the threshold model I have been trying to estimate.
As I had mentioned earlier, I am trying to test whether there is a threshold effect (a jump or slope change) at a certain level of current unemployment in the municipality (controlling for the previous year's unemployment) on the re-election probability of the representative. For example, below 9%, one could expect the slope to become steeper and the relation between unemployment and re-election to become stronger.

I am currently wondering if there is such an effect in my data in the first place, and if so what is the best method to test for it?

I have made the following attempts:
1) Having read the paper "Sample Splitting and Threshold Estimation" by Hansen (2000), I used his thresholdtest code and got the following result:

Code:

thresholdtest reelect_nom unemp_current unemp_prev, q(unemp_current) trim_per(0.10) rep(5000)
graph rename test_thershold

/*
Test of Null of No Threshold Against Alternative of Threshold
Allowing Heteroskedastic Errors (White Corrected)
 
______________________________________________________________________
Number of Bootstrap Replications:  5000
Trimming Percentage:               .1
 
Threshold Estimate:               .068385556
LM-test for no threshold:         10.8346427
Bootstrap P-Value:                .193
______________________________________________________________________

Given the p-value of 0.193, we cannot reject the null hypothesis of no threshold effect?
*/

The bootstrap P-value seems to suggest no threshold effect. But I saw how Hansen treated his result (see page 13 of his paper), and he proceeds to say there is a threshold effect even at a P-value of 0.9.
I tried to read elsewhere, but got confused about how to interpret this P-value and what the LM test value signifies.

Could you kindly give me some clarity, or suggest what further test to conduct?
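
For reference, a bootstrap P-value of this kind is conventionally the share of bootstrap replications whose test statistic is at least as large as the statistic from the actual sample. The formula below assumes that thresholdtest follows this convention, which has not been checked against its source code. The symbols B (number of replications) and LM*_b (the statistic in replication b) are notation introduced here for illustration.

```latex
\hat{p} \;=\; \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\!\left\{ LM^{*}_{b} \ge LM \right\}
```

Under that assumption, B = 5000 and a P-value of .193 would mean that about 0.193 x 5000 = 965 replications generated under the null of no threshold produced an LM statistic of 10.83 or more. A statistic that is matched or exceeded that often under the null is not unusual, which is why the null is not rejected at conventional levels.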

2) I broke current unemployment (which ranges from 4% to 17%) into brackets of 1 percentage point, created 14 dummies, and regressed re-election on them to see which was the most significant. The result is as follows:

Code:

gen unemp_cat = string(floor(100*unemp_current)/100, "%03.2f")
encode unemp_cat, gen(unemp_cat1)

tabulate unemp_cat1, generate(cat)

reg reelect_nom cat*, robust

/*
note: cat13 omitted because of collinearity

Linear regression                                      Number of obs =    1077
                                                       F( 12,  1063) =       .
                                                       Prob > F      =       .
                                                       R-squared     =  0.0262
                                                       Root MSE      =  .48017

------------------------------------------------------------------------------
             |               Robust
 reelect_nom |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cat1 |   .8333333    .108289     7.70   0.000     .6208489    1.045818       cat1 is 4%
        cat2 |   .6285714   .0581309    10.81   0.000     .5145071    .7426358
        cat3 |   .7313433   .0385432    18.97   0.000     .6557138    .8069728
        cat4 |   .6568627   .0334578    19.63   0.000     .5912119    .7225136
        cat5 |   .6227273   .0328932    18.93   0.000     .5581842    .6872703
        cat6 |   .5714286   .0384308    14.87   0.000     .4960198    .6468374
        cat7 |    .612069   .0455396    13.44   0.000     .5227112    .7014268
        cat8 |   .7142857   .0572892    12.47   0.000     .6018729    .8266985   which would mean 11%
        cat9 |   .4615385   .0695858     6.63   0.000     .3249973    .5980797
       cat10 |   .4736842   .1153007     4.11   0.000     .2474413    .6999271
       cat11 |   .3571429   .1289007     2.77   0.006     .1042141    .6100717
       cat12 |   .6666667   .2739519     2.43   0.015     .1291187    1.204215
       cat13 |          0  (omitted)
       cat14 |          1          .        .       .            .           .
       _cons |  -1.37e-12          .        .       .            .           .
------------------------------------------------------------------------------
*/

Based on this, I concluded that 11% is the first jump (from the 0.40 range to the 0.60+ range) and thus could be the threshold.
But I am not convinced, since by the same rationale I cannot explain why cat3 would not be one too.
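
One reason the bracket-by-bracket comparison is hard to read is that the standard errors differ a great deal across brackets (about 0.03 for cat5 but 0.27 for cat12 in the table above), so jumps near either end of the range may be noise. As a hedged sketch of an alternative, the same cell means can be obtained without a dropped category, and a local linear smooth with a confidence band can stand in for the eyeball comparison. The variable names are the ones created in the code above; the choice of a degree-1 smooth is an assumption.

```stata
* Sketch only: bracket means with no constant, so no category is omitted.
regress reelect_nom ibn.unemp_cat1, noconstant vce(robust)

* Local linear smooth of re-election on current unemployment with a 95% band.
lpoly reelect_nom unemp_current, degree(1) ci noscatter
```

If the confidence band is wide enough to contain a straight line over the whole range, the graph offers little support for a kink at any particular unemployment level.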

3) Finally, the result of xthreg:

Code:

xthreg reelect_nom unemp_prev, rx(unemp_current) qx(unemp_current) thnum(1) trim(0.05)

/*
Estimating  the  threshold  parameters:   1st ......  Done

Threshold estimator (level = 95):
-----------------------------------------------------
     model |    Threshold         Lower         Upper
-----------+-----------------------------------------
      Th-1 |       0.0926        0.0912        0.0927
-----------------------------------------------------

Fixed-effects (within) regression               Number of obs      =      1044
Group variable: circ_id                         Number of groups   =       522

R-sq:  within  = 0.0563                         Obs per group: min =         2
       between = 0.0000                                        avg =       2.0
       overall = 0.0233                                        max =         2

                                                F(3,519)           =     10.32
corr(u_i, Xb)  = -0.0998                        Prob > F           =    0.0000

--------------------------------------------------------------------------------------
         reelect_nom |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
          unemp_prev |   29.67854   9.679224     3.07   0.002     10.66326    48.69381
                     |
_cat#c.unemp_current |
                  0  |  -34.69623   7.974088    -4.35   0.000    -50.36169   -19.03077
                  1  |  -33.06385   7.841573    -4.22   0.000    -48.46898   -17.65873
                     |
               _cons |   1.050465   .2701395     3.89   0.000     .5197636    1.581166
---------------------+----------------------------------------------------------------
             sigma_u |  .31632067
             sigma_e |  .51170116
                 rho |  .27648422   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------
F test that all u_i=0:     F(521, 519) =     0.75            Prob > F = 0.9994
*/

If I infer correctly, the two regime slopes (-34.70 and -33.06) differ very little, which could suggest no threshold effect. What are your views, Sir?

To conclude: I do not want to impose a threshold if there is not one. I am still learning these topics, which I first encountered two weeks ago, and I would not want to infer something wrong.

I understand that, without access to my data, you can only make a rough guess based on my regression results.
I also understand if that is not enough for you to help me.

Still, any inputs on how to test for thresholds, and any comments on my inferences, would be greatly appreciated.

Thank you,
Pranav


River Huang

Join Date: Mar 2016
Posts: 1908

#11

20 Aug 2017, 19:49

1. There is no threshold effect, as the null hypothesis of linearity cannot be rejected.
2. I don't understand what you are doing!
3. You are missing something:

Code:

. // single threshold model: bootstrap iterations = 100 (saving time) 
. xthreg i q1 q2 q3 d1 qd1, rx(c1) qx(d1) thnum(1) grid(400) trim(0.05) bs(100) vce(robust)

Code:

Threshold estimator (level = 95):
-----------------------------------------------------
     model |    Threshold         Lower         Upper
-----------+-----------------------------------------
      Th-1 |       0.0158        0.0141        0.0169
-----------------------------------------------------

Threshold effect test (bootstrap = 100):
-------------------------------------------------------------------------------
 Threshold |       RSS        MSE      Fstat    Prob   Crit10    Crit5    Crit1
-----------+-------------------------------------------------------------------
    Single |   17.7855     0.0023      33.56  0.0000  11.7556  14.4108  16.3428
-------------------------------------------------------------------------------

Fixed-effects (within) regression               Number of obs      =      7910
Group variable: id                              Number of groups   =       565

R-sq:  within  = 0.0949                         Obs per group: min =        14
       between = 0.0687                                        avg =      14.0
       overall = 0.0656                                        max =        14

                                                F(7,564)           =     45.95
corr(u_i, Xb)  = -0.3989                        Prob > F           =    0.0000

                                   (Std. Err. adjusted for 565 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           i |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          q1 |   .0105597   .0019296     5.47   0.000     .0067697    .0143498
          q2 |    -.02033   .0054963    -3.70   0.000    -.0311258   -.0095342
          q3 |   .0010817   .0003509     3.08   0.002     .0003925    .0017709
          d1 |  -.0229306   .0056554    -4.05   0.000    -.0340388   -.0118224
         qd1 |   .0007571   .0024009     0.32   0.753    -.0039586    .0054728
             |
   _cat#c.c1 |
          0  |   .0556704   .0089264     6.24   0.000     .0381373    .0732036
          1  |   .0858644    .011852     7.24   0.000      .062585    .1091439
             |
       _cons |   .0628647   .0029855    21.06   0.000     .0570006    .0687287
-------------+----------------------------------------------------------------
     sigma_u |  .03985181
     sigma_e |  .04923164
         rho |  .39586195   (fraction of variance due to u_i)
------------------------------------------------------------------------------

In particular, the result

Code:

Threshold effect test (bootstrap = 100):

is missing from your output, so we are unable to conclude whether threshold effects exist. Specify bs(), as in the example above, to obtain it.

Ho-Chuan (River) Huang
Stata 19.0, MP(4)


Pranav Garg

Join Date: Jul 2017
Posts: 41

#12

21 Aug 2017, 03:28

Thank you Professor.

I ran the following code:

Code:

xthreg reelect_nom, rx(unemp_current) qx(unemp_current) thnum(1) grid(400) trim(0.05) bs(100) vce(robust)
/*

Estimating  the  threshold  parameters:   1st ......  Done
Boostrap for single threshold
.................................................. +   50
.................................................. +  100

Threshold estimator (level = 95):
-----------------------------------------------------
     model |    Threshold         Lower         Upper
-----------+-----------------------------------------
      Th-1 |       0.0926        0.0912        0.0927
-----------------------------------------------------

Threshold effect test (bootstrap = 100):
-------------------------------------------------------------------------------
 Threshold |       RSS        MSE      Fstat    Prob   Crit10    Crit5    Crit1
-----------+-------------------------------------------------------------------
    Single |  138.1625     0.1326      10.00  0.4800  19.9782  22.6152  30.4260
-------------------------------------------------------------------------------

Fixed-effects (within) regression               Number of obs      =      1044
Group variable: circ_id                         Number of groups   =       522

R-sq:  within  = 0.0392                         Obs per group: min =         2
       between = 0.0021                                        avg =       2.0
       overall = 0.0082                                        max =         2

                                                F(2,521)           =     11.16
corr(u_i, Xb)  = -0.3796                        Prob > F           =    0.0000

                                      (Std. Err. adjusted for 522 clusters in circ_id)
--------------------------------------------------------------------------------------
                     |               Robust
         reelect_nom |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
_cat#c.unemp_current |
                  0  |  -11.57698   2.571778    -4.50   0.000    -16.62931   -6.524652
                  1  |  -9.931916   2.102775    -4.72   0.000    -14.06288   -5.800956
                     |
               _cons |   1.576532   .2054096     7.68   0.000     1.172999    1.980065
---------------------+----------------------------------------------------------------
             sigma_u |  .34220713
             sigma_e |  .51581838
                 rho |  .30561996   (fraction of variance due to u_i)
--------------------------------------------------------------------------------------
*/

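With an F statistic of 10.00, below even the 10% critical value of 19.98, and a bootstrap P-value of 0.48, I conclude that there is no threshold effect in these data. Is that correct?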

2) was a very badly written explanation, my apologies.

Thank you for your time. I greatly appreciate being able to write to you on this directly.

Best,
Pranav


River Huang

Join Date: Mar 2016

Posts: 1908
#13

21 Aug 2017, 03:56

Well, there is no threshold (effect) in your regression. But you might want to try

Code:

xthreg reelect_nom, rx(unemp_prev) qx(unemp_prev) thnum(1) grid(400) trim(0.05) bs(100) vce(robust)

Ho-Chuan (River) Huang
Stata 19.0, MP(4)
Pranav Garg

Join Date: Jul 2017

Posts: 41
#14

21 Aug 2017, 05:18

Thank you.

Pranav
Bui Thu Ha

Join Date: Oct 2017

Posts: 1
#15

10 Oct 2017, 01:03

Hi,

How can I allow for different intercepts with xthreg?

Thanks
