  • Threshold Regression Model (Panel with Endogenous Threshold)

    Dear Community,

    I am a user of Stata 13.
    I am trying to estimate the impact of unemployment on the probability of re-election, using an unbalanced panel of city-level elections held in 2002, 2008 and 2014. I am interested in conducting both non-parametric graphical analysis and threshold regression. I have consulted the following papers to understand threshold regression modeling:
    1. Structural Threshold Regression (Andros Kourtellos, Thanasis Stengos, Chih Ming Tan, 2011)
    2. Sample Splitting and Threshold Estimation (Hansen, 2000)

    As I understand it, Stata 13 provides a drop-down menu, which I have been using for the non-parametric analysis.
    However, I am trying to read and learn as much as I can to improve my estimates.
    I would greatly appreciate input from any academic or researcher with prior experience of this kind of problem, specifically on how best to estimate the threshold value.

    My econometric model is as follows:

    $$Y_{city,t} = B \cdot \text{unemployment}_{city,t} + A \cdot (\text{unemployment}_{city,t} - u_{threshold}) \cdot 1(\text{unemployment}_{city,t} > u_{threshold})$$
    where $1(\cdot)$ is the indicator function and I want to estimate the unknown threshold level of unemployment, $u_{threshold}$.
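
    One way to make the estimation of the unknown threshold concrete is a grid search: try many candidate thresholds, fit the model at each one, and keep the candidate with the smallest residual sum of squares, in the spirit of Hansen (2000). The following is only a minimal sketch under stated assumptions. The variable names reelect and unemp are hypothetical, the panel structure is ignored, and no inference on the threshold is attempted.

    ```stata
    * Sketch only: reelect and unemp are hypothetical variable names.
    * Grid search for the kink point u in
    *   reelect = B*unemp + A*(unemp - u)*1(unemp > u) + error
    * keeping the u that minimizes the residual sum of squares (RSS).
    scalar best_rss = .
    scalar best_u   = .
    forvalues p = 10/90 {
        * candidate threshold: the p-th percentile of unemp
        _pctile unemp, percentiles(`p')
        scalar u_cand = r(r1)
        * kink term; left missing where unemp is missing
        generate double kink = (unemp - scalar(u_cand)) * (unemp > scalar(u_cand)) if !missing(unemp)
        quietly regress reelect unemp kink
        * missing counts as larger than any number, so the first pass always updates
        if e(rss) < scalar(best_rss) {
            scalar best_rss = e(rss)
            scalar best_u   = scalar(u_cand)
        }
        drop kink
    }
    display "RSS-minimizing threshold: " scalar(best_u)
    ```

    The scalar() wrapper is used deliberately: a bare scalar name in an expression can be mistaken for an abbreviated variable name. The 10th to 90th percentile range plays the role of a trimming percentage and is an arbitrary choice here.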

    I shall update the post myself if I find a suitable solution.

    Thank you,
    Pranav



  • #2
    Although the software evidently lets you add a flag to your title, doing so says "this post is especially important", and I have to advise against that unless it's obviously the case.



    • #3
      Dear Nick,

      I understood the flag less as a claim that my post is "important" and more as an indicator that it describes a problem I have rather than a comment. If you advise so, I shall post my queries without any flag in the future.
      Thanks for bringing it to my attention.

      Best,
      Pranav



      • #4
        I think I see what you want to do. But almost all threads here start as problems. There are many posts that are announcements of meetings, courses, new programs, even a new version of Stata, etc., but perhaps less than 10% in total. So, "here is a problem" is the default situation.



        • #5
          Understood. I shall not flag it. Sir, I do not expect my problem to be a major one at all!
          I was requesting additional guidance from anyone who has worked with this kind of threshold regression.

          I am also reading independently on how to solve the issue myself (consulting the code provided by Bruce E. Hansen), and shall post my solution in this thread when I find one.



          • #6
            Please search xthreg, and see if this is what you want.

            Ho-Chuan (River) Huang
            Stata 17.0, MP(4)



            • #7
              Dear Sir,

              I have read your presentation on this topic. It was very helpful in giving me theoretical clarity.
              The single-threshold model described by Hansen is indeed what I want. I want to estimate the threshold unemployment level at which reelection probability jumps.
              However, as I wrote earlier, I have a balanced panel with gaps in time (since it is election data with only two time points, say 2007 and 2012).
              I am unsure whether xthreg works on a dataset with such gaps.

              I ran the following code. The error message in xthreg is described below.

              Code:
              . xtset circ_id YR_current
                     panel variable:  circ_id (strongly balanced)
                      time variable:  YR_current, 2007 to 2012, but with gaps
                              delta:  1 unit
              
              . xtdescribe 
              
               circ_id:  1, 2, ..., 555                                    n =        522
              YR_current:  2007, 2012, ..., 2012                           T =          2
                         Delta(YR_current) = 1 unit
                         Span(YR_current)  = 6 periods
                         (circ_id*YR_current uniquely identifies each observation)
              
              Distribution of T_i:   min      5%     25%       50%       75%     95%     max
                                       2       2       2         2         2       2       2
              
                   Freq.  Percent    Cum. |  Pattern*
               ---------------------------+----------
                    522    100.00  100.00 |  11
               ---------------------------+----------
                    522    100.00         |  XX
               --------------------------------------
               *Each column represents 5 periods.
              
              
              . xthreg reelect_nom unemp_current unemp_prev, rx(unemp_current) qx(unemp_current) thnum(1) trim(0.05)
              Estimating  the  threshold  parameters:   1st ......                 thest():  3200  conformability error
                              thestm():     -  function returned error
                               <istmt>:     -  function returned error
              r(3200);

              It could be a very simple error on my part.
              I thank you for your time.

              Best,
              Pranav



              • #8
                Please drop `unemp_current' from the linear part.
                Code:
                xthreg reelect_nom unemp_prev, rx(unemp_current) qx(unemp_current) thnum(1) trim(0.05)
                Ho-Chuan (River) Huang
                Stata 17.0, MP(4)
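
                A plausible reading of this advice, offered as an assumption rather than as a description of xthreg's internals: the variable in rx() is split into one slope per regime, and those regime-specific terms add up to the variable itself, so listing it in the linear part as well makes the regressors perfectly collinear. The sketch below illustrates this with plain regress, using the thread's variable names and a hypothetical cutoff of 0.09.

                ```stata
                * Sketch only: 0.09 is a hypothetical cutoff chosen for illustration.
                generate byte regime = unemp_current > 0.09 if !missing(unemp_current)

                * regime-specific copies of the regime-dependent variable
                generate double unemp_low  = unemp_current * (regime == 0)
                generate double unemp_high = unemp_current * (regime == 1)

                * unemp_low + unemp_high equals unemp_current exactly, so regress
                * reports one of the three terms as omitted because of collinearity
                regress reelect_nom unemp_current unemp_low unemp_high unemp_prev

                * dropping unemp_current from the linear part removes the redundancy
                regress reelect_nom unemp_low unemp_high unemp_prev
                ```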



                • #9
                  Thank you! It works well.



                  • #10
                    Dear Prof. Huang,

                    This is a somewhat delayed response to the threshold model I have been trying to estimate.
                    As I mentioned earlier, I am trying to test whether there is a threshold effect (a jump or a change in slope) at a certain level of current unemployment in the municipality, controlling for the previous year's unemployment, on the reelection probability of the representative. For instance, below 9% one could expect the slope to become steeper and the relation between unemployment and reelection to become stronger.

                    I am wondering whether there is such an effect in my data in the first place, and what the best method to test for it is.

                    I have made the following attempts:
                    1) Having read the paper "Sample Splitting and Threshold Estimation" by Hansen (2000), I used his thresholdtest code and got the following result:

                    Code:
                    thresholdtest reelect_nom unemp_current unemp_prev, q(unemp_current) trim_per(0.10) rep(5000)
                    graph rename test_thershold
                    
                    /*
                    Test of Null of No Threshold Against Alternative of Threshold
                    Allowing Heteroskedastic Errors (White Corrected)
                     
                    ______________________________________________________________________
                    Number of Bootstrap Replications:  5000
                    Trimming Percentage:               .1
                     
                    Threshold Estimate:               .068385556
                    LM-test for no threshold:         10.8346427
                    Bootstrap P-Value:                .193
                    ______________________________________________________________________
                    
                    Given the p-value of 0.193, we cannot reject the null hypothesis of no threshold effect?
                    */
                    The bootstrap p-value seems to suggest no threshold effect. However, Hansen (see page 13 of his paper) appears to proceed as though there is a threshold effect even with a p-value of 0.9.
                    I tried reading elsewhere, but I am confused about how to interpret this p-value and what the LM test statistic signifies.

                    Could you kindly give me some clarity, or suggest which further test to conduct?



                    2) I split current unemployment (which ranges from 4% to 17%) into brackets of 1 percentage point, created 14 dummies, and regressed reelection on them to see which was the most significant. The result is as follows:

                    Code:
                    gen unemp_cat = string(floor(100*unemp_current)/100, "%03.2f")
                    encode unemp_cat, gen(unemp_cat1)
                    
                    tabulate unemp_cat1, generate(cat)
                    
                    reg reelect_nom cat*, robust
                    
                    /*
                    note: cat13 omitted because of collinearity
                    
                    Linear regression                                      Number of obs =    1077
                                                                           F( 12,  1063) =       .
                                                                           Prob > F      =       .
                                                                           R-squared     =  0.0262
                                                                           Root MSE      =  .48017
                    
                    ------------------------------------------------------------------------------
                                 |               Robust
                     reelect_nom |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                            cat1 |   .8333333    .108289     7.70   0.000     .6208489    1.045818       cat1 is 4%
                            cat2 |   .6285714   .0581309    10.81   0.000     .5145071    .7426358
                            cat3 |   .7313433   .0385432    18.97   0.000     .6557138    .8069728
                            cat4 |   .6568627   .0334578    19.63   0.000     .5912119    .7225136
                            cat5 |   .6227273   .0328932    18.93   0.000     .5581842    .6872703
                            cat6 |   .5714286   .0384308    14.87   0.000     .4960198    .6468374
                            cat7 |    .612069   .0455396    13.44   0.000     .5227112    .7014268
                            cat8 |   .7142857   .0572892    12.47   0.000     .6018729    .8266985   which would mean 11%
                            cat9 |   .4615385   .0695858     6.63   0.000     .3249973    .5980797
                           cat10 |   .4736842   .1153007     4.11   0.000     .2474413    .6999271
                           cat11 |   .3571429   .1289007     2.77   0.006     .1042141    .6100717
                           cat12 |   .6666667   .2739519     2.43   0.015     .1291187    1.204215
                           cat13 |          0  (omitted)
                           cat14 |          1          .        .       .            .           .
                           _cons |  -1.37e-12          .        .       .            .           .
                    ------------------------------------------------------------------------------
                    */
                    Based on this, I concluded that 11% is the first jump (from the 0.40 range to the 0.60+ range) and thus could be the threshold.
                    But I am not convinced, since I cannot explain why cat3 would not qualify by the same rationale.
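
                    The bracket comparison above can also be written with factor variables, which avoids creating the dummies by hand and lets Stata report the difference between each bracket and the one below it. This is a minimal sketch reusing the thread's variable names; unemp_bin is a hypothetical new variable, and the approach remains descriptive rather than a formal threshold test.

                    ```stata
                    * Sketch only: unemp_bin is a hypothetical new variable.
                    * Integer index for 1-percentage-point brackets of unemployment.
                    * Values sitting exactly on a bracket edge may be misassigned
                    * because of floating-point rounding.
                    generate int unemp_bin = floor(100 * unemp_current) if !missing(unemp_current)

                    * One indicator per bracket, with the lowest bracket as the base
                    regress reelect_nom i.unemp_bin, vce(robust)

                    * Predicted reelection rate in each bracket (the bracket means)
                    margins unemp_bin

                    * Difference between each bracket and the previous one
                    contrast ar.unemp_bin
                    ```

                    Brackets with very few observations will produce imprecise or missing standard errors, so any apparent jump between two adjacent brackets should be read with that in mind.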


                    3) Finally, the result of xthreg:
                    Code:
                    xthreg reelect_nom unemp_prev, rx(unemp_current) qx(unemp_current) thnum(1) trim(0.05)
                    
                    /*
                    Estimating  the  threshold  parameters:   1st ......  Done
                    
                    Threshold estimator (level = 95):
                    -----------------------------------------------------
                         model |    Threshold         Lower         Upper
                    -----------+-----------------------------------------
                          Th-1 |       0.0926        0.0912        0.0927
                    -----------------------------------------------------
                    
                    Fixed-effects (within) regression               Number of obs      =      1044
                    Group variable: circ_id                         Number of groups   =       522
                    
                    R-sq:  within  = 0.0563                         Obs per group: min =         2
                           between = 0.0000                                        avg =       2.0
                           overall = 0.0233                                        max =         2
                    
                                                                    F(3,519)           =     10.32
                    corr(u_i, Xb)  = -0.0998                        Prob > F           =    0.0000
                    
                    --------------------------------------------------------------------------------------
                             reelect_nom |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    ---------------------+----------------------------------------------------------------
                              unemp_prev |   29.67854   9.679224     3.07   0.002     10.66326    48.69381
                                         |
                    _cat#c.unemp_current |
                                      0  |  -34.69623   7.974088    -4.35   0.000    -50.36169   -19.03077
                                      1  |  -33.06385   7.841573    -4.22   0.000    -48.46898   -17.65873
                                         |
                                   _cons |   1.050465   .2701395     3.89   0.000     .5197636    1.581166
                    ---------------------+----------------------------------------------------------------
                                 sigma_u |  .31632067
                                 sigma_e |  .51170116
                                     rho |  .27648422   (fraction of variance due to u_i)
                    --------------------------------------------------------------------------------------
                    F test that all u_i=0:     F(521, 519) =     0.75            Prob > F = 0.9994
                    */
                    If I infer correctly, the difference between the two regime slopes is very small, which could suggest no threshold effect. What are your views, Sir?



                    To conclude: I do not want to impose a threshold if there is not one. I am still learning these topics, which I encountered for the first time two weeks ago, and I would not want to infer something wrong.

                    I understand that, without access to my data, you can only make a rough guess based on my regression results.
                    I also understand if that is not enough for you to help me further.

                    But any input on how to test for thresholds, and any comments on my inferences, would be greatly appreciated.


                    Thank you,
                    Pranav



                    • #11
                      1. There is no threshold effect, as the null hypothesis of linearity cannot be rejected.
                      2. I don't understand what you are doing.
                      3. You are missing something:
                      Code:
                      . // single threshold model: bootstrap iterations = 100 (saving time) 
                      . xthreg i q1 q2 q3 d1 qd1, rx(c1) qx(d1) thnum(1) grid(400) trim(0.05) bs(100) vce(robust)
                      Code:
                      Threshold estimator (level = 95):
                      -----------------------------------------------------
                           model |    Threshold         Lower         Upper
                      -----------+-----------------------------------------
                            Th-1 |       0.0158        0.0141        0.0169
                      -----------------------------------------------------
                      
                      Threshold effect test (bootstrap = 100):
                      -------------------------------------------------------------------------------
                       Threshold |       RSS        MSE      Fstat    Prob   Crit10    Crit5    Crit1
                      -----------+-------------------------------------------------------------------
                          Single |   17.7855     0.0023      33.56  0.0000  11.7556  14.4108  16.3428
                      -------------------------------------------------------------------------------
                      
                      Fixed-effects (within) regression               Number of obs      =      7910
                      Group variable: id                              Number of groups   =       565
                      
                      R-sq:  within  = 0.0949                         Obs per group: min =        14
                             between = 0.0687                                        avg =      14.0
                             overall = 0.0656                                        max =        14
                      
                                                                      F(7,564)           =     45.95
                      corr(u_i, Xb)  = -0.3989                        Prob > F           =    0.0000
                      
                                                         (Std. Err. adjusted for 565 clusters in id)
                      ------------------------------------------------------------------------------
                                   |               Robust
                                 i |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                                q1 |   .0105597   .0019296     5.47   0.000     .0067697    .0143498
                                q2 |    -.02033   .0054963    -3.70   0.000    -.0311258   -.0095342
                                q3 |   .0010817   .0003509     3.08   0.002     .0003925    .0017709
                                d1 |  -.0229306   .0056554    -4.05   0.000    -.0340388   -.0118224
                               qd1 |   .0007571   .0024009     0.32   0.753    -.0039586    .0054728
                                   |
                         _cat#c.c1 |
                                0  |   .0556704   .0089264     6.24   0.000     .0381373    .0732036
                                1  |   .0858644    .011852     7.24   0.000      .062585    .1091439
                                   |
                             _cons |   .0628647   .0029855    21.06   0.000     .0570006    .0687287
                      -------------+----------------------------------------------------------------
                           sigma_u |  .03985181
                           sigma_e |  .04923164
                               rho |  .39586195   (fraction of variance due to u_i)
                      ------------------------------------------------------------------------------
                      In particular, your output is missing the result
                      Code:
                      Threshold effect test (bootstrap = 100):
                      so we are unable to conclude whether threshold effects exist.
                      Ho-Chuan (River) Huang
                      Stata 17.0, MP(4)



                      • #12
                        Thank you Professor.

                        I ran the following code:

                        Code:
                        xthreg reelect_nom, rx(unemp_current) qx(unemp_current) thnum(1) grid(400) trim(0.05) bs(100) vce(robust)
                        /*
                        
                        Estimating  the  threshold  parameters:   1st ......  Done
                        Boostrap for single threshold
                        .................................................. +   50
                        .................................................. +  100
                        
                        Threshold estimator (level = 95):
                        -----------------------------------------------------
                             model |    Threshold         Lower         Upper
                        -----------+-----------------------------------------
                              Th-1 |       0.0926        0.0912        0.0927
                        -----------------------------------------------------
                        
                        Threshold effect test (bootstrap = 100):
                        -------------------------------------------------------------------------------
                         Threshold |       RSS        MSE      Fstat    Prob   Crit10    Crit5    Crit1
                        -----------+-------------------------------------------------------------------
                            Single |  138.1625     0.1326      10.00  0.4800  19.9782  22.6152  30.4260
                        -------------------------------------------------------------------------------
                        
                        Fixed-effects (within) regression               Number of obs      =      1044
                        Group variable: circ_id                         Number of groups   =       522
                        
                        R-sq:  within  = 0.0392                         Obs per group: min =         2
                               between = 0.0021                                        avg =       2.0
                               overall = 0.0082                                        max =         2
                        
                                                                        F(2,521)           =     11.16
                        corr(u_i, Xb)  = -0.3796                        Prob > F           =    0.0000
                        
                                                              (Std. Err. adjusted for 522 clusters in circ_id)
                        --------------------------------------------------------------------------------------
                                             |               Robust
                                 reelect_nom |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                        ---------------------+----------------------------------------------------------------
                        _cat#c.unemp_current |
                                          0  |  -11.57698   2.571778    -4.50   0.000    -16.62931   -6.524652
                                          1  |  -9.931916   2.102775    -4.72   0.000    -14.06288   -5.800956
                                             |
                                       _cons |   1.576532   .2054096     7.68   0.000     1.172999    1.980065
                        ---------------------+----------------------------------------------------------------
                                     sigma_u |  .34220713
                                     sigma_e |  .51581838
                                         rho |  .30561996   (fraction of variance due to u_i)
                        --------------------------------------------------------------------------------------
                        */

                        Can I conclude that there is no threshold effect in this data?


                        My point 2) was very badly explained; my apologies.

                        Thank you for your time. I greatly appreciate being able to write to you on this directly.

                        Best,
                        Pranav



                        • #13
                          Well, there is no threshold (effect) in your regression. But you might want to try
                          Code:
                          xthreg reelect_nom, rx(unemp_prev) qx(unemp_prev) thnum(1) grid(400) trim(0.05) bs(100) vce(robust)
                          Ho-Chuan (River) Huang
                          Stata 17.0, MP(4)



                          • #14
                            Thank you.

                            Pranav



                            • #15
                              Hi,

                              How can I allow for a different intercept in each regime with xthreg?

                              Thanks
