
  • Estimation Table

    I am trying to estimate a growth model as a Markov-switching AR(1) model with two states, recession and expansion, for several alternative specifications, with one-step-ahead prediction. All variables in the model are endogenous with a single lag. The theoretical background is a model with a latent unit root (AR(1)).
    State1 is recession and state2 is expansion.

    While I can easily estimate the models, I cannot generate a publication-quality table.
    I need to collect several statistics for each model and gather them all in one table.

    I have four models to present.
    For the first model named Alternative 1,
    I run
    Code:
    mswitch ar gdp_g var1, switch(var1) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    estat duration // to calculate the average length of recession periods and expansion periods

    Var1 is labeled as alternative 1.


    For model 2, named Alternative 2,
    Code:
    mswitch ar gdp_g var2, switch(var2) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    estat duration // to calculate the average length of recession periods and expansion periods
    Var2 is labeled as alternative 2.


    For model 3 named Alternative 3,
    Code:
    mswitch ar gdp_g var3, switch(var3) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    estat duration // to calculate the average length of recession periods and expansion periods
    Var3 is labeled as alternative 3.


    For model 4, named Alternative 4,
    Code:
    mswitch ar gdp_g var4, switch(var4) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    estat duration // to calculate the average length of recession periods and expansion periods
    Var4 is labeled as alternative 4.

    So, I need to gather the following statistics into a publication-quality table:

    0. Recession Expected Duration
    1. Expansion Expected Duration
    2. AR(1) Coefficient
    3. AR(1) Standard Error
    4. Log-Likelihood
    5. AIC
    6. BIC
    7. Recession Mean
    8. Recession Std Dev
    9. Expansion Mean
    10. Expansion Std Dev
    11. Transition Matrix (Recession)
    12. Transition Matrix (Expansion)
    13. RMSE
    14. R^2
    15. Directional Accuracy
    16. Recession Coefficient
    17. Recession Standard Error
    18. Recession T-value
    19. Recession P-value
    20. Expansion Coefficient
    21. Expansion Standard Error
    22. Expansion T-value
    23. Expansion P-value
    24. Cov(p[0->0], p[0->0])
    25. Cov(p[0->1], p[0->1])
    26. Cov(p[1->1], p[1->1])
    27. p[2->0]
    28. p[2->1]
    29. const[0]
    30. const[2]
    31. x1[0]
    32. x1[1]
    33. sigma2
    34. p[0->0]
    35. p[0->1]
    36. p[1->1]



    A data sample is below.
    For the layout, the variables (models) should be in the columns and the statistics in the rows.


    Thanks for any help!


    Mario

    Code:
    * Example generated by -dataex-. For more info, type help    dataex
    clear
    input float(year gdp_g var1 var2 var3 var4)
    1990 .25382978  .05228972  -.3444368 .27984852 .7201515
    1991 -.2125532    .935702   .3012117  .3284199 .6465201
    1992   .317234  4.1132245  3.1708634  .3083437  .691594
    1993 1.3534043   5.108476  2.5406225 .24354993 .7332776
    1994  2.612128   3.425388   1.768389 .23602992 .7400631
    1995  3.358936   3.972118  2.3609061 .27747253 .7225274
    1996 2.3597872   6.033151   4.661421   .325723  .674277
    1997 3.0414894   3.793532   3.347975 .30558395 .6757603
    1998  1.793617  3.7508705   3.940954 .29313186 .6818681
    1999 1.8717022   2.980336   3.343685 .28042582 .7138736
    2000  3.385532   2.734389   1.775652  .2688356 .7311644
    2001 2.0512767   .7556655    1.75466  .2554258 .7445742
    2002  2.913404   1.386648  2.0334747  .2105769  .789423
    2003  3.017021  1.5340753  1.0089021  .1955357 .8044643
    2004  3.321915    1.81795   .5531137  .2408904 .7591096
    2005 2.6668086 -.29726213  -.7515011  .2480769 .7519231
    2006  2.732766  -2.838431 -2.9404194  .2673077 .7326923
    2007 2.8740425 -1.5961655 -1.8205668 .26824456 .7317554
    2008 .04914893  -1.881141  -3.931049  .2303533 .7696467
    2009 -.9074468  -.8101007  -1.919522 .23055235 .7694477
    2010  2.887234 -2.2245317 -1.8109186 .29588655 .7041135
    2011 1.4131914  -1.431529 -1.4509507  .3009355 .6958717
    2012  .9474468 -2.6042926 -3.2060516 .25005552 .7229174
    2013 1.3274468  -.8640846 -2.6833265 .20782596 .7834868
    2014  .8512766  -.6784925  -2.421195 .22527473 .7747253
    end


    cc @Justin Niakamal @Scott Merryman @Jeff Pitblado
    Last edited by Mario Ferri; 07 Dec 2023, 08:02.

  • #2

    Not all the statistics you reference are produced by mswitch (that I
    can tell), and I do not understand your notation for others.

    For BIC, mswitch provides SBIC.

    You reference two transition matrices, and provide some notation for
    what look like transition probabilities (i.e. p[0->0]), but your
    notation does not line up with what is provided in the documentation. I
    see no mention of covariances between transition probabilities in the
    documentation or list of stored results.

    Neither RMSE nor R^2 is listed among the stored results.

    What do you mean by "Directional Accuracy"?

    What are const[0] and const[2]? I assume they are the
    intercepts of the state equations.

    What are x1[0] and x1[1]?

    I'm going to assume p[0->0], p[0->1], and p[1->1],
    are 3 of the four transition probabilities. The manual uses indices
    starting with 1 instead of 0.

    In the following I fit 3 of the 4 models, since the last one did not
    fit. Also note that I removed the variable from the switch()
    option, since it is not allowed to be in both places and the models would
    not fit when it was placed in the option.

    Here is what I came up with given the data and models you provide.

    Code:
    clear all
    input float(year gdp_g var1 var2 var3 var4)
    1990 .25382978  .05228972  -.3444368 .27984852 .7201515
    1991 -.2125532    .935702   .3012117  .3284199 .6465201
    1992   .317234  4.1132245  3.1708634  .3083437  .691594
    1993 1.3534043   5.108476  2.5406225 .24354993 .7332776
    1994  2.612128   3.425388   1.768389 .23602992 .7400631
    1995  3.358936   3.972118  2.3609061 .27747253 .7225274
    1996 2.3597872   6.033151   4.661421   .325723  .674277
    1997 3.0414894   3.793532   3.347975 .30558395 .6757603
    1998  1.793617  3.7508705   3.940954 .29313186 .6818681
    1999 1.8717022   2.980336   3.343685 .28042582 .7138736
    2000  3.385532   2.734389   1.775652  .2688356 .7311644
    2001 2.0512767   .7556655    1.75466  .2554258 .7445742
    2002  2.913404   1.386648  2.0334747  .2105769  .789423
    2003  3.017021  1.5340753  1.0089021  .1955357 .8044643
    2004  3.321915    1.81795   .5531137  .2408904 .7591096
    2005 2.6668086 -.29726213  -.7515011  .2480769 .7519231
    2006  2.732766  -2.838431 -2.9404194  .2673077 .7326923
    2007 2.8740425 -1.5961655 -1.8205668 .26824456 .7317554
    2008 .04914893  -1.881141  -3.931049  .2303533 .7696467
    2009 -.9074468  -.8101007  -1.919522 .23055235 .7694477
    2010  2.887234 -2.2245317 -1.8109186 .29588655 .7041135
    2011 1.4131914  -1.431529 -1.4509507  .3009355 .6958717
    2012  .9474468 -2.6042926 -3.2060516 .25005552 .7229174
    2013 1.3274468  -.8640846 -2.6833265 .20782596 .7834868
    2014  .8512766  -.6784925  -2.421195 .22527473 .7747253
    end
    
    tsset year
    label var var1 "Alternative 1"
    label var var2 "Alternative 2"
    label var var3 "Alternative 3"
    label var var4 "Alternative 4"
    
    collect clear
    
    * collect prefix, use dimension -var[var1]- as an extra tag for these
    * estimation results since it automatically grabs the variable's label
    collect, tags(var[var1]): ///
        mswitch ar gdp_g var1, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    * list the estimated parameters
    matrix list e(b)
    * some transformations are made, so list r(table) to see how they are
    * labelled in the column names (-colname- dimension in -collect-)
    matrix list r(table)
    * get and tag the expected durations
    estat duration
    collect get ///
        state1_duration=(r(d1)) ///
        state2_duration=(r(d2)) ///
        , tags(var[var1])
    
    collect, tags(var[var2]): ///
        mswitch ar gdp_g var2, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    estat duration
    collect get ///
        state1_duration=(r(d1)) ///
        state2_duration=(r(d2)) ///
        , tags(var[var2])
    
    collect, tags(var[var3]): ///
        mswitch ar gdp_g var3, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    estat duration
    collect get ///
        state1_duration=(r(d1)) ///
        state2_duration=(r(d2)) ///
        , tags(var[var3])
    
    if (0) {
    // failed to fit
    collect, tags(var[var4]): ///
        mswitch ar gdp_g var4, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
    estat duration
    collect get ///
        state1_duration=(r(d1)) ///
        state2_duration=(r(d2)) ///
        , tags(var[var4])
    }
    
    * add some custom labels
    collect label levels result ///
        state1_duration "Recession Expected Duration" ///
        state2_duration "Expansion Expected Duration" ///
        , modify
    
    collect label levels coleq ///
        State1 "Recession" ///
        State2 "Expansion" ///
        , modify
    
    * build the layout, roughly in the order requested
    collect layout ///
        (    result[ ///
                state1_duration ///
                state2_duration ///
            ] ///
            coleq[State1 State2]#colname#result[_r_b _r_se] ///
            result[ll] ///
            result[aic] ///
            result[sbic] ///
            coleq[gdp_g]#result[_r_b _r_se _r_z _r_p] ///
            coleq[_diparm1]#colname[sigma2 p11 p21]#result[_r_b] ///
        ) ///
        (var)
    Here is the resulting table.
    Code:
    -----------------------------------------------------------------------
                                | Alternative 1 Alternative 2 Alternative 3
    ----------------------------+------------------------------------------
    Recession Expected Duration |      5.174398      3.262773      4.596756
    Expansion Expected Duration |      3.084882      1.154258       3.52197
    Recession                   |
      L.ar                      |
        Coefficient             |       1.34284      .8736163       1.36247
        Std. error              |      .2498725      .2210159       .306594
      Intercept                 |
        Coefficient             |      .3974298      1.645688      1.319421
        Std. error              |      .3516664      .3414381      .5831299
    Expansion                   |
      L.ar                      |
        Coefficient             |     -.2376953      .0004623      -.240308
        Std. error              |      .0481669      .0785129      .0873749
      Intercept                 |
        Coefficient             |      2.830824      3.051188      3.559341
        Std. error              |      .0528537       .109267      .4735527
    Log likelihood              |     -27.16072     -26.51169     -26.96121
    AIC                         |      3.013393      2.959307      2.996768
    SBIC                        |      3.455164      3.401077      3.438538
    gdp_g                       |
      Coefficient               |      .0380146       .080339     -2.752225
      Std. error                |       .021573      .0401036       1.81478
      z                         |          1.76          2.00         -1.52
      p-value                   |         0.078         0.045         0.129
    _diparm1                    |
      sigma2                    |
        Coefficient             |      .1805616      .1642792      .2439628
      p11                       |
        Coefficient             |      .8067408      .6935123      .7824553
      p21                       |
        Coefficient             |      .3241615      .8663578       .283932
    -----------------------------------------------------------------------
    See the documentation and examples in [TABLES] for additional
    table edits, such as hiding "Coefficient" via
    Code:
    . collect style header result[_r_b], level(hide)
    
    . collect preview
    
    -----------------------------------------------------------------------
                                | Alternative 1 Alternative 2 Alternative 3
    ----------------------------+------------------------------------------
    Recession Expected Duration |      5.174398      3.262773      4.596756
    Expansion Expected Duration |      3.084882      1.154258       3.52197
    Recession                   |
      L.ar                      |       1.34284      .8736163       1.36247
        Std. error              |      .2498725      .2210159       .306594
      Intercept                 |      .3974298      1.645688      1.319421
        Std. error              |      .3516664      .3414381      .5831299
    Expansion                   |
      L.ar                      |     -.2376953      .0004623      -.240308
        Std. error              |      .0481669      .0785129      .0873749
      Intercept                 |      2.830824      3.051188      3.559341
        Std. error              |      .0528537       .109267      .4735527
    Log likelihood              |     -27.16072     -26.51169     -26.96121
    AIC                         |      3.013393      2.959307      2.996768
    SBIC                        |      3.455164      3.401077      3.438538
    gdp_g                       |      .0380146       .080339     -2.752225
      Std. error                |       .021573      .0401036       1.81478
      z                         |          1.76          2.00         -1.52
      p-value                   |         0.078         0.045         0.129
    _diparm1                    |
      sigma2                    |      .1805616      .1642792      .2439628
      p11                       |      .8067408      .6935123      .7824553
      p21                       |      .3241615      .8663578       .283932
    -----------------------------------------------------------------------



    • #3
      Thank you very much for your help, Jeff Pitblado (StataCorp)!

      For the most part, the table is as I want it to be. There are some statistics that, as you pointed out, are not accurate in their definition.

      I would need in addition the following:

      mswitch provides postestimation prediction through predict, so I assume R-squared and RMSE can be calculated from the predictions and collected from there.

      Code:
      predict yhat*, smethod(filter) // a new stub is needed, since var1-var4 already exist in the data
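A minimal sketch of how RMSE and R-squared could be computed once the actual and fitted values sit side by side in the dataset. The toy data and the names y, yhat, and err2 are hypothetical; after mswitch, the fitted series would come from predict rather than from input. R-squared is taken here as 1 - SSE/SST, which is only one of several possible definitions for a nonlinear model.

```stata
* toy data: y is the actual series, yhat the fitted series (hypothetical names)
clear
input year y yhat
2001 1 1.5
2002 2 1.5
2003 3 3.5
2004 4 3.5
end

* squared prediction errors
generate double err2 = (y - yhat)^2

* RMSE is the square root of the mean squared error
summarize err2, meanonly
scalar rmse = sqrt(r(mean))
scalar sse  = r(sum)

* total sum of squares of the actual series around its mean
summarize y
scalar sst = r(Var) * (r(N) - 1)

* R-squared defined as 1 - SSE/SST (an assumption, see above)
scalar r2 = 1 - sse/sst

display "RMSE = " rmse "   R-squared = " r2
```

The two scalars could then presumably be added to the table with the same collect get pattern used for the durations in #2, for example rmse=(rmse) and r2=(r2) with the matching tags() option.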
      On the covariances, as far as I have seen, they can be collected from
      Code:
      estat vce
      which is one of the generic postestimation commands and also applies in this context, as well as from
      Code:
      estat transition

      The covariance information is crucial for the framework, because it shows the potential correlations between these transition probabilities.


      Also, I would need these:

      Recession Mean: The mean value of the variable during recession periods.
      Recession Std Dev: The standard deviation of the variable during recession periods.
      Expansion Mean: The mean value of the variable during expansion periods.
      Expansion Std Dev: The standard deviation of the variable during expansion periods.
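A minimal sketch of how these four regime statistics could be computed, assuming each observation is assigned to the regime with the higher estimated state probability. The toy data, the variable name pr1 (probability of being in state 1, recession), and the 0.5 cut-off are all assumptions; after mswitch, pr1 would come from predict with the pr statistic instead of from input.

```stata
* toy data: y is the variable of interest, pr1 the probability of recession
clear
input year y pr1
2001 -1.0 .9
2002 -0.5 .8
2003  2.0 .1
2004  3.0 .2
2005  2.5 .3
end

* classify each year: recession if the state-1 probability exceeds 0.5
generate byte recession = pr1 > .5 if !missing(pr1)

* mean and standard deviation during recession periods
summarize y if recession == 1
scalar rec_mean = r(mean)
scalar rec_sd   = r(sd)

* mean and standard deviation during expansion periods
summarize y if recession == 0
scalar exp_mean = r(mean)
scalar exp_sd   = r(sd)

display "Recession: mean = " rec_mean "  sd = " rec_sd
display "Expansion: mean = " exp_mean "  sd = " exp_sd
```

Whether filtered or smoothed probabilities are the better basis for the classification depends on the application; the sketch works the same way with either.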


      On directional accuracy:

      MDA measures how often the predicted direction of a time series matches the actual direction of the time series. To calculate MDA, you look at the signs of the differences between consecutive actual values and the signs of the differences between consecutive predicted values. If the signs are the same (i.e., both positive or both negative), that means the predicted direction matches the actual direction. You count how many times this happens, and divide by the total number of possible comparisons (which is one less than the length of the time series, because you can’t compare the first value to anything). This gives you the MDA value, which ranges from 0 to 1, with 1 indicating perfect directional accuracy.
      MDA = Number of times the signs of the differences between consecutive actual values are the same as the signs of the differences between consecutive predicted values / (N – 1)
      where N is the length of the time series.
      In mathematical notation, this can be expressed as:
      MDA = (1 / (N – 1)) * sum(i=2 to N) 1[sign(actual[i] – actual[i-1]) == sign(predicted[i] – predicted[i-1])]
      where sign(x) returns the sign of x (i.e., -1 if x < 0, 0 if x == 0, and 1 if x > 0) and 1[.] equals 1 if the condition inside holds and 0 otherwise.

      See

      https://datasciencestunt.com/mean-di...ries-forecast/

      This is a postestimation evaluation measure. I did not find a built-in way to do it in Stata.

      I guess it needs to be calculated after fitting the model, probably by applying the formula above.
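A minimal sketch of the directional-accuracy calculation, counting the share of periods in which the actual and the predicted series move in the same direction. The toy data and the variable names actual, predicted, and hit are hypothetical; with a fitted model, predicted would hold the model's predictions.

```stata
* toy data with hypothetical variable names
clear
input year actual predicted
2001 1.0 0.8
2002 1.5 1.1
2003 1.2 1.3
2004 1.8 1.6
2005 1.7 1.9
end
tsset year

* hit = 1 if both series change in the same direction from t-1 to t, 0 otherwise;
* the first year has no lagged value and is left missing
generate byte hit = sign(D.actual) == sign(D.predicted) if !missing(D.actual, D.predicted)

* the mean of the 0/1 indicator is the number of matches divided by N - 1
summarize hit, meanonly
scalar mda = r(mean)

display "MDA = " %5.3f mda
```

In the toy data the direction matches in 2 of the 4 possible comparisons, so the measure equals 0.5.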

      x1[0] and x1[1] can be ignored. For the rest, your interpretation is correct.

      It would also be nice to create, for each of the models, time-series variables in the dataset holding each regime's probabilities, each state's residuals, and each state's predicted values over time, for the next step of the analysis.

      The new table-creation feature in Stata is great, but I am still struggling with the tool.

      Many thanks again. I would be grateful for any further help you could provide.

      Best regards,

      Mario

      PS: Apologies for being late to thank you. I have been out of the office over the last few days.
      Last edited by Mario Ferri; 11 Dec 2023, 18:40.
