Estimation Table

Mario Ferri

Join Date: Jul 2019

Posts: 190
#1

Estimation Table

07 Dec 2023, 07:54

I am estimating a growth model as a Markov-switching AR(1) model with one-step-ahead prediction, for several alternative specifications. There are two states: recession and expansion. All variables in the model are endogenous and enter with a single lag. The theoretical background is a model with a latent unit root (AR(1)).
State1 is recession and state2 is expansion.

While I can estimate the models easily, I cannot produce a publication-level table.
I need to collect several statistics from each model and gather them all in one table.

I have four models to present.
For the first model named Alternative 1,
I run

Code:

mswitch ar gdp_g var1, switch(var1) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
estat duration // to calculate the average length of recession periods and expansion periods

Var1 is labeled as alternative 1.

For model 2, named Alternative 2,

Code:

mswitch ar gdp_g var2, switch(var2) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
estat duration // to calculate the average length of recession periods and expansion periods

Var2 is labeled as alternative 2.

For model 3 named Alternative 3,

Code:

mswitch ar gdp_g var3, switch(var3) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
estat duration // to calculate the average length of recession periods and expansion periods

Var3 is labeled as alternative 3.

For model 4, named Alternative 4,

Code:

mswitch ar gdp_g var4, switch(var4) arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
estat duration // to calculate the average length of recession periods and expansion periods

Var4 is labeled as alternative 4.

So, I need to gather the following statistics into a publication-level table:

0. Recession Expected Duration
1. Expansion Expected Duration
2. AR(1) Coefficient
3. AR(1) Standard Error
4. Log-Likelihood
5. AIC
6. BIC
7. Recession Mean
8. Recession Std Dev
9. Expansion Mean
10. Expansion Std Dev
11. Transition Matrix (Recession)
12. Transition Matrix (Expansion)
13. RMSE
14. R^2
15. Directional Accuracy
16. Recession Coefficient
17. Recession Standard Error
18. Recession T-value
19. Recession P-value
20. Expansion Coefficient
21. Expansion Standard Error
22. Expansion T-value
23. Expansion P-value
24. Cov(p[0->0], p[0->0])
25. Cov(p[0->1], p[0->1])
26. Cov(p[1->1], p[1->1])
27. p[2->0]
28. p[2->1]
29. const[0]
30. const[2]
31. x1[0]
32. x1[1]
33. sigma2
34. p[0->0]
35. p[0->1]
36. p[1->1]

A data sample is below.
In the layout, the variables (models) should be in the columns and the statistics in the rows.

Thanks for any help!

Mario

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(year gdp_g var1 var2 var3 var4)
1990 .25382978  .05228972  -.3444368 .27984852 .7201515
1991 -.2125532    .935702   .3012117  .3284199 .6465201
1992   .317234  4.1132245  3.1708634  .3083437  .691594
1993 1.3534043   5.108476  2.5406225 .24354993 .7332776
1994  2.612128   3.425388   1.768389 .23602992 .7400631
1995  3.358936   3.972118  2.3609061 .27747253 .7225274
1996 2.3597872   6.033151   4.661421   .325723  .674277
1997 3.0414894   3.793532   3.347975 .30558395 .6757603
1998  1.793617  3.7508705   3.940954 .29313186 .6818681
1999 1.8717022   2.980336   3.343685 .28042582 .7138736
2000  3.385532   2.734389   1.775652  .2688356 .7311644
2001 2.0512767   .7556655    1.75466  .2554258 .7445742
2002  2.913404   1.386648  2.0334747  .2105769  .789423
2003  3.017021  1.5340753  1.0089021  .1955357 .8044643
2004  3.321915    1.81795   .5531137  .2408904 .7591096
2005 2.6668086 -.29726213  -.7515011  .2480769 .7519231
2006  2.732766  -2.838431 -2.9404194  .2673077 .7326923
2007 2.8740425 -1.5961655 -1.8205668 .26824456 .7317554
2008 .04914893  -1.881141  -3.931049  .2303533 .7696467
2009 -.9074468  -.8101007  -1.919522 .23055235 .7694477
2010  2.887234 -2.2245317 -1.8109186 .29588655 .7041135
2011 1.4131914  -1.431529 -1.4509507  .3009355 .6958717
2012  .9474468 -2.6042926 -3.2060516 .25005552 .7229174
2013 1.3274468  -.8640846 -2.6833265 .20782596 .7834868
2014  .8512766  -.6784925  -2.421195 .22527473 .7747253
end

cc @Justin Niakamal @Scott Merryman @Jeff Pitblado

Last edited by Mario Ferri; 07 Dec 2023, 08:02.

Jeff Pitblado (StataCorp)

StataCorp Employee

Join Date: Mar 2014
Posts: 704

07 Dec 2023, 13:23

Not all the statistics you reference are produced by mswitch (that I
can tell), and I do not understand your notation for others.

For BIC, mswitch provides SBIC.

You reference two transition matrices, and provide some notation for
what look like transition probabilities (i.e. p[0->0]), but your
notation does not line up with what is provided in the documentation. I
see no mention of covariances between transition probabilities in the
documentation or list of stored results.

Neither RMSE nor R^2 are listed among the stored results.

What do you mean by "Directional Accuracy"?

What are const[0] and const[2]? I assume they are the
intercepts of the state equations.

What are x1[0] and x1[1]?

I'm going to assume p[0->0], p[0->1], and p[1->1],
are 3 of the four transition probabilities. The manual uses indices
starting with 1 instead of 0.

In the following I fit 3 of the 4 models, since the last one did not
fit. Also note that I removed the variable from the switch()
option, since it is not allowed to be in both places and the models would
not fit when it was placed in the option.

Here is what I came up with given the data and models you provide.

Code:

clear all
input float(year gdp_g var1 var2 var3 var4)
1990 .25382978  .05228972  -.3444368 .27984852 .7201515
1991 -.2125532    .935702   .3012117  .3284199 .6465201
1992   .317234  4.1132245  3.1708634  .3083437  .691594
1993 1.3534043   5.108476  2.5406225 .24354993 .7332776
1994  2.612128   3.425388   1.768389 .23602992 .7400631
1995  3.358936   3.972118  2.3609061 .27747253 .7225274
1996 2.3597872   6.033151   4.661421   .325723  .674277
1997 3.0414894   3.793532   3.347975 .30558395 .6757603
1998  1.793617  3.7508705   3.940954 .29313186 .6818681
1999 1.8717022   2.980336   3.343685 .28042582 .7138736
2000  3.385532   2.734389   1.775652  .2688356 .7311644
2001 2.0512767   .7556655    1.75466  .2554258 .7445742
2002  2.913404   1.386648  2.0334747  .2105769  .789423
2003  3.017021  1.5340753  1.0089021  .1955357 .8044643
2004  3.321915    1.81795   .5531137  .2408904 .7591096
2005 2.6668086 -.29726213  -.7515011  .2480769 .7519231
2006  2.732766  -2.838431 -2.9404194  .2673077 .7326923
2007 2.8740425 -1.5961655 -1.8205668 .26824456 .7317554
2008 .04914893  -1.881141  -3.931049  .2303533 .7696467
2009 -.9074468  -.8101007  -1.919522 .23055235 .7694477
2010  2.887234 -2.2245317 -1.8109186 .29588655 .7041135
2011 1.4131914  -1.431529 -1.4509507  .3009355 .6958717
2012  .9474468 -2.6042926 -3.2060516 .25005552 .7229174
2013 1.3274468  -.8640846 -2.6833265 .20782596 .7834868
2014  .8512766  -.6784925  -2.421195 .22527473 .7747253
end

tsset year
label var var1 "Alternative 1"
label var var2 "Alternative 2"
label var var3 "Alternative 3"
label var var4 "Alternative 4"

collect clear

* collect prefix, use dimension -var[var1]- as an extra tag for these
* estimation results since it automatically grabs the variable's label
collect, tags(var[var1]): ///
    mswitch ar gdp_g var1, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
* list the estimated parameters
matrix list e(b)
* some transformations are made, so list r(table) to see how they are
* labelled in the column names (-colname- dimension in -collect-)
matrix list r(table)
* get and tag the expected durations
estat duration
collect get ///
    state1_duration=(r(d1)) ///
    state2_duration=(r(d2)) ///
    , tags(var[var1])

collect, tags(var[var2]): ///
    mswitch ar gdp_g var2, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
estat duration
collect get ///
    state1_duration=(r(d1)) ///
    state2_duration=(r(d2)) ///
    , tags(var[var2])

collect, tags(var[var3]): ///
    mswitch ar gdp_g var3, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
estat duration
collect get ///
    state1_duration=(r(d1)) ///
    state2_duration=(r(d2)) ///
    , tags(var[var3])

if (0) {
// failed to fit
collect, tags(var[var4]): ///
    mswitch ar gdp_g var4, arswitch ar(1) varswitch difficult technique(dfp) iterate(5000)
estat duration
collect get ///
    state1_duration=(r(d1)) ///
    state2_duration=(r(d2)) ///
    , tags(var[var4])
}

* add some custom labels
collect label levels result ///
    state1_duration "Recession Expected Duration" ///
    state2_duration "Expansion Expected Duration" ///
    , modify

collect label levels coleq ///
    State1 "Recession" ///
    State2 "Expansion" ///
    , modify

* build the layout, roughly in the order requested
collect layout ///
    (    result[ ///
            state1_duration ///
            state2_duration ///
        ] ///
        coleq[State1 State2]#colname#result[_r_b _r_se] ///
        result[ll] ///
        result[aic] ///
        result[sbic] ///
        coleq[gdp_g]#result[_r_b _r_se _r_z _r_p] ///
        coleq[_diparm1]#colname[sigma2 p11 p21]#result[_r_b] ///
    ) ///
    (var)

Here is the resulting table.

Code:

-----------------------------------------------------------------------
                            | Alternative 1 Alternative 2 Alternative 3
----------------------------+------------------------------------------
Recession Expected Duration |      5.174398      3.262773      4.596756
Expansion Expected Duration |      3.084882      1.154258       3.52197
Recession                   |
  L.ar                      |
    Coefficient             |       1.34284      .8736163       1.36247
    Std. error              |      .2498725      .2210159       .306594
  Intercept                 |
    Coefficient             |      .3974298      1.645688      1.319421
    Std. error              |      .3516664      .3414381      .5831299
Expansion                   |
  L.ar                      |
    Coefficient             |     -.2376953      .0004623      -.240308
    Std. error              |      .0481669      .0785129      .0873749
  Intercept                 |
    Coefficient             |      2.830824      3.051188      3.559341
    Std. error              |      .0528537       .109267      .4735527
Log likelihood              |     -27.16072     -26.51169     -26.96121
AIC                         |      3.013393      2.959307      2.996768
SBIC                        |      3.455164      3.401077      3.438538
gdp_g                       |
  Coefficient               |      .0380146       .080339     -2.752225
  Std. error                |       .021573      .0401036       1.81478
  z                         |          1.76          2.00         -1.52
  p-value                   |         0.078         0.045         0.129
_diparm1                    |
  sigma2                    |
    Coefficient             |      .1805616      .1642792      .2439628
  p11                       |
    Coefficient             |      .8067408      .6935123      .7824553
  p21                       |
    Coefficient             |      .3241615      .8663578       .283932
-----------------------------------------------------------------------

See the documentation and examples in [TABLES] for additional
table edits, such as hiding "Coefficient" via

Code:

. collect style header result[_r_b], level(hide)

. collect preview

-----------------------------------------------------------------------
                            | Alternative 1 Alternative 2 Alternative 3
----------------------------+------------------------------------------
Recession Expected Duration |      5.174398      3.262773      4.596756
Expansion Expected Duration |      3.084882      1.154258       3.52197
Recession                   |
  L.ar                      |       1.34284      .8736163       1.36247
    Std. error              |      .2498725      .2210159       .306594
  Intercept                 |      .3974298      1.645688      1.319421
    Std. error              |      .3516664      .3414381      .5831299
Expansion                   |
  L.ar                      |     -.2376953      .0004623      -.240308
    Std. error              |      .0481669      .0785129      .0873749
  Intercept                 |      2.830824      3.051188      3.559341
    Std. error              |      .0528537       .109267      .4735527
Log likelihood              |     -27.16072     -26.51169     -26.96121
AIC                         |      3.013393      2.959307      2.996768
SBIC                        |      3.455164      3.401077      3.438538
gdp_g                       |      .0380146       .080339     -2.752225
  Std. error                |       .021573      .0401036       1.81478
  z                         |          1.76          2.00         -1.52
  p-value                   |         0.078         0.045         0.129
_diparm1                    |
  sigma2                    |      .1805616      .1642792      .2439628
  p11                       |      .8067408      .6935123      .7824553
  p21                       |      .3241615      .8663578       .283932
-----------------------------------------------------------------------
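For a publication-ready version, a minimal sketch of two further steps, assuming the collection built above is still in memory and Stata 17 or later is in use: a common numeric format for all cells, and an export to a Word document. The format %9.3f and the file name mswitch_table.docx are illustrative choices, not anything prescribed by the models above.

```stata
* Sketch only: assumes the collection built above is the current collection.
* show every numeric cell with three decimals (format chosen for illustration)
collect style cell, nformat(%9.3f)

* hide the "Coefficient" header so estimates sit on the parameter's row
collect style header result[_r_b], level(hide)

* check the result on screen before writing it out
collect preview

* write the table to disk; the file name is hypothetical
collect export mswitch_table.docx, replace
```

The same collection can be exported to other formats by changing the file suffix (for example .tex or .xlsx); see help collect export.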


Mario Ferri

Join Date: Jul 2019

Posts: 190
#3

11 Dec 2023, 18:09

Thank you very much for your help, Jeff Pitblado (StataCorp)!

For the most part, the table is as I want it to be. As you pointed out, I did not define some of the statistics accurately.

In addition, I would need the following:

mswitch provides a postestimation predict command, so I assume R-squared and RMSE can be calculated from its predictions and then collected.

Code:

predict yhat, smethod(filter)
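A minimal sketch of how RMSE and an R-squared might be computed by hand from those predictions, since neither is among the stored results of mswitch. It assumes the example data are in memory and tsset year has been run. The R-squared here is simply 1 - SSE/SST on the in-sample predictions, which is an assumption about the definition wanted. The names yhat1, e1, e1sq, sse1, rmse1, and r2_1 are hypothetical.

```stata
* Sketch only: assumes the example data are in memory and -tsset year- was run.
quietly mswitch ar gdp_g var1, arswitch ar(1) varswitch ///
    difficult technique(dfp) iterate(5000)

* predicted values of gdp_g based on filtered state probabilities
predict double yhat1, yhat smethod(filter)

* residuals and squared residuals (missing where no prediction exists)
generate double e1   = gdp_g - yhat1
generate double e1sq = e1^2

* RMSE = square root of the mean squared residual
quietly summarize e1sq, meanonly
scalar sse1  = r(sum)
scalar rmse1 = sqrt(r(mean))

* R^2 = 1 - SSE/SST, with SST taken over the same observations
quietly summarize gdp_g if !missing(e1)
scalar r2_1 = 1 - sse1 / ((r(N) - 1) * r(Var))

display "RMSE = " rmse1 "   R^2 = " r2_1

* add both to the collection under the same tag as the model
collect get rmse=(rmse1) r2=(r2_1), tags(var[var1])
```

To show these rows in the table, result[rmse r2] would need to be added to the row specification of collect layout, and the block repeated for each model with its own tag.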

On the covariances, as far as I have seen, they can be obtained from the generic postestimation command

Code:

estat vce

which also applies in this context, and the transition probabilities from

Code:

estat transition

The covariance information is crucial for my framework because it captures potential correlations between these transition probabilities.

Also, I would need these:

Recession Mean: The mean value of the variable during recession periods.
Recession Std Dev: The standard deviation of the variable during recession periods.
Expansion Mean: The mean value of the variable during expansion periods.
Expansion Std Dev: The standard deviation of the variable during expansion periods.
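One possible way to get these four numbers, sketched under explicit assumptions: each year is assigned to the regime whose smoothed probability exceeds 0.5, State1 is taken to be recession, and "the variable" is taken to be gdp_g. The 0.5 cutoff and the names pr_s1, pr_s2, and recession are illustrative, not something mswitch defines.

```stata
* Sketch only: assumes the example data are in memory and -tsset year- was run.
quietly mswitch ar gdp_g var1, arswitch ar(1) varswitch ///
    difficult technique(dfp) iterate(5000)

* smoothed probabilities of being in State1 and State2
predict double pr_s1 pr_s2, pr smethod(smooth)

* classify each year; the 0.5 cutoff is an assumption
generate byte recession = pr_s1 > 0.5 if !missing(pr_s1)

* quick look at mean, sd, and count by regime
tabstat gdp_g, by(recession) statistics(mean sd n)

* store the regime-specific moments in the collection
quietly summarize gdp_g if recession == 1
collect get rec_mean=(r(mean)) rec_sd=(r(sd)), tags(var[var1])
quietly summarize gdp_g if recession == 0
collect get exp_mean=(r(mean)) exp_sd=(r(sd)), tags(var[var1])
```

With only 25 annual observations, one regime may contain very few years, so the standard deviations should be read with care.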

On directional accuracy

MDA measures how often the predicted direction of a time series matches the actual direction of the time series. To calculate MDA, you look at the signs of the differences between consecutive actual values and the signs of the differences between consecutive predicted values. If the signs are the same (i.e., both positive or both negative), that means the predicted direction matches the actual direction. You count how many times this happens, and divide by the total number of possible comparisons (which is one less than the length of the time series, because you can’t compare the first value to anything). This gives you the MDA value, which ranges from 0 to 1, with 1 indicating perfect directional accuracy.
MDA = (number of times the sign of the difference between consecutive actual values equals the sign of the difference between consecutive predicted values) / (N – 1)
where N is the length of the time series.
In mathematical notation, this can be expressed as:
MDA = sum(i=2 to N) 1[sign(actual[i] – actual[i-1]) == sign(predicted[i] – predicted[i-1])] / (N – 1)
where sign(x) returns the sign of x (i.e., -1 if x < 0, 0 if x == 0, and 1 if x > 0) and 1[.] equals 1 when the condition holds and 0 otherwise.

See

https://datasciencestunt.com/mean-di...ries-forecast/

This is a postestimation evaluation measure. I could not find a built-in Stata command for it.

I guess it needs to be calculated after fitting the model, probably by applying the formula above.
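A hand calculation along those lines might look like the sketch below, which counts how often the sign of the change in gdp_g matches the sign of the change in the prediction. It assumes the example data are in memory and tsset year has been run. It divides by the number of usable comparisons rather than N - 1, because the first predictions are missing in an AR(1) model. The names yhat_mda, hit, and mda1 are hypothetical.

```stata
* Sketch only: assumes the example data are in memory and -tsset year- was run.
quietly mswitch ar gdp_g var1, arswitch ar(1) varswitch ///
    difficult technique(dfp) iterate(5000)

* predicted values of gdp_g (the default statistic of -predict-)
predict double yhat_mda, yhat

* hit = 1 when actual and predicted changes share the same sign
generate byte hit = sign(D.gdp_g) == sign(D.yhat_mda) ///
    if !missing(D.gdp_g, D.yhat_mda)

* MDA is the share of hits among the usable comparisons
quietly summarize hit, meanonly
scalar mda1 = r(mean)
display "MDA = " mda1 "  (" r(N) " comparisons)"

* add it to the collection under the same tag as the model
collect get mda=(mda1), tags(var[var1])
```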

x1[0] and x1[1] can be ignored. For the rest, your interpretation is correct.

For the next step of the analysis, it would also be helpful to store in the dataset, for each model, time series of each regime's probabilities, each state's residuals, and each state's predicted values.
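A rough sketch of how such series could be saved for the three models that fit, assuming the example data are in memory and tsset year has been run. It stores smoothed regime probabilities plus the overall predicted values and residuals. State-specific predictions and residuals are not attempted here; the xb statistic described in help mswitch postestimation may be the place to look for those. The variable prefixes pS1_, pS2_, fit_, and res_ are hypothetical.

```stata
* Sketch only: assumes the example data are in memory and -tsset year- was run.
* var4 is left out because that model did not fit.
foreach v of varlist var1 var2 var3 {
    quietly mswitch ar gdp_g `v', arswitch ar(1) varswitch ///
        difficult technique(dfp) iterate(5000)

    * smoothed probabilities of State1 and State2
    predict double pS1_`v' pS2_`v', pr smethod(smooth)

    * overall predicted values and residuals of gdp_g
    predict double fit_`v', yhat
    predict double res_`v', residuals
}

* inspect what was created
describe pS1_* pS2_* fit_* res_*
```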

The new feature of tables creation in Stata is great, but I’m still struggling with the tool.

Many thanks again. I would be grateful for any further help you could provide.

Best regards,

Mario

PS: Apologies for being late to thank you. I have been out of the office over the last few days.

Last edited by Mario Ferri; 11 Dec 2023, 18:40.