  • Thoughts on using a serially correlated model

    Hi,

    I'm estimating a fertilizer demand model for Brazil, using a panel dataset of 27 states over 10 years.

    Here's what I've gathered:

    HTML Code:
    . xtreg l(0/1).ln_fert ln_sb_barter i.big_farms#c.ln_sb_barter ln_wc_barter ln_area ln_credit i.big_farms#c.ln_credit i.year if year>=2013 & year<=2022, fe robust
    note: 2021.year omitted because of collinearity
    note: 2022.year omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs     =        270
    Group variable: id                              Number of groups  =         27
    
    R-sq:                                           Obs per group:
         within  = 0.5992                                         min =         10
         between = 0.6859                                         avg =       10.0
         overall = 0.6849                                         max =         10
    
                                                    F(14,26)          =     118.64
    corr(u_i, Xb)  = -0.4599                        Prob > F          =     0.0000
    
                                                    (Std. Err. adjusted for 27 clusters in id)
    ------------------------------------------------------------------------------------------
                             |               Robust
                     ln_fert |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
                     ln_fert |
                         L1. |   .3984597   .0548065     7.27   0.000     .2858034    .5111161
                             |
                ln_sb_barter |  -.5074911   .0833269    -6.09   0.000    -.6787721   -.3362101
                             |
    big_farms#c.ln_sb_barter |
                          1  |   .1151301   .0627154     1.84   0.078    -.0137831    .2440434
                             |
                ln_wc_barter |  -.2719184   .0614063    -4.43   0.000    -.3981409   -.1456959
                     ln_area |   .2986246   .0778739     3.83   0.001     .1385526    .4586966
                   ln_credit |    .075654   .0220205     3.44   0.002     .0303903    .1209177
                             |
       big_farms#c.ln_credit |
                          1  |   .1360088   .0536807     2.53   0.018     .0256666    .2463511
                             |
                        year |
                       2014  |  -.1036127   .0257664    -4.02   0.000    -.1565763   -.0506491
                       2015  |   -.190189   .0383895    -4.95   0.000    -.2690997   -.1112783
                       2016  |  -.2760926   .0390131    -7.08   0.000    -.3562852      -.1959
                       2017  |  -.2534009   .0482792    -5.25   0.000    -.3526401   -.1541616
                       2018  |  -.1051378   .0281969    -3.73   0.001    -.1630972   -.0471783
                       2019  |  -.1708704   .0382155    -4.47   0.000    -.2494235   -.0923172
                       2020  |  -.2324278   .0281857    -8.25   0.000    -.2903643   -.1744913
                       2021  |          0  (omitted)
                       2022  |          0  (omitted)
                             |
                       _cons |   3.746749   1.025084     3.66   0.001     1.639659    5.853839
    -------------------------+----------------------------------------------------------------
                     sigma_u |  1.3831548
                     sigma_e |  .13176596
                         rho |  .99100624   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------------------
    Now, the issues/considerations:

    - I understand that for models including a lagged dependent variable, a dynamic panel (GMM) approach is usually recommended. However, GMM performs poorly in a small-N panel like mine: with N of only 27, the instrument count easily exceeds the number of groups, resulting in highly inefficient estimates. The bias from including a lagged dependent variable in an FE model tends to zero as T grows, but I'm uncertain whether my T = 10 is sufficiently large (it probably isn't).
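
    A rough back-of-the-envelope on that last point: the standard large-N approximation to the within-estimator bias on the autoregressive coefficient (Nickell, 1981) is

        plim (rho_hat - rho) ~ -(1 + rho) / (T - 1)

    Plugging in T = 10 and, as a stand-in for rho, the reported L1. coefficient of about 0.40 gives a bias of roughly -(1.40)/9, i.e. about -0.16, which is not negligible relative to the estimate itself.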

    - I ran Wooldridge's test for serial correlation (on a simpler specification, since the -xtserial- command has trouble with lag operators and factor variables) and confirmed that the errors in this model are serially correlated. In a previous thread on a similar issue, I saw Prof. Wooldridge mention that an FE model with Driscoll-Kraay standard errors might resolve this, as they account for serial correlation. However, as I intend to use this model for one-year-ahead forecasts, -xtscc- poses a few challenges: i) the command is incompatible with various post-estimation commands, such as 'predict ln_fert_est, xbu'; ii) it appears to ignore any 'if' qualifiers I include; iii) it produces collinearity problems in specifications that run cleanly under -xtreg-; and iv) it is incompatible with interactions like '##'. These complications make the updating/forecasting process far more manual and cumbersome than it should be.
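
    One possible workaround for both -xtserial- and -xtscc- choking on the operators is to build the lag and interaction terms explicitly beforehand, so no lag or factor operators appear in the command. A sketch (the generated variable names are illustrative, and factor-variable support may still vary by -xtscc- version):

    HTML Code:
    . * build lag and interaction terms by hand so xtserial/xtscc see plain variables
    . xtset id year
    . gen L_ln_fert  = L.ln_fert
    . gen big_sb     = big_farms * ln_sb_barter
    . gen big_credit = big_farms * ln_credit
    . xtscc ln_fert L_ln_fert ln_sb_barter big_sb ln_wc_barter ln_area ln_credit big_credit i.year if inrange(year, 2013, 2022), fe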

    - The estimated coefficients from the FE model align with empirical observations and are easy to interpret. In-sample demand estimates are fairly accurate as well. Forecasts are less precise, but I suspect one or two missing variables are being absorbed by the year dummies.

    Any recommendations? Should I stick to the FE model, or would you suggest an alternative approach?

  • #2
    Cluster on state to deal with the serial correlation, which 'robust' already does in -xtreg- (there it is equivalent to vce(cluster id)). Some might complain that 27 clusters is too few, but it is probably adequate. You could run -boottest- if it bothers you.
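
    A sketch of that suggestion, assuming the user-written -boottest- package (Roodman et al.) is installed; the coefficient tested here is just an example:

    HTML Code:
    . * refit with clustered SEs, then wild-cluster bootstrap a coefficient test
    . quietly xtreg l(0/1).ln_fert ln_sb_barter i.big_farms#c.ln_sb_barter ln_wc_barter ln_area ln_credit i.big_farms#c.ln_credit i.year if year>=2013 & year<=2022, fe vce(cluster id)
    . boottest ln_sb_barter, reps(9999) cluster(id)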
