  • Thoughts on using a serially correlated model

    Hi,

    I'm estimating a fertilizer demand model for Brazil, using a panel dataset of 27 states over 10 years.

    Here's what I've gathered:

    HTML Code:
    . xtreg l(0/1).ln_fert ln_sb_barter i.big_farms#c.ln_sb_barter ln_wc_barter ln_area ln_credit i.big_farms#c.ln_credit i.year if year>=2013 & year<=2022, fe robust
    note: 2021.year omitted because of collinearity
    note: 2022.year omitted because of collinearity
    
    Fixed-effects (within) regression               Number of obs     =        270
    Group variable: id                              Number of groups  =         27
    
    R-sq:                                           Obs per group:
         within  = 0.5992                                         min =         10
         between = 0.6859                                         avg =       10.0
         overall = 0.6849                                         max =         10
    
                                                    F(14,26)          =     118.64
    corr(u_i, Xb)  = -0.4599                        Prob > F          =     0.0000
    
                                                    (Std. Err. adjusted for 27 clusters in id)
    ------------------------------------------------------------------------------------------
                             |               Robust
                     ln_fert |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
                     ln_fert |
                         L1. |   .3984597   .0548065     7.27   0.000     .2858034    .5111161
                             |
                ln_sb_barter |  -.5074911   .0833269    -6.09   0.000    -.6787721   -.3362101
                             |
    big_farms#c.ln_sb_barter |
                          1  |   .1151301   .0627154     1.84   0.078    -.0137831    .2440434
                             |
                ln_wc_barter |  -.2719184   .0614063    -4.43   0.000    -.3981409   -.1456959
                     ln_area |   .2986246   .0778739     3.83   0.001     .1385526    .4586966
                   ln_credit |    .075654   .0220205     3.44   0.002     .0303903    .1209177
                             |
       big_farms#c.ln_credit |
                          1  |   .1360088   .0536807     2.53   0.018     .0256666    .2463511
                             |
                        year |
                       2014  |  -.1036127   .0257664    -4.02   0.000    -.1565763   -.0506491
                       2015  |   -.190189   .0383895    -4.95   0.000    -.2690997   -.1112783
                       2016  |  -.2760926   .0390131    -7.08   0.000    -.3562852      -.1959
                       2017  |  -.2534009   .0482792    -5.25   0.000    -.3526401   -.1541616
                       2018  |  -.1051378   .0281969    -3.73   0.001    -.1630972   -.0471783
                       2019  |  -.1708704   .0382155    -4.47   0.000    -.2494235   -.0923172
                       2020  |  -.2324278   .0281857    -8.25   0.000    -.2903643   -.1744913
                       2021  |          0  (omitted)
                       2022  |          0  (omitted)
                             |
                       _cons |   3.746749   1.025084     3.66   0.001     1.639659    5.853839
    -------------------------+----------------------------------------------------------------
                     sigma_u |  1.3831548
                     sigma_e |  .13176596
                         rho |  .99100624   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------------------
    Now, the issues/considerations:

    - I understand that for models including a lagged dependent variable, a dynamic panel (GMM) approach is usually recommended. However, GMM performs poorly in a small-N panel like mine: with N of only 27, the instrument count easily exceeds the number of groups, resulting in highly inefficient estimates. The bias from including a lagged dependent variable in an FE model tends to zero as T grows, but I'm uncertain whether my T = 10 is sufficiently large (it probably isn't).
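
    A rough back-of-the-envelope on that last point: the standard large-N approximation to the within-estimator bias on the autoregressive coefficient (Nickell, 1981) is

        plim (rho_hat - rho) ~ -(1 + rho) / (T - 1)

    Plugging in T = 10 and, as a stand-in for rho, the reported L1. coefficient of about 0.40 gives a bias of roughly -(1.40)/9, i.e. about -0.16, which is not negligible relative to the estimate itself.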

    - I ran Wooldridge's test for serial correlation (on a simpler specification, since the -xtserial- command has trouble with lag operators and factor variables) and confirmed that the errors in this model are serially correlated. In a previous thread on a similar issue, I saw Prof. Wooldridge mention that an FE model with Driscoll-Kraay standard errors might resolve this, as they account for serial correlation. However, as I intend to use this model for one-year-ahead forecasts, -xtscc- poses a few challenges: i) the command is incompatible with various post-estimation commands, such as 'predict ln_fert_est, xbu'; ii) it appears to ignore any 'if' qualifiers I include; iii) it produces collinearity problems in specifications that run cleanly under -xtreg-; and iv) it is incompatible with interactions like '##'. These complications make the updating/forecasting process far more manual and cumbersome than it should be.
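
    One possible workaround for both -xtserial- and -xtscc- choking on the operators is to build the lag and interaction terms explicitly beforehand, so no lag or factor operators appear in the command. A sketch (the generated variable names are illustrative, and factor-variable support may still vary by -xtscc- version):

    HTML Code:
    . * build lag and interaction terms by hand so xtserial/xtscc see plain variables
    . xtset id year
    . gen L_ln_fert  = L.ln_fert
    . gen big_sb     = big_farms * ln_sb_barter
    . gen big_credit = big_farms * ln_credit
    . xtscc ln_fert L_ln_fert ln_sb_barter big_sb ln_wc_barter ln_area ln_credit big_credit i.year if inrange(year, 2013, 2022), fe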

    - The estimated coefficients from the FE model align with empirical observations and are easy to interpret. In-sample demand estimates are fairly accurate as well. Forecasts are less precise, but I suspect one or two missing variables are being absorbed by the year dummies.

    Any recommendations? Should I stick to the FE model, or would you suggest an alternative approach?

  • #2
    Cluster on state to deal with the serial correlation, which 'robust' already does in -xtreg- (there it is equivalent to vce(cluster id)). Some might complain that 27 clusters is too few, but it is probably adequate. You could run -boottest- if it bothers you.
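
    A sketch of that suggestion, assuming the user-written -boottest- package (Roodman et al.) is installed; the coefficient tested here is just an example:

    HTML Code:
    . * refit with clustered SEs, then wild-cluster bootstrap a coefficient test
    . quietly xtreg l(0/1).ln_fert ln_sb_barter i.big_farms#c.ln_sb_barter ln_wc_barter ln_area ln_credit i.big_farms#c.ln_credit i.year if year>=2013 & year<=2022, fe vce(cluster id)
    . boottest ln_sb_barter, reps(9999) cluster(id)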
