  • XTDPDQML: new Stata command for quasi-maximum likelihood estimation of linear dynamic panel models

    Dear Statalisters,

    I have developed a new Stata estimation command for quasi-maximum likelihood estimation of linear dynamic panel data models with a short time horizon, in particular the random-effects ML estimator by Bhargava and Sargan (1983) and the fixed-effects transformed ML estimator by Hsiao, Pesaran, and Tahmiscioglu (2002).

    The program can be installed by typing
    net from ""
    in Stata’s command window.

    A help file that documents the command syntax and the available options can be accessed by typing
    help xtdpdqml

    Postestimation commands can be used as well; see
    help xtdpdqml postestimation

    A preliminary background note is available at

    Comments and suggestions are welcome.

    - Bhargava, A. and J. D. Sargan (1983). Estimating Dynamic Random Effects Models from Panel Data Covering Short Time Periods. Econometrica 51 (6), 1635-1659.
    - Hsiao, C., M. H. Pesaran, and A. K. Tahmiscioglu (2002). Maximum likelihood estimation of fixed effects dynamic panel data models covering short time periods. Journal of Econometrics 109 (1), 107-150.

  • #2
    I just released an update of the xtdpdqml package with significant improvements, in particular regarding the random-effects estimation.

    The new version now has analytical gradients and Hessian matrices implemented for all model versions, which yields large improvements in computational speed. Their implementation also helped to identify a mistake in the specification of the random-effects log-likelihood function, which has now been corrected. (Estimates will differ from those of previous versions.)

    For the random-effects model, the option noeffects has been added. If specified, the variance of the unobserved unit-specific effects is constrained to zero, which yields the same coefficient estimates as regress.

    The Stata help file now also contains some guidelines in the Remarks sections on how to deal with initial values that are not feasible for the maximization.

    For a fresh installation of the package type:
    net install xtdpdqml, from(
    To update an existing installation type:
    adoupdate xtdpdqml, update
    An updated background note is available on my website:
    Last edited by Sebastian Kripfganz; 07 Jun 2015, 08:22.


    • #3
      The latest update of the xtdpdqml command now also computes robust standard errors with the vce(robust) option for all model specifications. The robust VCE is computed along the usual ML lines with the sandwich formula.

      For the fixed-effects model, Hayakawa and Pesaran (2015) have shown that the transformed likelihood estimator is still consistent in the case of cross-sectional heteroskedasticity, and the robust VCE is a consistent estimator of the asymptotic variance-covariance matrix in this case:
      • Hayakawa, K., and M. H. Pesaran (2015). Robust standard errors in transformed likelihood estimation of dynamic panel data models with cross-sectional heteroskedasticity. Journal of Econometrics 188 (1), 111-134.


      • #4
        A new update is available with some bug fixes, in particular regarding the sample determination in the case of unbalanced panels with interior gaps. The available postestimation features have also been improved. In particular, it is now possible to use the suest command for a generalized Hausman specification test after having obtained fixed-effects and random-effects estimates with the xtdpdqml command. Here is an example:
        webuse abdata
        xtdpdqml n w k yr1978-yr1984, mlparams
        estimates store fe
        quietly xtdpdqml n w k yr1978-yr1984, re mlparams
        estimates store re
        suest fe re, vce(cluster id)
        test ([fe__model]LD.n = [re__model]L.n) ([fe__model]D.w = [re__model]w) ([fe__model]D.k = [re__model]k)
        The latest version of the xtdpdqml command now requires at least Stata version 12.1.


        • #5
          Can I apply the command to a model in first differences? I have some non-stationary variables, and to obtain stationarity I applied first differences (FD). But xtdpdqml already does the FD (you said on your site that the fixed-effects estimator is in FD).

          xtdpdqml D.y1 D.x5 D.x6 D.x9 D.x10 D.x13 D.x15 D.x16 D.x17 D.x20 D.x23 D.x24 D.x26 D.x28 D.x33, re
          xtdpdqml y1 x5 x6 x9 x10 x13 x15 x16 x17 x20 x23 x24 x26 x28 x33, re

          Thank you!


          • #6
            I have to give the typical answer of an economist: It depends.

            In your first equation you are specifying a random-effects model in first differences. It is thus implicitly assumed that there was a linear random trend in the levels model since the trend becomes an intercept after differencing. In contrast, your second equation assumes a model with a random intercept (instead of a random trend) in levels.
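
            As a sketch of the first point, in notation assumed here purely for illustration (\(\lambda\), \(\beta\), \(a_i\), and \(u_i\) are not taken from the help file), suppose the levels model contains a unit-specific intercept \(a_i\) and a unit-specific linear trend \(u_i t\):
            \[y_{it} = \lambda y_{i,t-1} + x_{it}'\beta + a_i + u_i t + e_{it}\]
            First-differencing removes \(a_i\) and turns the trend into an intercept:
            \[\Delta y_{it} = \lambda \Delta y_{i,t-1} + \Delta x_{it}'\beta + u_i + \Delta e_{it}\]
            The random effect in the differenced equation is therefore the trend slope \(u_i\) of the levels model.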

            Secondly, for the estimation it is assumed that the error term of the specified equation does not exhibit serial correlation. Thus, with your first equation you are assuming that there is no serial correlation in the first-differenced errors, while in your second equation you are assuming absence of serial correlation in the levels errors.

            With a dynamic model, non-stationarity of your observed variables should not be much of a concern. The relevant questions that you need to answer for your model choice are the two preceding ones about the unobserved model components. There is no general answer which of the two specifications should be preferred in your case.


            • #7
              There are not two models, only one. Because some of my independent variables were non-stationary, I chose to use first differences. Afterwards I ran the two commands in Stata to compare the results: the first uses the first-differenced variables and the second uses the variables in levels. This is why I asked whether your command already applies the first difference, because if it does, I would be estimating a difference of a first-differenced model, which I do not want to do. Some authors state that if you have non-stationary variables in your model, the best way is to continue with an FD methodology.

              Also, the help file lists the sta option: assume stationarity of all variables.

              Thank you for your answer and help.


              • #8
                Sorry, I misinterpreted your first enquiry. The command automatically does a first-difference transformation in the fixed-effects case but not for the random-effects model.

                You do not necessarily need stationary variables. Conversely, if all variables are stationary, you could obtain more efficient estimates with the stationary option.


                • #9
                  Thanks to Kit Baum, the xtdpdqml command is now also available from SSC (in addition to my own website, see above):
                  ssc install xtdpdqml
                  The latest version 1.3.1 comes with some minor bug fixes, additional display options, and improved help files.

                  I will present the package soon at the UK Stata Users Group Meeting in London.
                  • Kripfganz, S. (forthcoming). xtdpdqml: Quasi-maximum likelihood estimation of linear dynamic short-T panel data models. Stata Journal (accepted manuscript).


                  • #10
                    Dear Sebastian,

                    I have used xtdpdqml to estimate a linear dynamic panel data model for my MRes project. I set up a model for banks as follows:
                    y_it = y_i,t-1*lambda + x_it*Beta + z_it*Delta + u_i + e_it, where z_it is a vector of time-invariant variables, say, unemployment rate,...
                    I suspect that e_it may follow an AR(1) process because in my banking application, individuals' performance is highly persistent.

                    In your paper, e_it is assumed not to follow an AR(1) process. So, will your estimator deliver biased estimates if the errors are AR(1)? And, because xtdpdqml does not provide autocorrelation tests as xtabond2 does, how can we test for serial correlation in the disturbances?

                    Hope to receive your reply! Thanks in advance.


                    • #11
                      I encountered the following error message:
                      Quasi-maximum likelihood estimation
                      initial values not feasible
                      when running xtdpdqml with the re option but not with fe. It appeared in Stata 13.1. Interestingly, when I ran it on another computer, the error did not occur. I don't know why. So please check your code, Mr Sebastian! Thanks in advance.
                      Last edited by Binh Pham; 02 Sep 2016, 05:57.


                      • #12
                        Originally posted by Binh Pham:
                        In your paper, e_it is assumed not to follow an AR(1) process. So, will your estimator deliver biased estimates if the errors are AR(1)? And, because xtdpdqml does not provide autocorrelation tests as xtabond2 does, how can we test for serial correlation in the disturbances?
                        Indeed, if there is remaining serial correlation in the error term, the estimates obtained with xtdpdqml will no longer be consistent. Tests for serial correlation might be added as a postestimation feature at some time in the future. For now, you could easily implement a test following the procedure outlined by Jeff Wooldridge in his textbook "Econometric Analysis of Cross Section and Panel Data", chapter 10.6.3. This tests whether the first-differenced residuals exhibit a serial correlation of -0.5, which would correspond to a serially uncorrelated idiosyncratic error component in the original levels.
                        webuse abdata
                        xtdpdqml n w k yr1978-yr1984, vce(robust)
                        predict e, e
                        regress D.e LD.e, vce(robust)
                        test LD.e = -0.5
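                        The value of -0.5 follows from a short calculation, sketched here under the assumption that \(e_{it}\) is serially uncorrelated with constant variance \(\sigma_e^2\):
                        \[\mathrm{Cov}(\Delta e_{it}, \Delta e_{i,t-1}) = \mathrm{Cov}(e_{it} - e_{i,t-1}, \, e_{i,t-1} - e_{i,t-2}) = -\sigma_e^2\]
                        \[\mathrm{Var}(\Delta e_{it}) = 2 \sigma_e^2 \quad \Rightarrow \quad \mathrm{Corr}(\Delta e_{it}, \Delta e_{i,t-1}) = -\frac{\sigma_e^2}{2 \sigma_e^2} = -0.5\]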
                        Alternatively, since predict with the option e gives you a prediction of the idiosyncratic error component excluding the fixed effects, you could directly test in a similar way for the absence of serial correlation in the levels:
                        regress e L.e, vce(robust)
                        test L.e

                        Concerning the infeasible initial values with the re option: This is not unusual for the random-effects model, and it requires specifying alternative initial values for the variance parameters with the initval() option, e.g.
                        xtdpdqml n w k yr1978-yr1984, re vce(robust) initval(.1 .2 .2 .3)
                        This option specifies initial values for the variance parameters \(\sigma_u^2\), \(\sigma_e^2\), \(\sigma_0^2\), and \(\phi\) in this particular order. To be feasible, the following condition needs to be satisfied:
                        \[(\sigma_u^2 - \phi^2 \sigma_0^2) \max (T_i) > - \sigma_e^2\]
                        Please see the paper and the online appendix available on my webpage for details.
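
                        As a rough sketch, the condition can be checked by hand before passing candidate values to initval(). The scalar names below are hypothetical, the values are those from the example above, and the value 7 for \(\max(T_i)\) is only a placeholder that has to be replaced by the maximum number of time periods in the estimation sample:

                        ```stata
                        * Candidate initial values, in the order expected by initval()
                        scalar sigma2_u = 0.1
                        scalar sigma2_e = 0.2
                        scalar sigma2_0 = 0.2
                        scalar phi      = 0.3
                        * Assumed maximum number of time periods per unit (placeholder value)
                        scalar Tmax     = 7
                        * Displays 1 if the feasibility condition holds and 0 otherwise
                        display (sigma2_u - phi^2 * sigma2_0) * Tmax > -sigma2_e
                        ```

                        With these numbers the left-hand side is (0.1 - 0.09 * 0.2) * 7 = 0.574, which exceeds -0.2, so the condition is satisfied.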

                        Regarding your different observations on two different computers, my guess is that you have different versions of xtdpdqml installed on those computers. There was a small change to the computation of the default initial values for the variance parameters in the latest update, version 1.3.1, which can cause initial values that were feasible before to no longer be feasible (or vice versa). Please make sure that you are running the latest version of xtdpdqml on both computers by checking its version and by updating it if necessary:
                        which xtdpdqml
                        adoupdate xtdpdqml, update


                        • #13
                          Thank you very much, Sebastian. Your explanations are very clear, and it works now.
                          Instead, I set initval(.1 .1 .1 .2) and now get identical output on the two computers.


                          • #14
                            I should add that you should use the regress command with the vce(cluster id) option instead of vce(robust) in the testing procedure that I outlined in my previous post.

                            Alternatively, you might want to have a look at three new commands written by Jesse Wursten for serial correlation testing:
                            - xthrtest - now on SSC: Heteroskedasticity Robust LM(k) test for serial correlation in a fixed effect panel setting
                            - xtistest - now on SSC: Portmanteau IS test for serial correlation in a fixed effect panel setting
                            - xtqptest - now on SSC: Bias-corrected Q(P) test for serial correlation in a fixed effect panel setting

                            These commands should also work after xtdpdqml (without the mlparams option).


                            • #15
                              In my previous post, I carelessly linked to other commands that perform serial correlation testing for panel data. The results of these tests should not be relied on in the context of xtdpdqml. The reason is that these tests rely on a strict exogeneity assumption, which is naturally not satisfied in dynamic models with a lagged dependent variable.

                              Thanks to Kit Baum, an update is now available on SSC (and on my personal website) to version 1.4.3 that fixes some minor bugs and introduces the postestimation command estat serial for use after xtdpdqml. This computes test statistics for the absence of serial correlation in the first-differenced residuals. The test is valid for dynamic models as estimated by xtdpdqml. In the absence of serial correlation in the idiosyncratic errors in levels, it is expected that there is first-order serial correlation in the first-differenced errors, while there should be no serial correlation of order 2 or higher.
                              adoupdate xtdpdqml, update
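
                              The expected pattern in the first-differenced errors can be illustrated with a small simulation. This is only a sketch with simulated white-noise errors for a single time series; the variable names (t, e, de), the seed, and the sample size are arbitrary assumptions, and the code does not call estat serial or any other part of xtdpdqml:

                              ```stata
                              clear
                              set seed 12345
                              set obs 100000
                              generate t = _n
                              tsset t
                              * Serially uncorrelated errors in levels
                              generate e = rnormal()
                              * First-differenced errors
                              generate de = D.e
                              * Correlations of de with its first and second lag
                              correlate de L.de L2.de
                              ```

                              With serially uncorrelated errors in levels, the correlation between de and its first lag should be close to -0.5, and the correlation with its second lag should be close to zero.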
                              Last edited by Sebastian Kripfganz; 04 Mar 2017, 09:54.