  • XTIVDFREG: new Stata command for instrumental variable estimation of large panel data models with common factors

    Together with Vasilis Sarafidis, I have released a new Stata package called xtivdfreg. The command implements a general instrumental variables approach for estimating large panel data models (large N and large T) with unobserved common factors or interactive effects, as developed by Norkute et al. (2020). The underlying idea of this approach is to project out the common factors from exogenous covariates using principal components analysis, and run IV regression using defactored covariates as instruments. The resulting "IVDF" method is valid for models with homogeneous or heterogeneous slope coefficients, and has several advantages relative to existing popular approaches (e.g. common correlated effects estimation). The algorithm accommodates unbalanced panel data and permits highly flexible instrumentation strategies.

    You can install the command from my personal website:
    net install xtivdfreg, from(
    The syntax and options are explained in the Stata help file:
    help xtivdfreg
    The help file also contains a few examples.

    For full details, see our accompanying article: Further reference:

  • #2
    Due to an issue with how Stata has handled interaction terms since Stata 15, using interaction terms with xtivdfreg could result in unexpected error messages.

    A workaround is now implemented in the latest version 1.0.1.
    adoupdate xtivdfreg, update


    • #3
      With thanks to Kit Baum, the latest version 1.0.3 of the xtivdfreg command is now also available on SSC (in addition to my personal website):
      ssc install xtivdfreg
      Compared to earlier versions, this version has the new suboption fvar() for option iv(), which allows factors to be extracted from only a subset of the specified instrumental variables. This could be useful, for instance, if variables x and x2 are used as regressors/instruments but the squared term should not be used for the factor extraction. Please see the help file for details.
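To make the x and x2 case concrete, here is a minimal sketch on simulated data. The data-generating process, the variable names (id, t, lambda, f, y), and the parameter values are all hypothetical; only the iv() option and its fvar() suboption are taken from the description above. I have not listed any output, and the help file remains the reference for the exact syntax.

```stata
* Hypothetical simulated panel: 50 units, 30 periods, one common factor
clear
set seed 12345
set obs 50
generate id = _n
* Unit-specific factor loading (hypothetical)
generate lambda = rnormal()
expand 30
bysort id: generate t = _n

* Common factor: one draw per period, copied to all units
generate f = rnormal() if id == 1
bysort t (id): replace f = f[1]
sort id t
xtset id t

generate x  = lambda * f + rnormal()
generate x2 = x^2
generate y  = 0.5 * x + 0.1 * x2 + lambda * f + rnormal()

* Both x and x2 serve as instruments, but factors are extracted from x only
xtivdfreg y x x2, iv(x x2, fvar(x))
```

Without fvar(), the factors would be extracted from all variables listed in iv(), including the squared term.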

      Our accompanying article was accepted for publication in the Stata Journal:


      • #4
        First, thank you all for continuously making applied work accessible and easily applicable.
        I have a few questions about the xtivdfreg command (sorry for the lengthy post):
        1. The "mg" option does not report the J-test, and the manual does not say why. If the J-test did not hold in the first place, how can we know whether it holds after accounting for slope heterogeneity?
        2. xtdcce2 has the option "full", which reports the individual estimates for the panel units. Does xtivdfreg have this feature?
        3. Does xtivdfreg accommodate both stationary and nonstationary variables? For example, if one uses nonstationary variables in a bivariate estimation, can the coefficient be interpreted as a long-run parameter?
        4. In a footnote, you discuss that if no valid external instrument exists for an endogenous variable, one can include other informative exogenous variables on the right-hand side so that they and their lags can serve as valid instruments. If this is done, can the parameter of interest in the regression be interpreted as causal?
        5. (I might have missed this.) If a right-hand side variable (say X) is endogenous rather than exogenous, does it simply enter the regression as "xtivdfreg Y X, [options]", or is there a special way of specifying it?

        Thank you and I would really appreciate your response on these.



        • #5
          1. The J-test is not valid for the model with heterogeneous coefficients and therefore not reported. We briefly mention this in Section 2.2 of our article.
          2. There is no such option for xtivdfreg to show the individual coefficients from the MG estimation. I need to think about whether this would be a useful addition.
          3. Since the regressors are assumed to have a factor structure, and the factors are assumed to be stationary, I would conclude that the approach does not allow for nonstationary variables.
          4. There is nothing special about the interpretation of the coefficients. If the model is correctly specified and you have valid instruments, then you can interpret the coefficients as causal in the usual way.
          5. You still specify an endogenous regressor in the list of right-hand side variables but then would need to specify appropriate instruments with the iv() option.
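As an illustration of points 1 and 5, here is a minimal sketch on simulated data. Everything about the data is hypothetical: the variable names (w as an exogenous covariate, z as an external instrument, x as the endogenous regressor), the loadings, and the coefficient values are assumptions made for this example. Only the iv() and mg options come from the discussion above, and no output is shown because I have not verified a run of this exact code.

```stata
* Hypothetical simulated panel: 50 units, 30 periods, one common factor
clear
set seed 2021
set obs 50
generate id = _n
* Unit-specific factor loadings (hypothetical)
generate lam_w = rnormal()
generate lam_z = rnormal()
generate lam_y = rnormal()
expand 30
bysort id: generate t = _n

* Common factor: one draw per period, copied to all units
generate f = rnormal() if id == 1
bysort t (id): replace f = f[1]
sort id t
xtset id t

generate w = lam_w * f + rnormal()
generate z = lam_z * f + rnormal()
generate u = rnormal()
* x is endogenous because it depends on the error term u
generate x = 0.7 * z + 0.3 * w + 0.5 * u + rnormal()
generate y = 1.0 * x + 0.5 * w + lam_y * f + u

* x stays in the regressor list; only w and z are listed in iv()
xtivdfreg y x w, iv(w z)

* Same specification with the mg option (no J-test is reported)
xtivdfreg y x w, iv(w z) mg
```

The endogenous regressor x is deliberately left out of iv(), while the exogenous covariate w appears both as a regressor and as an instrument.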


          • #6
            Thank you so much for the prompt response. I really do appreciate it.