  • XTIVDFREG: new Stata command for instrumental variable estimation of large panel data models with common factors

    Together with Vasilis Sarafidis, I have released a new Stata package called xtivdfreg. The command implements a general instrumental variables approach for estimating large panel data models (large N and large T) with unobserved common factors or interactive effects, as developed by Norkute et al. (2020). The underlying idea of this approach is to project out the common factors from exogenous covariates using principal components analysis, and then to run an IV regression using the defactored covariates as instruments. The resulting "IVDF" method is valid for models with homogeneous or heterogeneous slope coefficients, and has several advantages relative to existing popular approaches (e.g. common correlated effects estimation). The algorithm accommodates unbalanced panel data and permits highly flexible instrumentation strategies.

    You can install the command from my personal website:
    Code:
    net install xtivdfreg, from(http://www.kripfganz.de/stata/)
    The syntax and options are explained in the Stata help file:
    Code:
    help xtivdfreg
    The help file also contains a few examples.
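
A minimal sketch of a basic call on simulated data follows. The data-generating process and the variable names (id, t, f, lambda_x, lambda_y, x, y) are illustrative assumptions, not taken from the help file; only the command name and the factmax() option come from this thread. No iv() option is specified, so the command's default instrument set applies; the help file documents the details.

```stata
* Simulate a panel with N = 50 units and T = 30 periods (assumed DGP)
clear all
set seed 12345
set obs 50
generate id = _n
* unit-specific factor loadings for the regressor and the error term
generate lambda_x = rnormal()
generate lambda_y = rnormal()
expand 30
bysort id: generate t = _n
* one common factor f_t, shared by all units within a period
generate f = rnormal() if id == 1
bysort t (id): replace f = f[1]
* regressor and outcome both load on the unobserved factor
generate x = lambda_x * f + rnormal()
generate y = 0.5 * x + lambda_y * f + rnormal()
xtset id t

* IVDF estimation; factmax(2) caps the number of factors (see help xtivdfreg)
xtivdfreg y x, factmax(2)
```

Because x and the composite error both load on f, ignoring the factor would leave x correlated with the error; this is the situation the defactoring step is meant to address.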

    For full details, see our accompanying article: Further reference:
    https://twitter.com/Kripfganz

  • #2
    Due to an issue with the way Stata has handled interaction terms since Stata 15 (see https://www.statalist.org/forums/for...94#post1576494), using interaction terms with xtivdfreg could result in unexpected error messages.

    A workaround is now implemented in the latest version 1.0.1.
    Code:
    adoupdate xtivdfreg, update


    • #3
      With thanks to Kit Baum, the latest version 1.0.3 of the xtivdfreg command is now also available on SSC (in addition to my personal website):
      Code:
      ssc install xtivdfreg
      Compared to earlier versions, this version adds the new suboption fvar() for option iv(), which allows factors to be extracted from only a subset of the specified instrumental variables. This could be useful, for instance, if variables x and x2 are used as regressors/instruments but the squared term should not be used for the factor extraction. Please see the help file for details.
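
As a hedged sketch of the x and x2 case described above: both variables serve as instruments, but fvar() restricts the factor extraction to x. The simulated data and all variable names are assumptions for illustration; the help file remains the authority on the exact suboption syntax.

```stata
* Simulated panel (assumed DGP): N = 50 units, T = 30 periods
clear all
set seed 2021
set obs 50
generate id = _n
generate lambda_x = rnormal()
generate lambda_y = rnormal()
expand 30
bysort id: generate t = _n
* one common factor f_t, shared by all units within a period
generate f = rnormal() if id == 1
bysort t (id): replace f = f[1]
generate x  = lambda_x * f + rnormal()
generate x2 = x^2
generate y  = 0.5 * x - 0.1 * x2 + lambda_y * f + rnormal()
xtset id t

* x and x2 are both instruments; factors are extracted from x only
xtivdfreg y x x2, iv(x x2, fvar(x))
```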

      Our accompanying article was accepted for publication in the Stata Journal:


      • #4
        First, thank you all for continuously making applied work accessible and easily applicable.
        I have a few questions about the xtivdfreg command (apologies for the lengthy post):
        1. I noticed that the "mg" option does not report the J-test, and the manual does not say why. If the J-test rejects under slope homogeneity, how can we know whether it would hold once slope heterogeneity is accounted for?
        2. I noticed that xtdcce2 has the option "full", which reports the individual estimates for the panel units. Does xtivdfreg have this feature?
        3. I want to make sure that xtivdfreg accommodates both stationary and nonstationary variables. For example, if one uses nonstationary variables in a bivariate estimation, can the coefficient be interpreted as a long-run parameter?
        4. In the footnote, you discuss that if no valid external instrument exists for an endogenous variable, one can include other informative exogenous variables on the right-hand side so that they and their lags can serve as valid instruments. If this is done, can the parameter of interest in the regression be interpreted as causal?
        5. I might have missed this, but if a right-hand-side variable (say X) is endogenous rather than exogenous, does it simply enter the regression as "xtivdfreg Y X, [options]", or is there a special way of specifying it?

        Thank you and I would really appreciate your response on these.

        -John


        • #5
          1. The J-test is not valid for the model with heterogeneous coefficients and therefore not reported. We briefly mention this in Section 2.2 of our article.
          2. There is no such option for xtivdfreg to show the individual coefficients from the MG estimation. I need to think about whether this would be a useful addition.
          3. Since the regressors are assumed to have a factor structure, and the factors are assumed to be stationary, I would conclude that the approach does not allow for nonstationary variables.
          4. There is nothing special about the interpretation of the coefficients. If the model is correctly specified and you have valid instruments, then you can interpret the coefficients as causal in the usual way.
          5. You still specify an endogenous regressor in the list of right-hand side variables but then would need to specify appropriate instruments with the iv() option.
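
To illustrate points 1 and 2, the sketch below fits the pooled and the mean-group versions on simulated data with heterogeneous slopes. The data-generating process and variable names are assumptions; the mg option and the lags() suboption are used as they appear in this thread, and one lag of x is added only so that the pooled model is overidentified.

```stata
* Simulated panel with unit-specific slopes beta_i (assumed DGP)
clear all
set seed 4321
set obs 50
generate id = _n
generate beta     = 0.5 + 0.2 * rnormal()
generate lambda_x = rnormal()
generate lambda_y = rnormal()
expand 30
bysort id: generate t = _n
* one common factor f_t, shared by all units within a period
generate f = rnormal() if id == 1
bysort t (id): replace f = f[1]
generate x = lambda_x * f + rnormal()
generate y = beta * x + lambda_y * f + rnormal()
xtset id t

* pooled estimator, which imposes homogeneous slopes
xtivdfreg y x, iv(x, lags(1))
* mean-group estimator; as explained in point 1, no J-test is reported
xtivdfreg y x, iv(x, lags(1)) mg
```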


          • #6
            Thank you so much for the prompt response. I really do appreciate it.


            • #7
              Hi Sebastian, I was wondering whether it is possible to retrieve the residuals (which are free from the unobserved factors) after an xtivdfreg regression. I want to test whether the residuals from my model estimated with xtivdfreg are cross-sectionally independent, in comparison with other estimators. Thanks for the help.


              • #8
                Sorry for all the questions, but I have an additional one. If X is endogenous and one has an external instrument (Z), can one specify the iv() option to include both the external variable Z and the lags of the endogenous variable X (i.e., xtivdfreg Y X, iv(X, Z, lags(2)) factmax(N))? Or should X be left out of iv(), so that only Z is included? I ask because my takeaway from reading the paper is that when some of the regressors are endogenous with respect to epsilon_it, extracting the principal components from those endogenous regressors can be invalid. I understand that the lags of the endogenous variable, by construction, are not endogenous with respect to epsilon_it; hence, they can be included in iv(). However, I wanted to clarify whether I am thinking about the treatment of the endogenous regressor case correctly. Thanks again!


                • #9
                  Originally posted by John Francois
                  I was wondering whether it is possible to retrieve the residuals (which are free from the unobserved factors) after an xtivdfreg regression. I want to test whether the residuals from my model estimated with xtivdfreg are cross-sectionally independent, in comparison with other estimators.
                  I am afraid this is not (currently) possible.

                  Originally posted by John Francois
                  If X is endogenous and one has an external instrument (Z), can one specify the iv() option to include both the external variable Z and the lags of the endogenous variable X (i.e., xtivdfreg Y X, iv(X, Z, lags(2)) factmax(N))? Or should X be left out of iv(), so that only Z is included? I ask because my takeaway from reading the paper is that when some of the regressors are endogenous with respect to epsilon_it, extracting the principal components from those endogenous regressors can be invalid. I understand that the lags of the endogenous variable, by construction, are not endogenous with respect to epsilon_it; hence, they can be included in iv(). However, I wanted to clarify whether I am thinking about the treatment of the endogenous regressor case correctly.
                  Factor extraction is only valid from strictly exogenous variables, i.e. they must be uncorrelated with any future, current, and past errors. The lag of an endogenous variable X does not satisfy this condition and therefore should not be included in iv(). You would need to find a valid external instrument Z.
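
Following this answer, a sketch of the endogenous-regressor case keeps X in the regressor list and places only the strictly exogenous Z (and, optionally, its lags) in iv(). The simulated data and variable names are assumptions, and the lags() suboption is used as it appears in the question; note that variable lists in Stata are separated by spaces rather than commas.

```stata
* Simulated panel (assumed DGP): x is endogenous, z is strictly exogenous
clear all
set seed 987
set obs 50
generate id = _n
generate lambda_z = rnormal()
generate lambda_y = rnormal()
expand 30
bysort id: generate t = _n
* one common factor f_t, shared by all units within a period
generate f = rnormal() if id == 1
bysort t (id): replace f = f[1]
xtset id t
generate z = lambda_z * f + rnormal()
generate e = rnormal()
* x depends on z and its lag, and is correlated with the error e
generate x = 0.8 * z + 0.4 * L.z + 0.5 * e + rnormal()
generate y = 0.5 * x + lambda_y * f + e

* x stays among the regressors; only z and its first lag are instruments
xtivdfreg y x, iv(z, lags(1)) factmax(2)
```

The first period of each unit has a missing x because of the lag in its construction, so the estimation sample starts in the second period.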


                  • #10
                    Originally posted by Sebastian Kripfganz
                    I am afraid this is not (currently) possible.


                    Factor extraction is only valid from strictly exogenous variables, i.e. they must be uncorrelated with any future, current, and past errors. The lag of an endogenous variable X does not satisfy this condition and therefore should not be included in iv(). You would need to find a valid external instrument Z.
                    Thanks again, Sebastian. I appreciate all the help.
