How to deal with endogenous control variables when implementing ivreg2h


    I am using Stata/MP 14.0, and my question concerns the SSC command "ivreg2h".

    I am running ivreg2h on panel data, where the panel variable is listing_id and the time variable is the month (35 months in total). The relationship I am testing is an inverted-U relationship between the emotion embedded in an Airbnb listing and the sales performance of the listing. Specifically, I am estimating the following model:
    Sales = b1 * emo^2 + b2 * emo + X + FE(listing) + FE(month) + e
    where emo is the embedded emotion and emo^2 is its square. Both emo and emo^2 are endogenous. X is a vector of about 10 endogenous control variables, such as the price of the listing, the overall rating of the listing, etc. I include these controls because they are time-varying and therefore cannot be captured by either the listing fixed effects or the month fixed effects.
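    For concreteness, the data setup behind this model looks roughly like the sketch below. It assumes month is stored as a numeric period index and that emo_sqr has not been created yet; the variable names are the ones used in this post.

```stata
* Sketch only: assumes month is a numeric period index (1..35)
* and that emo_sqr does not already exist in the dataset
xtset listing_id month          // declare the panel and time variables
generate emo_sqr = emo^2        // squared emotion term used in the model
```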

    I am using ivreg2h because I was not able to identify valid external instrumental variables, so I am relying on the generated-instrument method proposed by Lewbel (2012). I have read many materials and examples (including "help ivreg2h"), but none of them answers my question. I have three questions, described below:

    Q1: I am not sure whether I should include those endogenous control variables in my "ivreg2h" syntax. Here are two examples of the syntax:
    1. ivreg2h sales blocked ( emo_sqr emo =), fe
    2. ivreg2h sales blocked ( emo_sqr emo price rating availability =), fe
    where "blocked" is treated as an exogenous variable. In 1, only the independent variables of interest are included in the parentheses. In 2, the independent variables of interest, as well as several endogenous control variables, are included in the parentheses.

    Both options run without error, but I do not know whether the independent variables of interest (here, "emo_sqr" and "emo") are the only variables allowed inside the parentheses. If the endogenous control variables cannot go there, where should I put them?

    Q2: A question related to Q1. ivreg2h does not support i.month, which means I need to include the time dummies manually. Where should I put them: inside the parentheses or outside?
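    To make Q2 concrete, the two placements I have in mind look roughly like this. It is only a sketch: m_ is a prefix made up for illustration, month is assumed to take 35 distinct values, and the first dummy is left out as the base period. I am not claiming that either placement is correct.

```stata
* Sketch only: m_ is a hypothetical prefix for the month indicators
tabulate month, generate(m_)    // creates m_1 ... m_35

* Placement A: time dummies outside the parentheses (treated as exogenous)
ivreg2h sales blocked m_2-m_35 (emo_sqr emo =), fe

* Placement B: time dummies inside the parentheses, next to the endogenous regressors
ivreg2h sales blocked (emo_sqr emo m_2-m_35 =), fe
```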

    Q3: Is a single exogenous variable sufficient in this case to generate the instruments?
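    For context on Q3, my understanding of the construction in Lewbel (2012) is sketched below for a single endogenous regressor. The symbols Y_1, Y_2, X and Z are notation for this sketch only and are not taken from my model above.

```latex
\begin{aligned}
Y_1 &= X'\beta_1 + Y_2\gamma_1 + \varepsilon_1 \\
Y_2 &= X'\beta_2 + \varepsilon_2 \\
\text{generated instruments: } & (Z - \bar{Z})\,\hat{\varepsilon}_2, \qquad Z \subseteq X \\
\text{requiring: } & \operatorname{Cov}(Z, \varepsilon_1 \varepsilon_2) = 0, \qquad \operatorname{Cov}(Z, \varepsilon_2^2) \neq 0
\end{aligned}
```

    In the syntax examples under Q1, "blocked" is the only exogenous regressor listed, hence the question.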

    Any responses would be much appreciated!

    Best regards.

    1. Lewbel, A. (2012). Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business & Economic Statistics.