Dear Statalist gurus,
We're having a problem that may be more of a statistics issue than a Stata issue. We're using Stata 16 on Windows.
Please excuse us if we have missed any important guideline for posting and let us know how we can help you make sense of our problem.
We are PhD students from Germany and are exploring how certain stylistic characteristics of questions (for example, sentiment) influence the length of the corresponding answers, in the context of the personality traits of the answering individual.
The data are structured as follows:
| Length of answer (qlength) | Sentiment of question (sent) | Extroversion of individual answering (extro) | Control variables on individuals asking and answering (ctrls) | ... | Conversation date (date) |
| 12 | 0.5 | 2.5 | x | x | |
| ... | ... | ... | ... | ... | ... |
The data are clustered across several dimensions:
- There are several individuals who answer any number of questions across different conversations
- There are several individuals who ask any number of questions across different conversations
- The conversations take place in different settings and we are measuring the control variables at each conversation date
- The conversations take place at different points in time, where there are multiple Q-A-pairs within each conversation
We would like to run a regression that estimates the effect of the question sentiment on the length of the corresponding answer, and check the interaction between sent and extro.
We are currently working with a fixed effects model, with fixed effects for the individual who asks the question (because that is the person for whom we have the fewest control variables).
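To make this concrete, our current model looks roughly like the sketch below. Here asker_id is a hypothetical numeric identifier for the individual asking the question (it is not in the table above), and ctrls stands in for our list of control variables. Clustering the standard errors on the asker is an assumption on our part, not something we are sure is correct.

```stata
* asker_id is a hypothetical numeric ID for the person asking the question
xtset asker_id

* c. marks sent and extro as continuous; ## adds both main effects
* and their interaction. ctrls is a placeholder for the control variables.
xtreg qlength c.sent##c.extro ctrls, fe vce(cluster asker_id)
```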
We would like to use 2SLS to help us with the issue that sent might be endogenous. Let's say that we have 2 instruments for sent that are called instru1 and instru2.
- We would describe our data as an unbalanced panel with the additional issue of repeated measurements at each conversation date. Is that right?
- Which 2SLS estimator should we use to run the regression qlength = sent##extro ctrls, where we want to instrument sent with instru1 and instru2?
- Does it make sense to use ivregress 2sls or xtivreg, fe? What would the command need to look like to tell Stata how to instrument the endogenous regressors in the context of the interaction term?
- Conceptually, are there other regression models that are more suitable to handle the structure of our data?
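To illustrate the questions above, here is a sketch of the two candidate commands as we currently imagine them. It rests on several assumptions: asker_id is a hypothetical numeric identifier for the person asking, the variables ending in _x_extro are hypothetical names for hand-built interaction terms, and we treat the interaction of sent with extro as a second endogenous regressor that is instrumented by the interactions of instru1 and instru2 with extro. We are not certain this is the right way to handle the interaction, which is part of what we are asking.

```stata
* Hand-built interactions (the *_x_extro names are hypothetical)
generate sent_x_extro    = sent * extro
generate instru1_x_extro = instru1 * extro
generate instru2_x_extro = instru2 * extro

* Candidate 1: ivregress 2sls with asker dummies as fixed effects.
* Both sent and its interaction with extro are treated as endogenous.
ivregress 2sls qlength extro ctrls i.asker_id ///
    (sent sent_x_extro = instru1 instru2 instru1_x_extro instru2_x_extro), ///
    vce(cluster asker_id)

* Candidate 2: xtivreg with the fixed-effects (within) estimator.
xtset asker_id
xtivreg qlength extro ctrls ///
    (sent sent_x_extro = instru1 instru2 instru1_x_extro instru2_x_extro), fe
```

As far as we understand, the two candidates should give the same point estimates for the coefficients of interest, since both absorb an asker-specific intercept; they would differ mainly in how the standard errors are computed.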
Yours,
Andy and Anna