Diff-in-diff with panel data - Control variables

maykon feitosa

Join Date: May 2021

Posts: 5
#1

07 May 2021, 19:04

Good evening! I'm writing an academic paper and I intend to build a diff-in-diff model using panel data. Stata is a new world for me, so I'm a little confused by some commands. The post in this link (https://www.statalist.org/forums/for...ith-panel-data) helped me a lot, but I have a simple question: does Stata recognize control variables automatically? For example, in the code below, are Population, Unemployment and Avg_Month_Wage recognized automatically as control variables?

Code: xtreg W_Trade_M Treat Post TreatPost Population Unemployment Avg_Month_Wage
Tags: diff-in-diff, panel data
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

07 May 2021, 20:08

For example, in the code below, are Population, Unemployment and Avg_Month_Wage recognized automatically as control variables?

I'm not sure what you mean by this. Any variable mentioned in the variable list of the command will be included in the model, unless Stata is forced to omit it due to collinearity or some similar reason that makes mathematical inclusion impossible.
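As an illustration of the omission case, here is a minimal sketch. It assumes the panel is identified by variables called unit_id and year (hypothetical names, not from the original post) and that Treat does not change within a unit. Under those assumptions, a fixed-effects fit would be expected to report Treat as omitted because of collinearity with the unit effects, while every other listed variable stays in the model.

```stata
* Sketch only: unit_id and year are assumed (hypothetical) panel identifiers
xtset unit_id year

* Fixed-effects fit of the model from post #1.
* If Treat is constant within each unit, it is collinear with the unit
* fixed effects, and Stata flags it as omitted in the output.
xtreg W_Trade_M Treat Post TreatPost Population Unemployment Avg_Month_Wage, fe
```

No option is needed to mark Population, Unemployment or Avg_Month_Wage as covariates; listing them after the outcome is all that is required.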

As for recognition as control variables, I have a two-part response. The first is that I dislike the term control variable when we are working with observational data because, in truth, nothing is controlled. We are adjusting for the effects of these variables, not controlling them. So I prefer to just call them covariates. That said, I realize that the term control variable is in widespread use and in some audiences might be more readily recognized than covariate.

The second is that the distinction between covariates and main predictors exists purely in our minds. There is no mathematical reality to it. It's just our attitude towards the variables. Some variables we think of as central to our research, and it is our goal to estimate their relationships to some outcome. These we call main variables, or key predictors, or key independent variables. There are others that we include because we know or suspect that they also influence the outcome, and we want to try to separate out the effect of our key predictor(s), which we care about, from the effects of these other variables (covariates), which we regard as just a nuisance. But mathematically there is no such distinction, and all of these variables are handled in exactly the same way when the regression estimates are calculated. This is true in Stata and in all other statistics packages: it is simply a reflection of the underlying mathematics.

Again, the distinction between a key independent variable and a covariate exists only in our minds. The computer handles them identically: we attend differently to the associated results.
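One way to see this for yourself is to fit the same model twice with the variables listed in a different order and compare the results. The sketch below assumes the panel identifiers are called unit_id and year (hypothetical names, not from the original post) and uses xtreg with its default random-effects estimator, as in the command from post #1.

```stata
* Sketch only: unit_id and year are assumed (hypothetical) panel identifiers
xtset unit_id year

* Diff-in-diff terms listed first, covariates last
xtreg W_Trade_M Treat Post TreatPost Population Unemployment Avg_Month_Wage
estimates store did_first

* Same variables, covariates listed first
xtreg W_Trade_M Population Unemployment Avg_Month_Wage Treat Post TreatPost
estimates store covars_first

* Side-by-side coefficients and standard errors
estimates table did_first covars_first, b se
```

The two columns should agree up to numerical precision, since the position of a variable in the list carries no information about its role.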
maykon feitosa

Join Date: May 2021

Posts: 5
#3

08 May 2021, 07:59

Perfect. Thank you for the answer and the guidance on the term "control variable".