Categorical variable as explanatory variable (right hand side)

Marcel Campion

Join Date: Feb 2017

Posts: 30
#1

11 Apr 2018, 12:15

Hi all,

In a linear probability model, or any sort of regression, one can use fixed effect estimation by simply adding in a STATA code i.something. This "something" can be a village, a county, or a country. When doing so, one looks at variation within this geographical unit, as follows:

Y_vit=B₀+B₁X_it+B₂X_vt+α_v+ϵ_vit

where the indexes i, v and t denote the individual, village and time dimensions, respectively. The term α_v is a village fixed effect, so the regression uses only within-village variation.

Code:

reg Y Var1 Var2 i.village, vce(cluster village)
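To spell out why the village dummies give a within-village comparison in the linear case, here is a short sketch of the within transformation. The bar notation and the superscripts (1) and (2) are my own shorthand, not anything from Stata: a bar with subscript v means the average over all observations in village v.

```latex
\begin{align*}
Y_{vit} &= B_0 + B_1 X_{it} + B_2 X_{vt} + \alpha_v + \epsilon_{vit} \\
\bar{Y}_{v} &= B_0 + B_1 \bar{X}^{(1)}_{v} + B_2 \bar{X}^{(2)}_{v} + \alpha_v + \bar{\epsilon}_{v} \\
Y_{vit} - \bar{Y}_{v} &= B_1 \bigl(X_{it} - \bar{X}^{(1)}_{v}\bigr)
  + B_2 \bigl(X_{vt} - \bar{X}^{(2)}_{v}\bigr)
  + \bigl(\epsilon_{vit} - \bar{\epsilon}_{v}\bigr)
\end{align*}
```

Subtracting the village mean removes α_v (and B₀), so B₁ and B₂ are estimated only from deviations from village means. OLS with a full set of village dummies gives the same B₁ and B₂.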

Here I come to the point. Among my covariates there is one categorical variable that takes several different values. It could represent colors, insurance companies, ethnicity, etc. In STATA I introduce this variable as i.categorical, so the code becomes:

Code:

reg Y Var1 Var2 i.categorical i.village, vce(cluster village)

I have a hard time interpreting this regression. When I run it, am I looking at variation within categories within villages? That is, am I looking at variation in Y among individuals who belong to the same category within the same village?

Thank you!

Last edited by Marcel Campion; 11 Apr 2018, 12:49.
Tags: categorical, fixed effects, panel, panel data, regression
Clyde Schechter

Join Date: Apr 2014

Posts: 30027
#2

11 Apr 2018, 13:26

In a linear probability model, or any sort of regression, one can use fixed effect estimation by simply adding in a STATA code i.something.[emphasis added]

Not true. This is only valid with linear regressions. For logistic or other non-linear models, adding i.something is not equivalent to a fixed-effects model.

That said, since you are doing linear regression, why do you want to do it this way? You'll get more compact output if you omit i.something and do

Code:

xtset something
xtreg Y Var1 Var2, fe vce(cluster something)
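If you want to check that the two approaches give the same slope coefficients, here is a minimal sketch. It uses the auto dataset that ships with Stata purely as a stand-in: rep78 plays the role of something, and price, weight and length play the roles of Y, Var1 and Var2. None of these names come from your data, and I have left out vce(cluster) because rep78 has only five groups.

```stata
* Illustration only: auto.dta stands in for the real data
sysuse auto, clear
drop if missing(rep78)

* Dummy-variable version: one indicator per group
regress price weight length i.rep78
scalar b_lsdv = _b[weight]

* Fixed-effects (within) version: group effects are absorbed, not listed
xtset rep78
xtreg price weight length, fe
scalar b_fe = _b[weight]

* The two slope estimates should agree up to rounding error
display b_lsdv, b_fe
assert reldif(b_lsdv, b_fe) < 1e-6
```

The assert produces no output when the two estimates agree. The xtreg output simply leaves out the table of group dummies.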

As for the interpretation of

Code:

reg Y Var1 Var2 i.categorical i.village, vce(cluster village)
// OR EQUIVALENTLY, after -xtset village-
xtreg Y Var1 Var2 i.categorical, fe vce(cluster village)

you are estimating the within-village effect of the variable categorical on Y.
Marcel Campion

Join Date: Feb 2017

Posts: 30
#3

11 Apr 2018, 13:46

Hi Clyde, thanks for your reply.

Yes, I agree with you. When I wrote "any sort of regression" I was thinking of any sort of linear regression.

That said, yes, I could use the first code you mentioned.

For the second part of your answer: yes, I am estimating the within-village effect. My question was more about the coefficient on the other variables, say Var1.

My understanding is that by adding i.categorical I am adding some sort of fixed effect, in the sense that I am comparing individuals within the same category located in the same village. The reason I am focusing on this is that for some villages I have only one individual per category, and thus no variation within that category. I was wondering whether this affects the coefficients of my other explanatory variables.

Thanks!
Clyde Schechter

Join Date: Apr 2014

Posts: 30027
#4

11 Apr 2018, 13:52

It's still going to be the case that the coefficient of Var1 is the within-village estimate of the effect of Var1. Because you did not put i. in front of Var1, I had been assuming that Var1 is continuous, not categorical. Nevertheless, even for a categorical variable, what matters is that Var1 varies within villages. The fact that there may be only 1 observation per level of Var1 in some or all villages is not a problem.
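To make the distinction concrete: additive i.categorical and i.village terms are not the same as restricting comparisons to within category-by-village cells. That would take a separate effect for every cell. Here is a rough sketch of the contrast on the auto dataset that ships with Stata, used only as a stand-in (rep78 for village, foreign for categorical, weight for Var1; these are illustrative choices, not your variables):

```stata
* Illustration only: auto.dta stands in for the real data
sysuse auto, clear
drop if missing(rep78)

* Additive specification, as in this thread: group and category each
* shift the intercept; the weight coefficient is a within-group
* estimate, adjusted for category
regress price weight i.foreign i.rep78

* Cell-level specification (NOT what the command in this thread does):
* one effect per group-by-category cell, so the weight coefficient is
* identified only from variation inside each cell
egen cell = group(rep78 foreign)
areg price weight, absorb(cell)
```

In the second model, a cell containing a single observation contributes nothing to the weight coefficient. In the first, that observation will in general still contribute, provided weight varies within its group, which is why sparse categories within a village are not a problem there.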