Logit FE regression in cross-sectional data.

Ibai Ostolozaga Falcon

Join Date: May 2021

Posts: 36
#1


14 Jun 2021, 02:40

Dear Statalist,

I have read about this topic in various discussions, but I could not find a clear answer.

I have a firm-level cross-sectional data set covering 58 countries. I am trying to run a logit regression (my dependent variable is binary) that controls for country, because of heterogeneity between countries. I would also like to add macro variables such as lnGDP (the logarithm of GDP per worker). My code is:

logit Y X Z i.country, vce(robust)

Here Y is my dependent variable, X is a vector of firm-level variables, and Z is a vector of country-level variables.

However, when I do this, several country levels are omitted. I have read in some papers that it is not possible to include country-level variables when you are already controlling for country. I therefore decided to run a two-level multilevel logit model (firms nested in countries). My code is:

melogit Y X Z || country: , vce(robust)

So my question is: is it a good idea to run a multilevel model, or should I run the first regression without the "i.country" term, replacing vce(robust) with vce(cluster country)?

I hope you can give me some guidance.

Thanks in advance,
Ibai
Rhys Williams

Join Date: Apr 2020

Posts: 224
#2

14 Jun 2021, 16:41

The two approaches are different. If you use the logit approach, I would recommend clustered standard errors and, as you say, omitting the country fixed effects if Z does not vary within country (which is the case for country-level variables in a single cross-section).
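
The collinearity behind this advice can be checked on simulated data. The following is only a minimal sketch: the data, the variable names (y, x, lnGDP, u), the sample sizes, and the coefficients are all made up for illustration and are not taken from the original data set.

```stata
* Hypothetical data: 58 countries with 200 firms each (all values assumed)
clear
set seed 12345
set obs 58
generate country = _n
generate lnGDP = rnormal(10, 1)     // country-level variable (plays the role of Z)
generate u = rnormal(0, 0.5)        // unobserved country effect
expand 200                          // 200 firms per country
generate x = rnormal()              // firm-level variable (plays the role of X)
generate y = runiform() < invlogit(-0.5 + 0.8*x + 0.3*(lnGDP - 10) + u)

* lnGDP takes a single value within each country
bysort country (lnGDP): assert lnGDP[1] == lnGDP[_N]

* With a full set of country dummies, lnGDP is a linear combination of them,
* so Stata has to omit one more term because of collinearity
logit y x lnGDP i.country, vce(robust)

* Without the dummies the lnGDP coefficient is identified;
* clustering allows for correlated errors within country
logit y x lnGDP, vce(cluster country)
```

In the first model the output should flag an additional omitted term, which matches what was reported in #1. The second model is the clustered alternative discussed here.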

With melogit, you are allowing the intercept to vary across countries (a random intercept); the coefficients themselves stay the same for every country. To my mind this is probably more flexible and therefore somewhat better, but ultimately it depends on what you want to achieve.
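
As a rough illustration of what the random intercept adds, the sketch below fits the two-level model on simulated data and asks how much of the latent variation sits at the country level. The data and all names (y, x, lnGDP, u) are assumptions made for the example, not the original variables.

```stata
* Hypothetical data: 58 countries with 200 firms each (all values assumed)
clear
set seed 2021
set obs 58
generate country = _n
generate lnGDP = rnormal(10, 1)     // country-level variable
generate u = rnormal(0, 0.5)        // true country-specific intercept shift
expand 200
generate x = rnormal()              // firm-level variable
generate y = runiform() < invlogit(-0.5 + 0.8*x + 0.3*(lnGDP - 10) + u)

* Random-intercept logit: one common slope for x and lnGDP,
* plus a country-specific intercept drawn from a normal distribution
melogit y x lnGDP || country:

* Residual intraclass correlation on the latent scale:
* the share of unexplained variance attributable to countries
estat icc
```

Because the country effect is modelled as a random draw rather than a dummy, lnGDP can stay in the model. This relies on the assumption that the random intercept is uncorrelated with the regressors.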

Best,
Rhys
Ibai Ostolozaga Falcon

Join Date: May 2021

Posts: 36
#3

15 Jun 2021, 00:47

Hi Rhys.

Thank you so much for your helpful advice. What I want to do is look at the relationship between one of the country-level variables in Z and the outcome, while taking into account the heterogeneity between countries.

Regards,
Ibai
Felix Bittmann

Join Date: Aug 2018

Posts: 702
#4

15 Jun 2021, 01:01

To make it a bit simpler, maybe: if you just want to correct the SEs for the similarity of firms within countries, use clustering. This gives you no information about the clusters themselves, but it corrects the standard errors. However, if you have substantive assumptions about what happens between countries and want to test them, use the multilevel model and investigate the random intercepts. There you can see how countries actually differ. So either

Code:

logit Y X Z, vce(cluster country)

or

Code:

melogit Y X Z || country:
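
For the multilevel route, one way to look at the random intercepts is to predict them after estimation and list one value per country. This is just a sketch on simulated data; the variable names (y, x, lnGDP, u_hat, u_se, first_obs) and all numbers are hypothetical.

```stata
* Hypothetical data: 58 countries with 200 firms each (all values assumed)
clear
set seed 777
set obs 58
generate country = _n
generate lnGDP = rnormal(10, 1)
generate u = rnormal(0, 0.5)
expand 200
generate x = rnormal()
generate y = runiform() < invlogit(-0.5 + 0.8*x + 0.3*(lnGDP - 10) + u)

melogit y x lnGDP || country:

* Empirical Bayes predictions of the country intercepts and their standard errors
predict u_hat, reffects reses(u_se)

* Keep one row per country and list countries from lowest to highest intercept
egen first_obs = tag(country)
sort u_hat
list country u_hat u_se if first_obs, noobs
```

Countries at the two ends of the list are those whose baseline log-odds differ most from the average after accounting for x and lnGDP.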

Best wishes
