Dear readers & contributors!

I am completely new here: a master's student (MSc Sustainable Finance), just a few weeks away from leaving the world of academia behind, gratefully. Please excuse me if I do something wrong in this post; I have read the FAQ and will try to apply the CODE tags!

I am running my data analysis on Stata 14.0 right now, and would appreciate it immensely if you could find the time to advise me on my issue!

A little background: I am researching whether BGD (= board gender diversity, variable name GR) has an impact on the environmental & social performance of a company (EIVA and SIVA, respectively). I do this in the context of neo-institutional theory, meaning that I gathered sufficient data to do a pan-continental analysis (comparing 69 countries grouped into 4 classes, depending on how well-institutionalized corporate sustainability is in each class) and see whether BGD has more or less impact on EIVA & SIVA. I have a panel data set with 18,573 firm-year observations, covering the years 2015 to 2018.

My variables are:

dependent: EIVA or SIVA

independent:

- GR: gender ratio; the higher this number the more male-dominant the board of directors is; the MAIN variable of interest

- NM: ratio representing the nationality mix

- market cap, revenue and debt as control variables - variable names MC, Revenu and Debt

- nordicEU; westernEU; thirdgroup and fourthgroup = the 4 groups of classified countries; THESE ARE ALL DUMMY VARIABLES!

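- sectornum: the sector a company is in, which I converted from string to numeric with

Code:

encode

to be able to create dummy variables with

Code:

i.sectornum

- year, also a dummy variable which I use to control for macro-economic changes in each year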

- 69 country dummy variables which I do not use directly in the regression but used to create the 4 classes
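
The setup steps implied by the list above can be sketched as follows. This is only an illustration under assumptions: `sector` (the original string variable) and `firm_id` (a numeric firm identifier) are hypothetical names that do not appear in the post, and the actual names in the data set may differ.

```stata
* Hypothetical names: "sector" is the original string variable,
* "firm_id" is a numeric firm identifier. Neither name is from the post.

* Convert the string sector variable into a labelled numeric variable
encode sector, generate(sectornum)

* Declare the panel structure (firm as panel unit, year as time variable);
* xtreg needs this before it can be run
xtset firm_id year
```

Once `sectornum` exists as a numeric variable, the factor-variable notation `i.sectornum` creates the sector dummies inside the regression command, so no separate dummy variables need to be generated by hand.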

I am trying to decide between the fixed effects and random effects, whereby:

a) the Hausman test clearly points to the fixed effects

b) the random effects model results are EXACTLY what I wanted to show for both the GR variable and the 4 country-classes; it is in line with the literature and my own rationale

c) one of my Finance professors had a (too short) talk with me recently in which he criticised the use of FE for ESG/sustainability data. According to him, this is very noisy data, and an FE estimator would simply take away the little meaningful variation we have in the data and regress predominantly on noise. When I mentioned the Hausman test, he brushed it off by saying it is often very biased. He had to rush away and I had no chance to tell him how unclear his point was to me, yet I feel his remark could help me write a convincing methodology in favor of RE. Do any of you have an idea what he could have meant? He is abroad now and will be for a couple of months due to family circumstances, and it would be highly inappropriate to bother him.

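Now, my regressions are:

Code:

xtreg EIVA GR NM MC Revenu Debt nordicEU westernEU thirdgroup fourthgroup i.sectornum i.year, re vce(robust)

When it comes to the FE regression, I had to leave out all 4 country classifications and the sector dummies, since Stata just dropped them due to collinearity. This is a HUGE problem for me, since I need the coefficients on the 4 classes as a significant part of what I am trying to contribute with this master's thesis!

Code:

xtreg EIVA GR NM MC Revenu Debt i.year, fe vce(robust)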
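
The FE specification omits the country-class and sector dummies. A sketch of why the within (FE) transformation cannot estimate them, assuming that a firm's country class and sector do not change between 2015 and 2018: the notation is illustrative and not taken from any output, with $D_i$ standing for any such time-invariant dummy, $\alpha_i$ for the firm effect, and bars for firm-level means over time.

```latex
% Model with a time-varying regressor GR_{it} and a time-invariant dummy D_i
y_{it} = \beta\, GR_{it} + \gamma\, D_i + \alpha_i + \varepsilon_{it}

% Subtracting firm-level means (the within transformation) gives
y_{it} - \bar{y}_i
  = \beta\,(GR_{it} - \overline{GR}_i)
  + \gamma\,(D_i - \bar{D}_i)
  + (\varepsilon_{it} - \bar{\varepsilon}_i),
\qquad D_i - \bar{D}_i = 0
```

Because the demeaned dummy is zero for every observation, it is perfectly collinear with the firm effects, and $\gamma$ is not identified from within-firm variation alone.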

The coefficient on GR under the FE estimator is positive (which would mean that the fewer women directors there are on a board, the better the environmental performance, going against literally ALL literature), whereas the coefficient is negative under the RE estimator and, as I already said, the RE model makes sense overall. Yet the Prob>chi2 of the Hausman test is 0.000, which suggests that the random effects estimator is inconsistent. I ran the Hausman test before I included robust standard errors.
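
For reference, the usual sequence for a Hausman test of this kind can be sketched as below. It assumes the same regressors as the two models above and is not necessarily the exact set of commands that produced the reported result; the stored-estimate names `fe` and `re` are arbitrary labels.

```stata
* Sketch only: regressors copied from the two xtreg commands in this post.
* The models are fitted without vce(robust), because hausman relies on
* the conventional variance estimates.

xtreg EIVA GR NM MC Revenu Debt i.year, fe
estimates store fe

xtreg EIVA GR NM MC Revenu Debt nordicEU westernEU thirdgroup fourthgroup i.sectornum i.year, re
estimates store re

* Consistent estimator (FE) first, efficient estimator (RE) second
hausman fe re
```

The test compares only the coefficients that appear in both models, so the country-class and sector dummies from the RE model do not enter the comparison.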

So now that you know (in case you had the wonderful patience to actually read all of this) the situation, my questions specifically are:

1. Does anyone have a clue what the professor could have meant by noisy data not being appropriate for FE?

2. Is my coding in Stata even correct?

3. How can I make a smart choice between FE and RE? Or would you suggest another model that would allow me to estimate the coefficients on the 4 classes of countries?

4. Would you know any academic articles/literature in general that might help me further or back up using RE?

THANK YOU so much in advance, I am so happy to have found this Stata-community & forum!

Kind regards,

Amira

## Comment