Compare large vs small firms, panel data, omitted variable

Josephine Nicolai

Join Date: Jun 2021

Posts: 20
#1

11 Jan 2023, 07:35

Hi everyone,

I'm examining the effect of partner gender diversity on audit quality. The sample includes both Big4 and non-Big4 audit firms. For my additional analysis I would like to see whether the results are weaker/stronger for Big4 or non-Big4 firms. Audit firms with ID 1, 2, 3, 4 are the Big4 firms, so this is what I thought I had to do (see code). abs_ModDACC is abnormal discretionary accruals and GDR is the gender diversity ratio. When GDR increases, abs_ModDACC decreases (negative relationship). The regression includes year and industry fixed effects.

Code:

gen AID = AuditID
replace AID = 0 if AuditID > 4
xtreg abs_ModDACC GDR $control_vars i.industry i.Year i.AID, re vce(cluster id)
testparm i.AID

However, the results omit AID=4 (see screenshot). Why is that? I know that there should be a reference category, but I want to interpret my results as: the 4 biggest audit firms have lower discretionary accruals compared to the rest of the audit firms. So shouldn't all non-Big4 firms be the reference category?

I hope someone can help me out!

Kind regards,
Josephine

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17706

11 Jan 2023, 09:22

Josephine:
each and every categorical variable has its own (omitted) reference category, as you can see from the following toy-example:

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage i.race i.nev_mar, re vce(cluster idcode)

Random-effects GLS regression                   Number of obs     =     28,518
Group variable: idcode                          Number of groups  =      4,711

R-squared:                                      Obs per group:
     Within  = 0.0263                                         min =          1
     Between = 0.0121                                         avg =        6.1
     Overall = 0.0145                                         max =         15

                                                Wald chi2(3)      =     429.57
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                             (Std. err. adjusted for 4,711 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        race |
      Black  |   -.110084     .01332    -8.26   0.000    -.1361908   -.0839772
      Other  |   .1165283   .0666152     1.75   0.080     -.014035    .2470917
             |
   1.nev_mar |  -.1611142   .0087208   -18.47   0.000    -.1782066   -.1440217
       _cons |    1.72454   .0074549   231.33   0.000     1.709929    1.739152
-------------+----------------------------------------------------------------
     sigma_u |  .38311279
     sigma_e |   .3159974
         rho |  .59512448   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.
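
If a different reference category is wanted, the ib#. factor-variable operator sets it. A minimal sketch on the same toy data, assuming (from the value labels) that race is coded 1 = White, 2 = Black, 3 = Other; the baselevels option only changes what is displayed, not what is estimated:

```stata
* Sketch: same toy model, but with Black (race == 2) as the base level
* instead of the default lowest value (race == 1, White).
use "https://www.stata-press.com/data/r17/nlswork.dta", clear

* ib2.race makes level 2 the omitted reference category;
* baselevels lists the base (omitted) levels in the coefficient table.
xtreg ln_wage ib2.race i.nev_mar, re vce(cluster idcode) baselevels
```

The coefficients on race are then contrasts against Black rather than against White; the fit of the model itself is unchanged.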

As per FAQ, please avoid posting screenshots but share what you typed and what Stata gave you back via CODE delimiters. Thanks.

Kind regards,
Carlo
(Stata 19.0)

Clyde Schechter

Join Date: Apr 2014

Posts: 30083
#3

11 Jan 2023, 10:18

The reference category for your AID variable is, indeed, 0--the category designating non-Big4 firms. When Stata omits the reference category, it says nothing about it, because it is expected. The fact that Stata makes a point of telling you that value 4 is also being omitted tells you that something special is going on. That something special, undoubtedly, is collinearity with something else. In the code and output you show, you omitted what is probably the most crucial command for solving this puzzle: your -xtset- command. And in showing the screenshots you also cut off the part of the -xtreg- output where the grouping variable is shown. So we are flying blind here.

My best guess, however, is that you set up as your panel variable something of which the AuditID variable designates a subset. That would automatically create this kind of collinearity.
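
One way to look into this guess is to check which panel variable is set and whether AID ever changes within a panel. A minimal sketch on made-up data: the toy dataset, its size, and the helper variable AID_varies are hypothetical, while id, Year, AuditID and AID follow the names in #1. It only describes the data and does not by itself show where the collinearity comes from.

```stata
* Toy data, purely illustrative: 6 hypothetical panels (id), 3 years each.
clear
set obs 18
gen id      = ceil(_n/3)
gen Year    = 2019 + mod(_n - 1, 3)
gen AuditID = id                 // assumption: auditor never changes within id
gen AID     = AuditID
replace AID = 0 if AuditID > 4

xtset id Year                    // declare the panel structure
xtset                            // with no arguments: redisplay the current settings
xtsum AID                        // within std. dev. of 0 => AID is constant inside every panel

* Flag panels in which AID takes more than one value.
* (Missing values of AID sort last, so a panel with some missing AID is flagged too.)
bysort id (AID): gen byte AID_varies = AID[1] != AID[_N]
tabulate AID_varies
```

With the real data in memory, only the lines from the bare -xtset- onward are needed; posting that output, together with the full -xtreg- output, would show how the panel variable and AID relate.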