
  • Compare large vs small firms, panel data, omitted variable

    Hi everyone,

    I'm examining the effect of partner gender diversity on audit quality. The sample includes both Big4 and non-Big4 audit firms. For my additional analysis I would like to see whether the results are weaker or stronger for Big4 or non-Big4 firms. Audit firms with ID 1, 2, 3, 4 are the Big4 firms, so this is what I thought I had to do (see code). abs_ModDACC is abnormal discretionary accruals and GDR is the gender diversity ratio. When GDR increases, abs_ModDACC decreases (negative relationship). The regression includes year and industry fixed effects.
    Code:
    gen AID = AuditID
    replace AID = 0 if AuditID > 4
    xtreg abs_ModDACC GDR $control_vars i.industry i.Year i.AID, re vce(cluster id)
    testparm i.AID
    However, the results omit AID=4 (see screenshot). Why is that? I know that there should be a reference category, but I want to interpret my results as: the four biggest audit firms have lower discretionary accruals compared to the rest of the audit firms. So shouldn't all non-Big4 firms be the reference category?

    I hope someone can help me out!

    Kind regards,
    Josephine

  • #2
    Josephine:
    Each categorical variable has its own (omitted) reference category, as you can see from the following toy example:
    Code:
    . use "https://www.stata-press.com/data/r17/nlswork.dta"
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)

    . xtreg ln_wage i.race i.nev_mar, re vce(cluster idcode)

    Random-effects GLS regression                   Number of obs     =     28,518
    Group variable: idcode                          Number of groups  =      4,711

    R-squared:                                      Obs per group:
         Within  = 0.0263                                         min =          1
         Between = 0.0121                                         avg =        6.1
         Overall = 0.0145                                         max =         15

                                                    Wald chi2(3)      =     429.57
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                 (Std. err. adjusted for 4,711 clusters in idcode)
    ------------------------------------------------------------------------------
                 |               Robust
         ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            race |
          Black  |   -.110084     .01332    -8.26   0.000    -.1361908   -.0839772
          Other  |   .1165283   .0666152     1.75   0.080     -.014035    .2470917
                 |
       1.nev_mar |  -.1611142   .0087208   -18.47   0.000    -.1782066   -.1440217
           _cons |    1.72454   .0074549   231.33   0.000     1.709929    1.739152
    -------------+----------------------------------------------------------------
         sigma_u |  .38311279
         sigma_e |   .3159974
             rho |  .59512448   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------

    .
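    To make the reference levels visible in the output rather than implied, one option is to state the base level explicitly and ask Stata to print it. This is a minimal sketch on the same nlswork data; it assumes race is coded with 1 as its lowest value, so ib1.race only spells out the default base level.

    ```stata
    use "https://www.stata-press.com/data/r17/nlswork.dta", clear

    * ib1.race states the base (reference) level of race explicitly.
    * Assumption: 1 is the lowest value of race, i.e. the default base.
    * The baselevels display option prints the base level of each factor
    * variable in the coefficient table, so the omitted category is visible.
    xtreg ln_wage ib1.race i.nev_mar, re vce(cluster idcode) baselevels
    ```

    With baselevels, the base category appears as its own row marked "(base)", which makes it easier to tell it apart from a level that Stata drops for another reason, such as collinearity.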
    As per FAQ, please avoid posting screenshots but share what you typed and what Stata gave you back via CODE delimiters. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      The reference category for your AID variable is, indeed, 0, the category designating non-Big4 firms. When Stata omits the reference category, it says nothing about it, because that is expected. The fact that Stata makes a point of telling you that value 4 is also being omitted tells you that something special is going on. That something special, undoubtedly, is collinearity with something else. In the code and output you show, you omitted what is probably the most crucial command for solving this puzzle: your -xtset- command. And in showing the screenshots you also cut off the part of the -xtreg- output where the grouping variable is shown. So we are flying blind here.

      My best guess, however, is that you set as your panel variable something for which the AuditID variable designates a subset of the panels. That would automatically create this kind of collinearity.
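      A rough sketch of how this guess could be checked, assuming the variable names from post #1 (AuditID, AID, id, abs_ModDACC, GDR, $control_vars) are as shown there. The indicator big4 is a hypothetical name introduced here for illustration. The sketch does not by itself explain the omission; it only shows the panel setup and an alternative coding that matches the intended Big4 versus non-Big4 interpretation.

      ```stata
      * With no arguments, xtset reports the current panel and time variables
      xtset

      * After the original xtreg: how many estimation-sample observations
      * fall in each AID level?
      tabulate AID if e(sample)

      * Hypothetical single indicator: 1 for Big4 (AuditID 1-4), 0 otherwise.
      * Note: inrange() returns 0 when AuditID is missing.
      generate byte big4 = inrange(AuditID, 1, 4)

      * Level difference: Big4 vs. non-Big4 firms in abs_ModDACC
      xtreg abs_ModDACC GDR $control_vars i.industry i.Year i.big4, re vce(cluster id)

      * Does the GDR slope differ between Big4 and non-Big4 firms?
      xtreg abs_ModDACC c.GDR##i.big4 $control_vars i.industry i.Year, re vce(cluster id)
      ```

      In the last model, the coefficient on 1.big4#c.GDR would be the difference in the GDR effect between Big4 and non-Big4 firms, with non-Big4 as the reference category.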
