controls with high kurtosis and skewness negative binomial regression

Steve Mullingan

Join Date: Apr 2017
Posts: 5


09 May 2017, 07:17

I am testing the relationship between CSR performance and the number of institutional owners, and a few control variables show high kurtosis and high skewness. I have read that this is problematic for regressions that rely on normality assumptions, but I can't find anything about it for negative binomial regression.

I have three controls: return on assets (ratio), total assets (continuous, in dollars) and long-term debt to assets (ratio). In the regressions, total assets in levels behaves abnormally: it shows a high standard error, unrealistic CI values and a very big coefficient. When the logarithm of total assets is included, the variable behaves normally and shows sensible statistics. However, return on assets and long-term debt to assets also show high kurtosis and skewness. Should I transform these variables as well to normalize them?

Code:

   stats |  IOAmou~G      CSRP       ROA      DWTA   dwtalog  dwtadebt
---------+------------------------------------------------------------
    mean |  405.6184  56.28656  .0393484  1.56e+07  15.54471  .2970013
     p50 |       315    51.475   .045479   6241000  15.64665  .2740095
variance |  95035.41  496.5804  .0239982  7.54e+14  2.524792  .0663786
      sd |  308.2781  22.28408  .1549136  2.75e+07  1.588959  .2576404
   range |      1932  86.25333  2.827587  3.45e+08  10.19739  5.120585
skewness |  1.921257  .1988748 -3.387187  5.290195 -.4188706  4.779085
kurtosis |  6.989161  1.699786   37.0828  43.41078  3.051194  64.62668
se(mean) |  6.912344  .5329962  .0031936  549453.1  .0317983  .0053147
       N |      1989      1748      2353      2497      2497      2350
     sum |    806775   98388.9  92.58678  3.89e+10  38815.13   697.953
     min |        12  11.20667 -1.822825     12863  9.462111         0
     max |      1944     97.46  1.004762  3.45e+08   19.6595  5.120585
----------------------------------------------------------------------


Nick Cox

Join Date: Mar 2014

Posts: 35727
#2

09 May 2017, 07:42

There is no assumption in any kind of regression that predictors have particular marginal distributions, e.g. normal distributions.

The belief that there is such an assumption is a common myth. Conversely, there are plenty of texts that do explain this correctly, including those by distinguished Statalist member Jeff Wooldridge.

At most, it is ideal if, conditionally on the predictors, errors are normal, but that's just about the least important ideal condition, and not what negative binomial regression assumes.

(A personal hobby-horse is that this territory would be easier to understand if we stopped talking about assumptions and started talking about ideal conditions. The term assumption tempts some into a kind of logic that if we make these assumptions, then everything is fine, but that mistakes wishful thinking for deduction. Similarly, and perhaps more commonly, assumption is often misread as prerequisite, a translation stronger than the logic allows.)

What is true sometimes is that high skewness and/or kurtosis may go together with outliers or stretched tails that pose problems of robustness or resistance for estimation and/or cast doubt on the model specification. What is also true sometimes is that such features indirectly suggest that predictors may be better analysed on a transformed scale.
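To see how a transformed scale can change these summaries, here is a minimal Python sketch. It assumes the moment-based definitions of skewness and kurtosis (under which a normal distribution has kurtosis 3, which looks consistent with the table in the question). The data are hypothetical lognormal quantiles, not the poster's total assets, and the helper name `skewness_kurtosis` is made up for illustration.

```python
import math
from statistics import NormalDist


def skewness_kurtosis(values):
    """Moment-based skewness m3 / m2^1.5 and kurtosis m4 / m2^2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical data: 500 evenly spaced quantiles of a lognormal
# distribution, so that the logged values are symmetric by construction.
n = 500
std_normal = NormalDist()
log_assets = [15.5 + 1.6 * std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
assets = [math.exp(v) for v in log_assets]

skew_raw, kurt_raw = skewness_kurtosis(assets)
skew_log, kurt_log = skewness_kurtosis(log_assets)
print(f"raw: skewness={skew_raw:.2f} kurtosis={kurt_raw:.2f}")
print(f"log: skewness={skew_log:.2f} kurtosis={kurt_log:.2f}")
```

The raw values show a long right tail while the logged values do not. That says only that the log scale is less stretched, not that the model requires a symmetric predictor.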

This is contentious. For every experienced researcher happy to transform predictors if it seems to help, there is another most reluctant to do so, not least because it makes interpretation a little more challenging.
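On the interpretation point, a small sketch may help. It assumes the usual log link, so that E[y|x] = exp(b0 + b*ln(x)) = exp(b0) * x^b when ln(x) is the predictor: multiplying x by c then multiplies the expected count by c^b. The coefficient values and function names below are hypothetical.

```python
import math


def count_ratio_for_log_predictor(beta, multiplier):
    """Factor on the expected count when x is multiplied by `multiplier`,
    for a log-link model with ln(x) as the predictor."""
    return multiplier ** beta


def count_ratio_for_raw_predictor(beta, delta):
    """Factor on the expected count when x increases by `delta` units,
    for a log-link model with x itself as the predictor."""
    return math.exp(beta * delta)


# Hypothetical coefficients, chosen only for illustration.
beta_log_assets = 0.3
beta_roa = -0.8

doubling = count_ratio_for_log_predictor(beta_log_assets, 2.0)
roa_step = count_ratio_for_raw_predictor(beta_roa, 0.1)
print(f"doubling total assets multiplies the expected count by {doubling:.3f}")
print(f"raising ROA by 0.1 multiplies the expected count by {roa_step:.3f}")
```

So a logged predictor is read in terms of proportional changes in x, and an untransformed one in terms of unit changes.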

I am no kind of expert on negative binomial regression, but in broad terms what is awkward in predictor space could be awkward for any kind of regression, just acting differently.

I'd recommend plotting residuals against predictors, whatever you do, although discreteness in the response can make these graphs a little puzzling sometimes.
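As a rough sketch of that check without a plotting library, the Python below computes Pearson residuals and averages them within bins of a predictor. It assumes the NB2 variance function, Var(y|x) = mu + alpha * mu^2, and that fitted means and an estimate of alpha are already available from the fitted model. All numbers and function names are hypothetical.

```python
import math


def nb2_pearson_residuals(counts, fitted_means, alpha):
    """Pearson residuals under the NB2 variance function mu + alpha * mu^2."""
    return [(y - mu) / math.sqrt(mu + alpha * mu * mu)
            for y, mu in zip(counts, fitted_means)]


def binned_mean_residuals(predictor, residuals, n_bins):
    """Mean predictor value and mean residual within roughly equal-sized
    bins of the sorted predictor."""
    pairs = sorted(zip(predictor, residuals))
    size = math.ceil(len(pairs) / n_bins)
    summary = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        mean_x = sum(x for x, _ in chunk) / len(chunk)
        mean_r = sum(r for _, r in chunk) / len(chunk)
        summary.append((mean_x, mean_r))
    return summary


# Hypothetical observed counts, fitted means, predictor values and alpha.
counts = [12, 40, 95, 310, 700, 1500]
fitted = [20.0, 35.0, 110.0, 280.0, 650.0, 1900.0]
log_assets = [11.2, 12.9, 14.1, 15.6, 16.8, 18.3]
alpha = 0.4

residuals = nb2_pearson_residuals(counts, fitted, alpha)
for mean_x, mean_r in binned_mean_residuals(log_assets, residuals, n_bins=3):
    print(f"mean ln(assets)={mean_x:.2f}  mean Pearson residual={mean_r:+.3f}")
```

Bin means that drift systematically away from zero along a predictor would hint at a misspecified functional form for that predictor. A scatter plot of the same residuals shows more detail.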