Erratic bounds of C.I. for Eta-Square in -estat esize-

Ulrich Kohler

Join Date: May 2014
Posts: 89

04 Dec 2014, 03:14

Hi all,

we are struggling with erratic results for the confidence intervals of the (partial) Eta-Squared in the output of -estat esize-. For one of our models we find, for example, an upper bound of the C.I. of .452 at a confidence level of 69, which jumps to 1 at a confidence level of 70. According to our investigations, the problem stems from -npnF()-, the function for the "noncentrality parameter of the noncentral F distribution", which is used by -estat esize-. The function returns missing for many of the models we are estimating, and this leads to C.I. bounds of 0 or 1, respectively.

Now we wonder whether there is a workaround for the problem. Is there an alternative way to estimate confidence intervals for the partial Eta-Squared? Dominance analysis? Since -npnF()- does not seem to return values higher than 9999, is there a reasonable approximation of the noncentrality parameter that we can feed into the formula for the confidence bounds when we calculate them "by hand"?

Here are the details. First, an illustration of the problem itself (the weights do not matter here, btw):

Code:

. reg ln_perm_adj ln_curradj1 if random1adj [aw=cweight]
(sum of wgt is   2.5546e+05)

      Source |       SS       df       MS              Number of obs =   12131
-------------+------------------------------           F(  1, 12129) = 9760.87
       Model |  1381.63586     1  1381.63586           Prob > F      =  0.0000
    Residual |  1716.84057 12129  .141548402           R-squared     =  0.4459
-------------+------------------------------           Adj R-squared =  0.4459
       Total |  3098.47644 12130  .255439113           Root MSE      =  .37623

------------------------------------------------------------------------------
 ln_perm_adj |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 ln_curradj1 |   .4417726   .0044715    98.80   0.000     .4330077    .4505374
       _cons |   5.773714   .0456748   126.41   0.000     5.684184    5.863244
------------------------------------------------------------------------------

. estat esize, level(70)

Effect sizes for linear models

-------------------------------------------------------------------
             Source |   Eta-Squared     df     [70% Conf. Interval]
--------------------+----------------------------------------------
              Model |   .4459081         1     .4396887           1
                    |
        ln_curradj1 |   .4459081         1     .4396887           1
-------------------------------------------------------------------

. estat esize, level(69)

Effect sizes for linear models

-------------------------------------------------------------------
             Source |   Eta-Squared     df     [69% Conf. Interval]
--------------------+----------------------------------------------
              Model |   .4459081         1     .4398155    .4518417
                    |
        ln_curradj1 |   .4459081         1     .4398155    .4518417
-------------------------------------------------------------------

The C.I. for the (model) Eta-Squared of the above model can be reproduced "by hand" as follows (see [R] esize for details):

Code:

. * note the naming: df1 is the residual (denominator) df, df2 the model (numerator) df
. local df1 = 12129
. local df2 = 1
. local F = 9760.87
. local alpha = (100 - <LEVEL>)/100
. local lambda_lower = npnF(`df2',`df1', `F',1-`alpha'/2)
. local lambda_upper = npnF(`df2',`df1', `F',`alpha'/2)
. local ci_lower = max(0,`lambda_lower'/(`lambda_lower' + `df1' + `df2' + 1))
. local ci_upper = min(1,`lambda_upper'/(`lambda_upper' + `df1' + `df2' + 1))

The calculated upper bound of the C.I. jumps to 1 if values of 70 or higher are inserted for <LEVEL> in the commands above. The reason is that -npnF()- returns missing for the local "lambda_upper", and -min()- ignores missing arguments, so "ci_upper" becomes 1. Likewise, "ci_lower" would be set to zero if lambda_lower were missing, because -max()- also ignores missing arguments.
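The jump can be made visible with a small guard around the npnF() call. What follows is only a minimal sketch, not the code used by -estat esize-: the local names df_num and df_den are illustrative (numerator and denominator degrees of freedom), the values are taken from the regression output above, and the bound is set to missing whenever npnF() returns missing instead of being passed through min().

```stata
* Sketch: upper C.I. bound at levels 69 and 70, reporting missing
* instead of 1 when npnF() cannot compute the noncentrality parameter.
* df_num and df_den are illustrative names, not part of -estat esize-.
local df_num = 1
local df_den = 12129
local F      = 9760.87

foreach level in 69 70 {
    local alpha = (100 - `level')/100
    local lambda_upper = npnF(`df_num', `df_den', `F', `alpha'/2)
    if missing(`lambda_upper') {
        * no noncentrality parameter, so no usable bound
        local ci_upper = .
    }
    else {
        local ci_upper = `lambda_upper'/(`lambda_upper' + `df_num' + `df_den' + 1)
    }
    display "level `level': lambda_upper = `lambda_upper', ci_upper = `ci_upper'"
}
```

With this guard, a level at which npnF() fails shows up as a missing bound rather than as a bound of exactly 1.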

By and large, the npnF() function returns missing if F becomes large (and hence if the number of observations or the proportion of explained variance increases). So my question boils down to how to estimate confidence intervals for a partial Eta-Squared in models with many observations and a relatively high proportion of explained variance.
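The link between F, the number of observations, and the explained variance follows from the usual identity F = (R-squared/df_num) / ((1 - R-squared)/df_den). A minimal sketch, using the values from the regression output above; the local names are illustrative, and the second line simply assumes a hypothetical model with the same R-squared and twice the residual degrees of freedom:

```stata
* Sketch: F implied by R-squared and the degrees of freedom.
* Values come from the regression output above; names are illustrative.
local r2     = 0.4459081
local df_num = 1
local df_den = 12129

* should roughly reproduce the reported F(1, 12129) = 9760.87
display (`r2'/`df_num') / ((1 - `r2')/`df_den')

* hypothetical: same R-squared, twice the residual df, about twice the F
display (`r2'/`df_num') / ((1 - `r2')/(2*`df_den'))
```

With R-squared held fixed, F grows in proportion to the residual degrees of freedom, which is why large samples with a substantial share of explained variance run into the problem.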

Uli

Ulrich Kohler

Join Date: May 2014

Posts: 89
#2

05 Dec 2014, 05:07

Just a follow-up: I found an article by Ali Baharev and Sándor Kemény (2008) claiming that many algorithms for computing the noncentrality parameter of the noncentral F distribution return incorrect or no results. This might or might not be related to the reported behavior of -npnF()-.

Ali Baharev and Sándor Kemény (2008): On the computation of the noncentral F and noncentral beta distributions. Statistics and Computing 18 (3), 333-340. http://dx.doi.org/10.1007/s11222-008-9061-3
Chuck Huber (StataCorp)

StataCorp Employee

Join Date: Apr 2014

Posts: 3
#3

18 Dec 2014, 07:44

After a careful look at the code in estat esize that produces the
confidence limits for Eta-squared, we must agree with Uli's assessment
of the erratic results.

The bottom line here is, if the non-centrality parameter could not be
computed, then estat esize should just report a missing value for
the corresponding CI limit. This will happen in a future update to
Stata 13.

We thank Uli for pointing out the problem and investigating the cause.