How much power has a question in my questionnaire?

Torbjorn Skodvin

Join Date: Feb 2016
Posts: 22


30 Jul 2019, 14:04

Hi,
I am running statistics on a questionnaire I have conducted. On a particular question, I have two predictor variables that I know correlate somewhat with each other. Each predictor variable is statistically significant on its own (tested using chi-square). When I use logistic regression with both, neither of them is statistically significant. I want to perform some kind of power analysis to check whether this might be due to low power (that is, a possible type II error).

Response variable: Dichotomous
Predictor variable 1: Ordinal
Predictor variable 2: Nominal
R^2 between predictor variables: 0.35

Let's say the question has the alternatives yes/no.

Code:

Predictor variable 1:
                      |   3 quantiles of Predictor 1
  Answer              |         1          2          3 |     Total
----------------------+---------------------------------+----------
Yes                   |        16         42         28 |        86 
No                    |        94        121         45 |       260 
----------------------+---------------------------------+----------
                Total |       110        163         73 |       346 


Predictor variable 2:
                      | Region
  Answer              | Northern   Western E  Southern   Eastern E      Other |     Total
----------------------+-------------------------------------------------------+----------
Yes                   |        27         35         12          9          4 |        87 
No                    |        33         90         82         44         15 |       264 
----------------------+-------------------------------------------------------+----------
                Total |        60        125         94         53         19 |       351

I hope you understand what I mean; do not hesitate to point out any flaws in my reasoning. Can anyone give me helpful advice on how to proceed? Thank you in advance.


Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

31 Jul 2019, 10:59

You didn't get a quick answer. You'll increase your chances of a helpful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

There are power calculators in Stata 16; whether they cover logit, I don't know. However, if you really have 350 observations and only 2 rhs variables, then the issue is not power. I don't know what the columns mean in your tables. It may be that you have many more parameters because these variables take on multiple values (and, I assume, are treated as creating dummies). But even with 8 parameters, 340 observations seems sufficient.
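As a rough illustration of the power calculators mentioned here, the sketch below uses `power twoproportions` to compare the lowest and highest tertiles of predictor 1, with the proportions answering "Yes" (16/110 and 28/73) and the group sizes taken from the first table in the opening post. It is a simplification: it assumes a two-group chi-squared comparison rather than the full logit model, and power computed from observed effects should be read with caution.

```stata
* Sketch only: two-group comparison, not the logit model itself.
* Proportions of "Yes" answers from the first table in the opening post:
*   tertile 1: 16/110 = 0.145,  tertile 3: 28/73 = 0.384
power twoproportions 0.145 0.384, n1(110) n2(73) alpha(0.05)
```

A calculation like this only speaks to the one-variable comparison; it does not account for the second predictor being in the model.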

Have you looked at the collinearity diagnostics?
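One possible way to look at this, sketched under the assumption that both predictors are treated as categorical (variable names are those from the dataex example in this thread): Cramér's V from a cross-tabulation of the two predictors, and variance inflation factors from a linear regression with the same right-hand side. `estat vif` is a postestimation command for `regress`, so the linear model here serves purely as a diagnostic device, not as the model of interest.

```stata
* Association between the two categorical predictors:
* Pearson chi-squared test plus Cramér's V
tabulate GDP_10000USD_Tertile Country_categories, chi2 V

* Variance inflation factors for each indicator variable.
* The linear regression is used only to obtain the VIFs.
regress Questioncase2_dicho i.GDP_10000USD_Tertile i.Country_categories
estat vif
```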

Torbjorn Skodvin

Join Date: Feb 2016
Posts: 22

10 Aug 2019, 04:16

Thank you for your response. It is definitely helpful, although I am struggling.

Sample data using dataex, first 5 observations:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float Questioncase2_dicho byte(GDP_10000USD_Tertile Country_categories)
. 3 1
1 3 1
1 3 1
1 3 1
1 3 1
end
label values Questioncase2_dicho question2_dicho
label def question2_dicho 1 "Prophylactic intervention (endovascular or surgical)", modify
label values Country_categories countrycategories
label def countrycategories 1 "Western Europe", modify
label var GDP_10000USD_Tertile "3 quantiles of GDP_10000USD" 
label var Country_categories "Region (0=Northern, 1=Western, 2=Southern, 3=Eastern, 4=Other)"

The logistic regression:

Code:

. logit Questioncase2_dicho GDP_10000USD_Tertile Country_categories 

Iteration 0:   log likelihood = -194.01672  
Iteration 1:   log likelihood = -185.61497  
Iteration 2:   log likelihood = -185.42824  
Iteration 3:   log likelihood = -185.42796  
Iteration 4:   log likelihood = -185.42796  

Logistic regression                             Number of obs     =        346
                                                LR chi2(2)        =      17.18
                                                Prob > chi2       =     0.0002
Log likelihood = -185.42796                     Pseudo R2         =     0.0443

--------------------------------------------------------------------------------------
 Questioncase2_dicho |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
GDP_10000USD_Tertile |  -.3008431   .2525566    -1.19   0.234    -.7958449    .1941587
  Country_categories |   .3453159   .1804626     1.91   0.056    -.0083843    .6990161
               _cons |   1.223779   .7075544     1.73   0.084    -.1630026     2.61056
--------------------------------------------------------------------------------------

.

Here, neither predictor variable is statistically significant at the 5% level. Tested on its own (e.g. logit Questioncase2_dicho GDP_10000USD_Tertile), each is statistically significant with p<0.001.

Below is how I have checked the correlation between the two predictor variables:

Code:

. corr GDP_10000USD_Tertile Country_categories 
(obs=415)

             | GDP_10~e Countr~s
-------------+------------------
GDP_10000U~e |   1.0000
Country_ca~s |  -0.7528   1.0000


. regress GDP_10000USD_Tertile Country_categories 

      Source |       SS           df       MS      Number of obs   =       415
-------------+----------------------------------   F(1, 413)       =    540.08
       Model |  122.017723         1  122.017723   Prob > F        =    0.0000
    Residual |  93.3075778       413  .225926338   R-squared       =    0.5667
-------------+----------------------------------   Adj R-squared   =    0.5656
       Total |  215.325301       414  .520109423   Root MSE        =    .47532

------------------------------------------------------------------------------------
GDP_10000USD_Ter~e |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------------+----------------------------------------------------------------
Country_categories |  -.4900547   .0210871   -23.24   0.000    -.5315061   -.4486033
             _cons |   2.647775   .0413143    64.09   0.000     2.566563    2.728988
------------------------------------------------------------------------------------

Here's how I think:
1. The two predictor variables correlate (R-squared of 0.57).
2. Each predictor variable is statistically significant when tested on its own. Neither is when both are entered in the same logit command.
3. Is this because of low power with the observed effect size (a possible type II error)? I thought I could use some kind of power analysis to partly answer that question.
4. Is this because the two predictor variables are too closely correlated? How do I check that in a good way?
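One way to probe points 2 and 4 is sketched below. It assumes both predictors should enter as categorical variables through Stata's factor-variable notation (`i.`), whereas the logit above enters them as single continuous terms. Note also that the overall test in that output, LR chi2(2) = 17.18 with Prob > chi2 = 0.0002, already indicates that the two predictors are jointly significant even though neither is individually.

```stata
* Enter both predictors as sets of indicator variables
logit Questioncase2_dicho i.GDP_10000USD_Tertile i.Country_categories

* Joint Wald test for each predictor's set of indicators
testparm i.GDP_10000USD_Tertile
testparm i.Country_categories

* Likelihood-ratio test for dropping the GDP tertiles,
* fitted on the same estimation sample as the full model
estimates store full
logit Questioncase2_dicho i.Country_categories if e(sample)
estimates store no_gdp
lrtest full no_gdp
```

If each predictor is significant alone but adds little once the other is in the model, that pattern points to overlap between the predictors rather than to low power as such.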


Torbjorn Skodvin

Join Date: Feb 2016

Posts: 22
#4

21 Aug 2019, 03:31

Maybe my example is too specific, or the questions are framed wrongly. However, I'd guess others have wrestled with similar issues. Has anyone got tips on how I can proceed? I am very thankful for any reply.
