I have data from a study measuring spores/g from 3 different sources (bee brood, brood honey, and super honey) in 10 to 20 beehives within 2 bee yards per bee producer. For each producer, one bee yard was clinically affected with disease X and the other was clinically unaffected. The goal is to compare the affected and unaffected yards to see whether the outcome (spores/g) differs. For simplicity, this post focuses on just one of the outcome sources, bee brood.
Many of the outcome values are zeros (true zeros, not just a matter of detection limits), and the distribution is heavily skewed, with values piled up at or near zero. However, these are not counts (integer values), as there are decimal places. I think I should be using Poisson regression, but there is no offset because the outcome is already expressed per gram (spores/g). Can I still use Poisson regression? Or do I need to use negative binomial or even zero-inflated models?
The table below shows the median spores/g from bee brood across all hives in each yard (affected = 0 is a clinically unaffected yard, affected = 1 is a clinically affected yard; beeid is the bee producer ID). The IQR is listed beneath each median.
. table affected beeid, contents(median broodbees iqr broodbees)

---------------------------------
          |        BEEID
 affected |     1      2      3
----------+----------------------
        0 |     0     .4    .45
          |     0    5.4   36.2
          |
        1 |   1.3     .4   3.05
          |   1.9   1.15   6.15
---------------------------------
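To put a number on "many zeros", the share of exact zeros per yard could be tabulated the same way. This is only a sketch: iszero is a hypothetical helper variable that is not in my dataset, and the contents() option assumes the older table syntax used above.

```stata
* Sketch only: flag exact zeros, then tabulate their share per yard.
* iszero is a hypothetical helper variable, not part of the original data.
generate byte iszero = (broodbees == 0) if !missing(broodbees)

* Mean of a 0/1 flag is the proportion of zeros; n gives the hives per yard.
table affected beeid, contents(mean iszero n broodbees)
```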
If I do fit a Poisson regression (with bee producer ID as a fixed factor), Stata prints a note about interpretation because the outcome is not a count, which worries me.
. poisson broodbees affected ib(first).beeid
note: you are responsible for interpretation of noncount dep. variable

Iteration 0:   log likelihood = -5499716.6
Iteration 1:   log likelihood = -5446501.9
Iteration 2:   log likelihood =   -5440771
Iteration 3:   log likelihood = -5439890.3
Iteration 4:   log likelihood = -5439723.5
Iteration 5:   log likelihood = -5439712.9
Iteration 6:   log likelihood = -5439712.8
Iteration 7:   log likelihood = -5439712.8

Poisson regression                              Number of obs     =         90
                                                LR chi2(3)        = 2051748.44
                                                Prob > chi2       =     0.0000
Log likelihood = -5439712.8                     Pseudo R2         =     0.1587

------------------------------------------------------------------------------
   broodbees |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    affected |   1.083511   .0021099   513.53   0.000     1.079375    1.087646
             |
       beeid |
          2  |  -10.25816   .1517964   -67.58   0.000    -10.55568   -9.960647
          3  |  -.7625571   .0015938  -478.44   0.000     -.765681   -.7594332
             |
       _cons |   9.793063   .0020161  4857.48   0.000     9.789112    9.797014
------------------------------------------------------------------------------
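One variant I have seen suggested for non-negative outcomes that are not integers is to keep the Poisson model but request robust standard errors, so that inference does not rely on the Poisson variance assumption. A minimal sketch, assuming the same variables as above; I have not confirmed that this is appropriate for data with this many zeros:

```stata
* Sketch only: same model as above, with robust (sandwich) standard errors.
poisson broodbees affected ib(first).beeid, vce(robust)

* The same mean model written as a GLM with a log link, as a cross-check.
glm broodbees affected ib(first).beeid, family(poisson) link(log) vce(robust)
```

The coefficients should match the fit above; only the standard errors, z statistics, and confidence intervals would change.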
Any advice would be greatly appreciated.