Poisson or Zero Inflated Regression

Yawo Kokuvi

Join Date: May 2015
Posts: 137

21 May 2018, 08:24

Hello, good morning. I am modeling health care utilization using two count variables: the number of times health care was received in the last 12 months (g5) and the number of emergency room or hospital visits (g6).

Below are the frequency distributions of both variables and their summary statistics:

Code:


g5r -- RECODE of g5 (Number of times received health care in last 12 months)
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |        977      45.74      48.83      48.83
        1     |        284      13.30      14.19      63.02
        2     |        236      11.05      11.79      74.81
        3     |        141       6.60       7.05      81.86
        4     |         79       3.70       3.95      85.81
        5     |         61       2.86       3.05      88.86
        6     |         64       3.00       3.20      92.05
        7     |         12       0.56       0.60      92.65
        8     |         12       0.56       0.60      93.25
        10    |         43       2.01       2.15      95.40
        12    |         19       0.89       0.95      96.35
        15    |         14       0.66       0.70      97.05
        16    |          2       0.09       0.10      97.15
        20    |         24       1.12       1.20      98.35
        24    |          5       0.23       0.25      98.60
        30    |         12       0.56       0.60      99.20
        40    |          2       0.09       0.10      99.30
        45    |          1       0.05       0.05      99.35
        48    |          1       0.05       0.05      99.40
        50    |          8       0.37       0.40      99.80
        60    |          1       0.05       0.05      99.85
        80    |          1       0.05       0.05      99.90
        90    |          1       0.05       0.05      99.95
        96    |          1       0.05       0.05     100.00
        Total |       2001      93.68     100.00          
Missing .     |        135       6.32                      
Total         |       2136     100.00                      
-----------------------------------------------------------

g6r -- RECODE of g6 (Number of times visited emergency room or hospital for own health)
-----------------------------------------------------------
              |      Freq.    Percent      Valid       Cum.
--------------+--------------------------------------------
Valid   0     |       1050      49.16      90.44      90.44
        1     |         82       3.84       7.06      97.50
        2     |         18       0.84       1.55      99.05
        3     |          5       0.23       0.43      99.48
        4     |          2       0.09       0.17      99.66
        5     |          1       0.05       0.09      99.74
        9     |          1       0.05       0.09      99.83
        12    |          1       0.05       0.09      99.91
        20    |          1       0.05       0.09     100.00
        Total |       1161      54.35     100.00          
Missing .     |        975      45.65                      
Total         |       2136     100.00                      
-----------------------------------------------------------





. summarize g5r g6r, detail

     RECODE of g5 (Number of times received health care
                     in last 12 months)
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs               2,001
25%            0              0       Sum of Wgt.       2,001

50%            1                      Mean            2.56022
                        Largest       Std. Dev.      6.505882
75%            3             60
90%            6             80       Variance        42.3265
95%           10             90       Skewness       6.804919
99%           30             96       Kurtosis       68.22297

       RECODE of g6 (Number of times visited emergency
              room or hospital for own health)
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs               1,161
25%            0              0       Sum of Wgt.       1,161

50%            0                      Mean            .161068
                        Largest       Std. Dev.      .8564563
75%            0              5
90%            0              9       Variance       .7335175
95%            1             12       Skewness       14.62739
99%            2             20       Kurtosis       292.1931

As you can see, there are a lot of "meaningful" zeros in the data, so taking logs would not be appropriate: the log of zero is undefined, and it would just generate a large number of missing values.

I am thinking of using a Poisson regression model or a zero-inflated regression model. Is there any test I can use to check which of these models fits the data better?
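
As a rough first look, the summary figures above can be checked against what a plain Poisson distribution would imply: its variance equals its mean, and its share of zeros is exp(-mean). Below is a minimal sketch in Stata, using only numbers copied from the output above. It ignores covariates, so it describes the marginal distributions only and is not a formal test:

```stata
* Figures copied from the tabulate/summarize output above (g5r, then g6r)
display "g5r variance/mean:            " 42.3265/2.56022
display "g5r zero share under Poisson: " exp(-2.56022)
display "g5r observed zero share:      " 977/2001

display "g6r variance/mean:            " .7335175/.161068
display "g6r zero share under Poisson: " exp(-.161068)
display "g6r observed zero share:      " 1050/1161
```

This works out to a variance/mean ratio of about 16.5 for g5r and about 4.6 for g6r, and Poisson-implied zero shares of about 0.08 and 0.85 against observed shares of about 0.49 and 0.90. That points to overdispersion, which by itself does not distinguish a zero-inflated model from, say, a negative binomial one.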

Thanks - cY

Last edited by Yawo Kokuvi; 21 May 2018, 08:27. Reason: provided detailed summarize results

Richard Williams

Join Date: Apr 2014

Posts: 4983
#2

21 May 2018, 08:34

Paul Allison has some advice on this:

https://statisticalhorizons.com/zero-inflated-models
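
For anyone wanting to compare the candidate models directly in Stata, one common approach is to fit each to the same sample and compare information criteria. The sketch below is only an illustration: x1 and x2 are hypothetical predictors (the thread does not list the covariates), and the choice of variables in inflate() is likewise an assumption.

```stata
* x1 and x2 are placeholder predictors, not variables from this thread

* Plain Poisson, followed by deviance and Pearson goodness-of-fit tests
poisson g5r x1 x2
estat gof
estimates store m_pois

* Negative binomial: allows overdispersion without a separate zero process
nbreg g5r x1 x2
estimates store m_nb

* Zero-inflated Poisson and zero-inflated negative binomial;
* inflate() names the predictors of the excess-zero equation
zip g5r x1 x2, inflate(x1 x2)
estimates store m_zip

zinb g5r x1 x2, inflate(x1 x2)
estimates store m_zinb

* Side-by-side log likelihood, AIC, and BIC for the stored models
estimates stats m_pois m_nb m_zip m_zinb
```

The nbreg output includes a likelihood-ratio test of alpha = 0, which compares the negative binomial model against Poisson. In the final table, lower AIC and BIC values indicate the better-fitting model among those stored.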

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Yawo Kokuvi

Join Date: May 2015

Posts: 137
#3

23 May 2018, 19:38

Thanks very much. I appreciate the help. cY