  • Poisson or Zero Inflated Regression

    Hello, good morning. I am predicting health care utilization, measured by two variables: the number of times a respondent received health care in the last 12 months (g5) and the number of emergency room or hospital visits (g6).

    Below are the frequency distributions of both variables, along with their summary statistics:

    Code:
    
    g5r -- RECODE of g5 (Number of times received health care in last 12 months)
    -----------------------------------------------------------
                  |      Freq.    Percent      Valid       Cum.
    --------------+--------------------------------------------
    Valid   0     |        977      45.74      48.83      48.83
            1     |        284      13.30      14.19      63.02
            2     |        236      11.05      11.79      74.81
            3     |        141       6.60       7.05      81.86
            4     |         79       3.70       3.95      85.81
            5     |         61       2.86       3.05      88.86
            6     |         64       3.00       3.20      92.05
            7     |         12       0.56       0.60      92.65
            8     |         12       0.56       0.60      93.25
            10    |         43       2.01       2.15      95.40
            12    |         19       0.89       0.95      96.35
            15    |         14       0.66       0.70      97.05
            16    |          2       0.09       0.10      97.15
            20    |         24       1.12       1.20      98.35
            24    |          5       0.23       0.25      98.60
            30    |         12       0.56       0.60      99.20
            40    |          2       0.09       0.10      99.30
            45    |          1       0.05       0.05      99.35
            48    |          1       0.05       0.05      99.40
            50    |          8       0.37       0.40      99.80
            60    |          1       0.05       0.05      99.85
            80    |          1       0.05       0.05      99.90
            90    |          1       0.05       0.05      99.95
            96    |          1       0.05       0.05     100.00
            Total |       2001      93.68     100.00          
    Missing .     |        135       6.32                      
    Total         |       2136     100.00                      
    -----------------------------------------------------------
    
    g6r -- RECODE of g6 (Number of times visited emergency room or hospital for own health)
    -----------------------------------------------------------
                  |      Freq.    Percent      Valid       Cum.
    --------------+--------------------------------------------
    Valid   0     |       1050      49.16      90.44      90.44
            1     |         82       3.84       7.06      97.50
            2     |         18       0.84       1.55      99.05
            3     |          5       0.23       0.43      99.48
            4     |          2       0.09       0.17      99.66
            5     |          1       0.05       0.09      99.74
            9     |          1       0.05       0.09      99.83
            12    |          1       0.05       0.09      99.91
            20    |          1       0.05       0.09     100.00
            Total |       1161      54.35     100.00          
    Missing .     |        975      45.65                      
    Total         |       2136     100.00                      
    -----------------------------------------------------------
    
    
    
    
    
    . summarize g5r g6r, detail
    
         RECODE of g5 (Number of times received health care
                         in last 12 months)
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs               2,001
    25%            0              0       Sum of Wgt.       2,001
    
    50%            1                      Mean            2.56022
                            Largest       Std. Dev.      6.505882
    75%            3             60
    90%            6             80       Variance        42.3265
    95%           10             90       Skewness       6.804919
    99%           30             96       Kurtosis       68.22297
    
           RECODE of g6 (Number of times visited emergency
                  room or hospital for own health)
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            0              0
     5%            0              0
    10%            0              0       Obs               1,161
    25%            0              0       Sum of Wgt.       1,161
    
    50%            0                      Mean            .161068
                            Largest       Std. Dev.      .8564563
    75%            0              5
    90%            0              9       Variance       .7335175
    95%            1             12       Skewness       14.62739
    99%            2             20       Kurtosis       292.1931

    As you can see, both variables contain many "meaningful" zeros, so taking logs would not be appropriate: log(0) is undefined, and every zero observation would become a missing value.
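
    A quick check of how far each variable departs from a plain Poisson distribution can be run from the summary statistics alone. The sketch below uses only the means, variances, and zero counts posted above. It looks at the marginal distributions and ignores any covariates, so it is suggestive rather than a formal test.

```stata
* Under a Poisson distribution the variance equals the mean,
* and the expected share of zeros is exp(-mean).

* g5r: mean 2.56022, variance 42.3265, 977 zeros among 2001 valid cases
display "g5r variance/mean ratio:      " 42.3265/2.56022
display "g5r Poisson-implied zeros:    " exp(-2.56022)
display "g5r observed share of zeros:  " 977/2001

* g6r: mean .161068, variance .7335175, 1050 zeros among 1161 valid cases
display "g6r variance/mean ratio:      " .7335175/.161068
display "g6r Poisson-implied zeros:    " exp(-.161068)
display "g6r observed share of zeros:  " 1050/1161
```

    For g5r the variance is roughly 16.5 times the mean, and a Poisson distribution with that mean would imply only about 8% zeros against the 48.83% observed. For g6r the gap is smaller (about 85% implied against 90.44% observed), but the variance is still several times the mean. Both patterns point toward overdispersion, excess zeros, or both.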

    I am thinking of using either a Poisson regression model or a zero-inflated regression model. Is there a test I can use to decide which of these models fits the data better?

    Thanks - cY
    Last edited by Yawo Kokuvi; 21 May 2018, 08:27. Reason: provided detailed summarize results

  • #2
    Paul Allison has some advice on this:

    https://statisticalhorizons.com/zero-inflated-models
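
    One common way to compare the candidate models in Stata is to fit each of them on the same sample and look at information criteria, together with the overdispersion test that nbreg reports. The sketch below is only an illustration: age and female are placeholder covariates that do not appear in the original post, and using the same variables in the inflation equation is likewise an assumption.

```stata
* Placeholder covariates: replace age and female with the actual predictors.

* Poisson, followed by goodness-of-fit tests
* (a significant result suggests the Poisson model fits poorly)
poisson g5r age female
estat gof
estimates store pois

* Negative binomial; the output includes an LR test of alpha = 0,
* that is, of the Poisson restriction of no overdispersion
nbreg g5r age female
estimates store nb

* Zero-inflated Poisson and zero-inflated negative binomial
zip g5r age female, inflate(age female)
estimates store zip_m

zinb g5r age female, inflate(age female)
estimates store zinb_m

* Compare AIC and BIC across the four models (lower is better)
estimates stats pois nb zip_m zinb_m
```

    The Vuong test was long used for the zero-inflated versus standard comparison, but its validity for that purpose has been questioned, so information criteria are a reasonable fallback. The same commands apply to g6r.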
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    WWW: https://www3.nd.edu/~rwilliam


    • #3
      Thanks very much. I appreciate the help. Cy
