
  • Statistical test for comparing proportion within a time slot and over time

    Hello,
    I have count data by year for three independent regions (A, B, C), and I would like to compare the proportions (prop = Count/Population) across the three regions within a year (e.g. 2011) and over time (e.g. 2011-2015) using a statistical test. Any help with this analysis in Stata would be highly appreciated.

    Thanks

    A sample of the data is as follows:

    region year Count Population prop
    A 2011 35 252 0.14
    A 2012 40 274 0.15
    A 2013 45 290 0.16
    A 2014 46 302 0.15
    A 2015 47 320 0.15
    B 2011 34 150 0.23
    B 2012 44 183 0.24
    B 2013 50 203 0.25
    B 2014 55 231 0.24
    B 2015 63 257 0.25
    C 2011 29 317 0.09
    C 2012 34 323 0.11
    C 2013 36 339 0.11
    C 2014 38 347 0.11
    C 2015 45 357 0.13

  • #2
    Khiaja:
    welcome to the list.
    For the future, please use -dataex- to post excerpts/examples of your dataset (see -search dataex- to install). Thanks.
    As far as your first question is concerned, you may want to try:
    Code:
    * please note that -region- has been transformed into numerical (1 = A, 2 = B, 3 = C)
    input region year Count Population prop
    1 2011 35 252 0.14
    1 2012 40 274 0.15
    1 2013 45 290 0.16
    1 2014 46 302 0.15
    1 2015 47 320 0.15
    2 2011 34 150 0.23
    2 2012 44 183 0.24
    2 2013 50 203 0.25
    2 2014 55 231 0.24
    2 2015 63 257 0.25
    3 2011 29 317 0.09
    3 2012 34 323 0.11
    3 2013 36 339 0.11
    3 2014 38 347 0.11
    3 2015 45 357 0.13
    end
    
    
    . poisson Count i.region if year==2011, exposure(Population)
    
    Iteration 0:   log likelihood = -7.9890231 
    Iteration 1:   log likelihood = -7.9890231 
    
    Poisson regression                              Number of obs     =          3
                                                    LR chi2(2)        =      12.78
                                                    Prob > chi2       =     0.0017
    Log likelihood = -7.9890231                     Pseudo R2         =     0.4444
    
    ------------------------------------------------------------------------------
           Count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          region |
              2  |   .4898063    .240797     2.03   0.042     .0178528    .9617597
              3  |  -.4175249   .2511059    -1.66   0.096    -.9096835    .0746337
                 |
           _cons |  -1.974081   .1690309   -11.68   0.000    -2.305375   -1.642787
    ln(Popula~n) |          1  (exposure)
    ------------------------------------------------------------------------------
    A possible answer to your second query is -xtpoisson-.
    Kind regards,
    Carlo
    (Stata 18.0 SE)
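
    The over-time comparison mentioned above could be set up along the following lines. This is only a sketch, assuming the 15-row dataset from the -input- block is in memory with region coded 1 to 3; the variable t is a hypothetical name introduced here, and whether a linear trend, year indicators, or a fixed-effects panel model suits the real data is a judgment call.

    ```stata
    * Assumption: the 15 observations above are in memory (region: 1 = A, 2 = B, 3 = C)
    * t is a hypothetical helper variable: years since 2011
    generate t = year - 2011

    * Pooled Poisson on all years: region differences adjusted for a linear trend
    poisson Count i.region c.t, exposure(Population) irr

    * Joint test that the three regions share the same rate
    testparm i.region

    * Do the regional trends differ? Test the region-by-time interaction
    poisson Count i.region##c.t, exposure(Population) irr
    testparm i.region#c.t

    * Panel version with -xtpoisson-, treating region as the panel unit
    xtset region year
    xtpoisson Count c.t, exposure(Population) fe irr
    ```

    With the irr option the coefficients are shown as incidence-rate ratios, which here compare proportions (Count/Population) between regions or per year.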



    • #3
      Thank you Carlo for your help. What do you think about the goodness-of-fit test (estat gof) for Poisson regression?
      I also tried negative binomial regression (nbreg), but it does not seem to make more sense, as the outcome is a proportion and the real data are more dispersed.



      • #4
        Khoaja:
        with -poisson-, overdispersion is the big issue.
        -nbreg- makes sense if -estat gof- shows statistical significance.
        I find it difficult to reply more positively without seeing what you typed and what Stata gave you back (as per FAQ).
        Kind regards,
        Carlo
        (Stata 18.0 SE)
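
        The check described in this post might look like the sketch below. It assumes the 15-row example dataset is in memory and uses all years, so that residual degrees of freedom are left for the goodness-of-fit test; the choice of i.year as the time specification is an assumption, not something stated in the thread.

        ```stata
        * Assumption: all 15 observations are used (7 parameters, 8 residual df)
        poisson Count i.region i.year, exposure(Population)

        * Goodness-of-fit test after -poisson-; a significant result points to
        * lack of fit, with overdispersion as one possible cause
        estat gof

        * If -estat gof- is significant, refit with -nbreg- and inspect the
        * LR test of alpha=0 reported at the foot of its output
        nbreg Count i.region i.year, exposure(Population)
        ```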



        • #5
          Thanks Carlo. This is the Stata syntax and output:
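          nbreg Count i.region if year==2011, exposure(Population)

          Negative binomial regression                    Number of obs     =          3
                                                          LR chi2(2)        =       7.51
          Dispersion     = mean                           Prob > chi2       =     0.0234
          Log likelihood = -7.9890235                     Pseudo R2         =     0.3197

          ------------------------------------------------------------------------------
                 Count |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                region |
                    B  |    .489806   .2407963     2.03   0.042     .0178538    .9617581
                    C  |  -.4175267   .2511053    -1.66   0.096    -.9096841    .0746308
                       |
                 _cons |  -1.974075   .1690304   -11.68   0.000    -2.305369   -1.642782
          ln(Popula~n) |          1  (exposure)
          -------------+----------------------------------------------------------------
              /lnalpha |  -18.46364   1866.657                     -3677.043    3640.116
          -------------+----------------------------------------------------------------
                 alpha |   9.58e-09   .0000179                             0           .
          ------------------------------------------------------------------------------
          LR test of alpha=0: chibar2(01) = 0.0e+00              Prob >= chibar2 = 0.500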



          • #6
            Khoaja:
            as per FAQ, please post what you typed and what Stata gave you back within CODE delimiters (things are difficult to read in your last post). Thanks.
            That said, the issue here is that you have attempted inference on a sample that is far too small (3 observations).
            No wonder your results are bewildering; they are simply unreliable.
            Try to increase your sample size or any inference will be meaningless.
            Kind regards,
            Carlo
            (Stata 18.0 SE)
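
            To illustrate the sample-size point: with year==2011 there are 3 observations and the model estimates 3 parameters (_cons plus two region indicators), so it simply reproduces the observed proportions, and -nbreg- adds a fourth parameter (alpha) on top of that. The arithmetic below is a sketch using the 2011 rows of the example data; the results should match the coefficients reported earlier in the thread up to rounding.

            ```stata
            * _cons: log of the 2011 proportion in region A (35/252)
            display ln(35/252)

            * region B coefficient: log of the ratio of B's proportion to A's
            display ln((34/150)/(35/252))

            * region C coefficient: log of the ratio of C's proportion to A's
            display ln((29/317)/(35/252))
            ```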
