  • proportion as a dependent variable

    Hi all,

    I have data where the dependent variable is a proportion that was created by taking the mean of a few other proportions. There are no values of exactly 0, and 28% of the values are exactly 1.

    I have read about running a generalized linear model (glm) (from: http://www.ats.ucla.edu/stat/stata/faq/proportion.htm) & I have also read Baum's 2008 Stata Journal article about modeling proportions.

    I'm happy to run the glm model. However, I'm wondering whether there are any alternative models that I should also try with this type of data (& that can also be implemented in Stata)?

    Any thoughts would be much appreciated.

    Carrie

  • #2
    Hi, I just had a follow-up question to this one. Say I have two variables: one that is a proportion and one that is nominal. Is there a test for significance that I can run where I do not specify which variable is the dependent variable? Something like ANOVA, but where one of the variables is a proportion.
    Thanks in advance for any thoughts!


    • #3
      How many categories does the nominal variable have? If only 2, my initial impulse is to run pwcorr. Or even ANOVA or ttest. I suppose there might be some sort of violation of assumptions if you do that -- you could compare it to the p value you get from your glm model to see if it matters much. My experience is that t-tests seem to work fine even when assumptions are violated (e.g. the dependent variable is a dichotomy), at least when the sample is big, but you should do some double-checks to see if that seems to be true in your case.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://academicweb.nd.edu/~rwilliam/


      • #4
        Thank you Richard. The nominal variable has 5 categories.


        • #5
          Unless somebody has a better idea, I would probably use glm with the proportion as the dependent variable and the nominal variable as the independent variable, and then report the F or chi-square value and/or the p-value for it. I'd be curious to see if the results were very different from what ANOVA gave you. Maybe somebody else knows how to do exactly what you originally asked, but I don't.

          Then again, I am not sure I would do this at all. Why not just report p values in your full model rather than a bunch of bivariate p values?
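
          A rough sketch of that comparison, assuming placeholder variable names (prop for the proportion and group for the 5-category variable; neither name comes from an actual dataset):

          Code:
          * glm with a logit link, nominal variable as the only predictor
          glm prop i.group, family(binomial) link(logit) vce(robust) nolog
          * joint Wald test that all the group coefficients are zero
          testparm i.group
          * one-way ANOVA on the same two variables, for comparison
          oneway prop group, tabulate
          The testparm chi-square and the ANOVA F test both address the overall null hypothesis of no differences between groups, so their p-values can be set side by side.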
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology


          • #6
            Hi Richard, thanks a lot for taking the time to reply. I like the idea of reporting the overall value for glm. I think I was not clear about my problem originally. Essentially, I'm thinking of summarizing the data as follows:

            Group A 0.55
            Group B 0.62
            Group C 0.79
            Group D 0.84
            Group E 0.79

            The raw data are proportions. So for example, .55 is the MEAN proportion across everyone in group A.

            I would like to see whether there are overall statistically significant differences between the 5 groups, and then compare the different categories with each other, for example see if Group A is statistically significantly lower than Group E.


            • #7
              That is easy enough to do. Adapting the earlier UCLA example,

              Code:
              use http://www.ats.ucla.edu/stat/stata/faq/proportion, clear
              gen ed = round(parented)
              glm meals i.ed , link(logit) family(binomial) robust nolog
              testparm i.ed
              pwcompare ed, pv
              Look at the help for pwcompare and contrast, as there are different ways to do contrasts (e.g. you could do contrasts with the grand mean or adjacent categories), and you might want to do a bonferroni adjustment or something like that given that you are mass-producing contrasts.
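
              As a sketch of those options, run after the same glm command (the option and operator names are as I understand them from the pwcompare and contrast help files, so check them against your version):

              Code:
              * pairwise comparisons with Bonferroni-adjusted p-values
              pwcompare ed, pveffects mcompare(bonferroni)
              * each category compared with the previous (adjacent) category
              contrast ar.ed
              * each category compared with the grand mean
              contrast g.ed
              With 5 categories there are 10 pairwise comparisons, so the Bonferroni adjustment multiplies each unadjusted p-value by 10 (capped at 1).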

              Hopefully you have Stata 13, as I don't remember when the pwcompare and contrast commands were introduced. Here is the output in case your version of Stata does not support the above commands:

              Code:
              . use http://www.ats.ucla.edu/stat/stata/faq/proportion, clear
              
              . 
              . gen ed = round(parented)
              (164 missing values generated)
              
              . 
              . glm meals i.ed , link(logit) family(binomial) robust nolog
              note: meals has noninteger values
              
              Generalized linear models                          No. of obs      =      4257
              Optimization     : ML                              Residual df     =      4252
                                                                 Scale parameter =         1
              Deviance         =  810.5198708                    (1/df) Deviance =  .1906209
              Pearson          =  801.6671472                    (1/df) Pearson  =  .1885388
              
              Variance function: V(u) = u*(1-u/1)                [Binomial]
              Link function    : g(u) = ln(u/(1-u))              [Logit]
              
                                                                 AIC             =  .8199845
              Log pseudolikelihood = -1740.336979                BIC             = -34720.55
              
              ------------------------------------------------------------------------------
                           |               Robust
                     meals |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                        ed |
                        2  |  -.9727155    .155438    -6.26   0.000    -1.277368   -.6680626
                        3  |  -2.571115    .154565   -16.63   0.000    -2.874056   -2.268173
                        4  |  -4.228895   .1611694   -26.24   0.000    -4.544781   -3.913009
                        5  |  -4.752488   .4585684   -10.36   0.000    -5.651266   -3.853711
                           |
                     _cons |    2.25915   .1532424    14.74   0.000     1.958801      2.5595
              ------------------------------------------------------------------------------
              
              . 
              . testparm i.ed
              
               ( 1)  [meals]2.ed = 0
               ( 2)  [meals]3.ed = 0
               ( 3)  [meals]4.ed = 0
               ( 4)  [meals]5.ed = 0
              
                         chi2(  4) = 4447.14
                       Prob > chi2 =    0.0000
              
              . 
              . pwcompare ed, pv
              
              Pairwise comparisons of marginal linear predictions
              
              Margins      : asbalanced
              
              -----------------------------------------------------
                           |                            Unadjusted
                           |   Contrast   Std. Err.      z    P>|z|
              -------------+---------------------------------------
              meals        |
                        ed |
                   2 vs 1  |  -.9727155    .155438    -6.26   0.000
                   3 vs 1  |  -2.571115    .154565   -16.63   0.000
                   4 vs 1  |  -4.228895   .1611694   -26.24   0.000
                   5 vs 1  |  -4.752488   .4585684   -10.36   0.000
                   3 vs 2  |  -1.598399   .0329367   -48.53   0.000
                   4 vs 2  |  -3.256179   .0563035   -57.83   0.000
                   5 vs 2  |  -3.779773    .432989    -8.73   0.000
                   4 vs 3  |   -1.65778   .0538464   -30.79   0.000
                   5 vs 3  |  -2.181374   .4326764    -5.04   0.000
                   5 vs 4  |  -.5235932   .4350794    -1.20   0.229
              -----------------------------------------------------
              If you are stuck with some horribly primitive version of Stata, the simplest thing might be to just keep rerunning the glm and changing the reference category each time. You will get all the pairwise contrasts that way. You could also use test commands, e.g.

              test 2.ed = 3.ed
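
              A minimal sketch of both approaches, assuming your Stata at least supports factor-variable notation (the ib#. base-level operator; I believe that means Stata 11 or later):

              Code:
              * rerun the model with category 2 as the reference category
              glm meals ib2.ed , link(logit) family(binomial) robust nolog
              * or keep the original model (category 1 as reference) and test one pair
              glm meals i.ed , link(logit) family(binomial) robust nolog
              test 2.ed = 3.ed
              * lincom also reports the size of the difference and its standard error
              lincom 3.ed - 2.ed
              The lincom estimate should correspond to the "3 vs 2" row of the pwcompare output above.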
              Last edited by Richard Williams; 22 Jun 2014, 20:22. Reason: Wrong code was posted before.
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology


              • #8
                Thank you so much, Richard! I will try this.


                • #9
                  Hope it works. Incidentally, Statalist etiquette is to use real names. I don't think we have ever banned somebody who refused to do so, but people who do use real names are probably more likely to get help. I also think there can be some professional benefit in that people can become more aware of you and your work. If you want to change your user id you can write to the forum administrators; or, you can keep your id but attach a signature, like I do with my messages.
                  -------------------------------------------
                  Richard Williams, Notre Dame Dept of Sociology


                  • #10
                    Thanks a lot, Richard. I've emailed the administrators to change my ID to my full name. Apologies for not knowing about that initially.
