  • Test for trend in proportions - which procedure and what nomenclature to use

    I want to test for a trend across three levels of age with a dichotomous outcome. I understand that I can use regress, logistic, nptrend or ptrend (among many alternatives). If I use regress or logistic, which p-value should I report (the p-value for the variable being tested?), and what does it stand for: what "test" have I done?

    If I use nptrend, how should I describe the test I have done? The help file describes it as the nonparametric test for trend across ordered groups developed by Cuzick (1985), which is an extension of the Wilcoxon rank-sum test, which is a bit long. And the FAQ at http://www.stata.com/support/faqs/statistics/test-for-trend/ says nptrend is Pearson’s correlation coefficient.

    It is unclear to me how to report what statistical test I have used. And which method is recommended?

    Roland

  • #2
    Do you plan to consider the three levels of age as equally spaced? Do you plan to weight by sample sizes of each of the three levels (versus as-balanced)?

    With a dichotomous outcome, you've got choices as to what the trend is of: proportions, odds, log-odds, general location. Your post's title indicates that you're interested in a trend of proportions, yet you mention logistic regression and nonparametric methods.

    I've seen trend used to mean "the first (linear) component of a set of orthogonal contrasts", and as "A <= B <= C, with at least one strict inequality", among others. What meaning of "trend" makes most sense for your problem?
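
    To make the two meanings concrete, here is a sketch for three equally spaced groups. The symbols p_A, p_B, p_C (the group proportions) and L, Q (the contrasts) are notation assumed for illustration, not taken from any particular reference.

    ```latex
    % Meaning 1: first (linear) component of the orthogonal polynomial contrasts
    \[
      L = \tfrac{1}{\sqrt{2}}\,\bigl(-p_A + 0\cdot p_B + p_C\bigr),
      \qquad
      Q = \tfrac{1}{\sqrt{6}}\,\bigl(p_A - 2\,p_B + p_C\bigr),
      \qquad
      H_0 : L = 0
    \]
    % Meaning 2: an order-restricted alternative
    \[
      H_1 : p_A \le p_B \le p_C \quad \text{with at least one strict inequality}
    \]
    ```

    Under equal spacing the linear contrast compares only the two end groups, so it can be zero even when the ordered alternative holds or fails in the middle group. With unequal spacing, or with weighting by group size, the coefficients change.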

    I guess that what I'm trying to say is that what you should do depends upon the details of the question that you're trying to answer. So, I recommend putting aside the statistical test for the moment, and going back to the science in order to first formulate your question precisely. The choice of statistical test and how to report it should be clearer then. If you focus on the myriad statistical tests for trend at first (see the do-file below for a few examples), you're liable to be tempted to choose one for the wrong reasons.
    Code:
    clear *
    set more off
    set seed `=date("2014-09-26", "YMD")'
    
    input byte age_grp int total
    20  90
    40 270
    75  30
    end
    label define AgeGroups 20 "<30 y.o." 40 "30-59 y.o." 75 "60+ y.o."
    label values age_grp AgeGroups
    
    generate int count1 = rbinomial(total, _n * 0.075 + 0.1)
    generate int count0 = total - count1
    quietly reshape long count, i(age_grp) j(response)
    
    // Trend in odds (score test)
    tabodds response age_grp [fweight=count]
    
    // Trend in log-odds (Wald test)
    logit response i.age_grp [fweight=count], nolog
    contrast qw.age_grp, noeffects
    contrast q.age_grp, noeffects
    contrast pw.age_grp, noeffects
    contrast p.age_grp, noeffects
    
    // Trend in proportion (Wald test)
    vwls response i.age_grp [fweight=count]
    
    // or:
    glm response i.age_grp [fweight=count], family(binomial) link(identity) nolog
    contrast qw.age_grp, noeffects
    contrast q.age_grp, noeffects
    contrast pw.age_grp, noeffects
    contrast p.age_grp, noeffects
    
    // Location
    quietly expand count
    nptrend response, by(age_grp)
    
    exit
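
    As a rough illustration of the regress and logistic options raised in the question, the sketch below enters age as a single numeric score rather than as a factor. The counts and the variable name age_score are hypothetical, made up for this example, and the equally spaced scores 1, 2, 3 are an assumption that has to be justified by the science.

    ```stata
    clear
    // Hypothetical counts, for illustration only:
    // count1 = outcome present, count0 = outcome absent
    input byte age_score int count1 int count0
    1 15  75
    2 70 200
    3 10  20
    end
    reshape long count, i(age_score) j(response)

    // Logistic model with age as one numeric score: the z test on the
    // age_score coefficient is a Wald test of linear trend in log-odds
    logit response c.age_score [fweight=count], nolog

    // Linear probability model: the t test on the slope is a test of
    // linear trend in proportions (robust SEs because the outcome is binary)
    regress response c.age_score [fweight=count], vce(robust)
    ```

    In both models the p-value in question is the one on the age_score row, which tests a slope of zero. A description along the lines of "Wald test for linear trend in log-odds from a logistic regression, age groups scored 1 to 3" says what was done more precisely than "test for trend".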


    • #3
      Hello Sir,

      I am a new user of Stata software. Could you please assist me?

      Regards,
      Sruthi N R
