  • GLM for unbalanced design

    Dear all,

    I'm comparing a count dependent variable (negative binomial distribution) between three treatments.

    Each treatment has a different n. The data are not clustered or collapsed (1 row = 1 patient).

    Does the GLM automatically account for differences in n between the groups, or should I specify it in the model?

    Apologies if this is a very basic question. I found a few "weight" options, but it seems they are meant for blocked data.

    Thank you so much for your time


  • #2
    You don't state which Stata commands you will be using (or have used) for estimating these models, but certainly with the official commands -glm-, -poisson-, or -nbreg-, it does not matter that you have a different sample size in each treatment group, and you don't have to modify the command in any way to accommodate it. The same is true if you are using the longitudinal or mixed-model commands -xtpoisson-, -xtnbreg-, -mepoisson-, -menbreg-, or -meglm-.
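
    As a rough illustration of the point above, the sketch below simulates three treatment groups of unequal size and fits -nbreg- with no adjustment for group size. Everything in it is an assumption made for illustration: the variable names treatment and number_parasites, the group sizes (20, 30, 40), the seed, and the parameters passed to rnbinomial().

    ```stata
    * Simulated data only: names, group sizes and parameters are hypothetical
    clear
    set seed 12345
    set obs 90

    * Three treatment groups of unequal size: 20, 30 and 40 patients
    generate treatment = 1 + (_n > 20) + (_n > 50)

    * One row per patient, with a negative binomial count outcome
    generate number_parasites = rnbinomial(2, 0.3)

    * Confirm the unequal group sizes
    tabulate treatment

    * Fit the model; no weights or other options are needed for unequal n
    nbreg number_parasites i.treatment
    ```

    The unequal group sizes show up only in the standard errors of the treatment coefficients, which is handled by the estimation itself.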



    • #3
      Dear Clyde,
      Thank you so much for your reply.

      This is the code that I have used:
      xi: glm number_parasites i.treatment, family(nbinomial) l(log)



      • #4
        Definitely no need to worry about the unequal treatment group sizes.

        Unsolicited advice: Unless you are using a very old version of Stata, avoid -xi:-. It has been almost entirely replaced by factor variable notation. Read -help fvvarlist-. If you just strip the -xi:- off the beginning of your command, what you have will be correct. (Leave the i. on treatment.) The problem with -xi- is that it blocks you from using the -margins- command following your estimation. If you want things like the marginal effect of treatment or the expected numbers of parasites in each treatment group, the easiest way to get that is with the -margins- command. Even if you don't want those things for this analysis, you will probably want to use -margins- in the future; it is one of Stata's best commands, in my opinion. And you can't use it after -xi-. So start now to almost forget you ever heard of -xi-. There are a few commands that don't allow factor variable notation, but they are mostly archaic commands whose functions have been incorporated into new commands that do support factor-variable notation. And there are some fairly exotic situations where you still need hand-generated indicator variables even so. But they are, as I say, exotic and rare.

        To learn about the -margins- command, I recommend starting with Richard Williams' excellent https://www3.nd.edu/~rwilliam/stats/Margins01.pdf. It is a crystal clear introduction to the command and has worked examples of the commonest applications. After that you can go on to the PDF documentation that comes installed with your Stata to learn about the more advanced features.
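
        A minimal sketch of the advice in this post, assuming the dataset from #3 is already in memory with the variables number_parasites and treatment. It is the command from #3 with -xi:- stripped off, followed by two common -margins- calls; which -margins- output is wanted is an assumption here.

        ```stata
        * Same model as in #3, without the xi: prefix (keep the i. on treatment)
        glm number_parasites i.treatment, family(nbinomial) link(log)

        * Expected number of parasites in each treatment group
        margins treatment

        * Marginal effect of treatment: difference in expected count vs. the base group
        margins, dydx(treatment)
        ```

        Note that with -glm-, family(nbinomial) without an argument holds the dispersion parameter at 1; family(nbinomial ml) or -nbreg- would estimate it from the data instead.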



        • #5
          Brilliant! Thank you so much, Clyde! Best wishes
