  • Group based trajectory modelling on unbalanced panel data in long format - traj command

    Thanks in advance for help. I have unbalanced panel data in long format (repeated body mass index (BMI) measurements, taken at different ages) for ~100,000 individuals. I'm seeking to examine the trajectory of weight/BMI gain with age, identify distinct trajectories, and examine factors associated with them. I've been struggling to find out whether the traj command (group-based trajectory modelling) is suitable for this, specifically whether it handles unbalanced data and whether it can be run on data in long format. Guidance would be appreciated.

  • #2
    Welcome to Statalist.

    From within Stata 15.1, the command search traj returns no information about a traj command or function. Nor does search trajectory.

    Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    Section 12.1 is particularly pertinent:

    12.1 What to say about your commands and your problem
    ...
    If you are using community-contributed (also known as user-written) commands, explain that and say where they came from: the Stata Journal, SSC, or other archives. This helps (often crucially) in explaining your precise problem, and it alerts readers to commands that may be interesting or useful to them.



    • #3
      I Googled the term group-based trajectory model, and it appears to relate to latent class models. Stata's latent class model command doesn't appear to support distal outcomes, unfortunately (I could be wrong!). However, that's a pretty advanced technique. It's likely that a simpler hierarchical linear model would suffice. You'd incorporate a random intercept for each person, and if you're interested in trajectories, you'd incorporate a random slope for time. Type
      Code:
      help mixed
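      As a rough sketch of the random-intercept, random-slope model described above, and not a definitive specification: the variable names id, age and bmi are hypothetical, since no sample data were posted. mixed works directly on long-format data and does not require a balanced panel.

      ```stata
      * Hypothetical variable names: id (person), age (age at measurement), bmi (outcome)
      * Random intercept and random slope for age, allowed to correlate
      mixed bmi c.age || id: age, covariance(unstructured)

      * Adding a quadratic term in age allows curvature in the average trajectory
      mixed bmi c.age##c.age || id: age, covariance(unstructured)
      ```

      With ~100,000 individuals this may take a while to converge, so it could be worth trying it on a random subsample of people first.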
      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



      • #4
        Thanks for the replies. It's a Stata plugin from http://www.andrew.cmu.edu/user/bjones/traj (more details at https://pdfs.semanticscholar.org/fc5...ac48fdc430.pdf) for estimating group-based trajectory models, which I believe are a type of latent class model, as you say. The simpler hierarchical linear (or quadratic/cubic) models, i.e. growth curve models, in which I've incorporated a random intercept and a random slope, have the limitation of assuming one trajectory, whereas this technique has the advantage of identifying distinct clusters of trajectories from the data. Hence it would be useful for my purposes.



        • #5
          Very interesting plugin. I wasn't aware of it. From casual inspection, this would require wide format. You probably already know that you can reshape long to wide data using the native reshape command.
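          A minimal sketch of that reshape, assuming hypothetical variable names id, age and bmi. Because the measurements were taken at different ages, there is no natural common time index, so the sketch numbers each person's measurements in age order and reshapes on that counter.

          ```stata
          * Hypothetical variable names: id (person), age (age at measurement), bmi
          * Number each person's measurements in age order: 1, 2, 3, ...
          bysort id (age): gen visit = _n

          * One row per person, with columns bmi1 bmi2 ... and age1 age2 ...
          * People with fewer measurements get missing values in the later columns
          reshape wide bmi age, i(id) j(visit)
          ```

          Any other time-varying variables would need to be listed in the reshape as well (or dropped first), otherwise reshape will complain that they are not constant within id.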

          The plugin appears to be able to handle observations "missing at random", and observations not missing at random through a time-lagged model for dropout probability. I put missing at random in quotes because I'm not clear whether they meant missing completely at random or missing at random conditional on observed covariates in the model (which is the modal definition of MAR that I'm familiar with). But it does sound like the plugin can handle your unbalanced panel, and you'll have to reshape your data to accommodate it.

          I can't speak to the theoretical basis for this package, being merely an applied statistician. But it does look like the authors covered the bases they needed to. This package supports a limited number of models, but the censored normal model would probably work for you: just specify a lower limit of 0 (BMI can't go below 0) and an arbitrarily high upper limit. The package was written before Stata natively implemented latent class/profile models and finite mixture models. From what I can tell, the native LCA command won't support a distal outcome, which is a shortcoming I hope that Stata can address. It doesn't appear that the finite mixture model commands will support longitudinal analysis either, so that may be another gap for Stata to fill. Also, I have seen latent class analysis with distal outcomes that vary among classes done in Mplus, which really specializes in this sort of work.
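          For what it's worth, here is a sketch of what a censored normal call might look like. The option names are as I recall them from the plugin's documentation, so please check help traj after installing. The variable names (bmi1, bmi2, ... and age1, age2, ... in wide format), the upper limit of 100 and the choice of three quadratic groups are all assumptions for illustration, not recommendations.

          ```stata
          * Install from the URL given in #4 (assumes the site is still live)
          net from http://www.andrew.cmu.edu/user/bjones/traj
          net install traj, force

          * Assumes wide data in which only the BMI columns start with "bmi"
          * and only the age columns start with "age" (hypothetical names)
          * model(cnorm): censored normal; min() and max() are the censoring limits
          * order(): one polynomial order per group, so three entries = three groups
          traj, var(bmi*) indep(age*) model(cnorm) min(0) max(100) order(2 2 2)

          * Plot the estimated group trajectories
          trajplot
          ```

          The number of groups and the polynomial orders would normally be chosen by fitting several candidate models and comparing them, for example on BIC.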

          I have to push back on one technical point: models with random slopes don't assume one trajectory. They do allow every person to have their own trajectory, and they do estimate that trajectory. And if you read through example 6 (heteroskedastic random effects) in the mixed command manual, there is a framework for seeing if the random slopes/intercepts differ on the basis of observable covariates. So, if this will do for your needs, it might be worth considering!
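          To illustrate the point about person-specific trajectories, here is a sketch that recovers each person's predicted slope after mixed. The variable names id, age and bmi are hypothetical, and the new variable names re1, re2 and slope_i are made up for the example.

          ```stata
          * Hypothetical variable names: id, age, bmi (long format)
          mixed bmi c.age || id: age, covariance(unstructured)

          * Best linear unbiased predictions of each person's random effects
          predict re*, reffects

          * Check the variable labels to confirm which is the slope and which the intercept
          describe re*

          * Person-specific slope = fixed slope + random slope
          * (assumes re1 is labelled as the random effect for age)
          gen slope_i = _b[age] + re1
          ```

          The distribution of slope_i across people (for example with histogram) gives a first impression of how much trajectories differ, though it will not sort people into discrete groups the way a group-based model does.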



          • #6
            Thanks for the guidance. Points taken. I agree regarding the technical point. I meant that it assumes one trajectory for the population, whereas the other technique seems to identify subgroups within the population that differ significantly in their trajectory, which would be useful for my purposes. Unfortunately I don't have much knowledge of Mplus, but from reading, it does seem useful for this type of work.
