
  • Difference in trajectory group percentages on trajplot and with tab command

    Hello statalisters,

    I have a question about Stata's traj plugin for group-based trajectory models (installed from here: https://www.andrew.cmu.edu/user/bjones/).

    Consider the following code:
    Code:
    use http://www.andrew.cmu.edu/user/bjones/traj/data/montreal_sim.dta, clear
    traj, model(cnorm) var(qcp*op) indep(age*) order(1 3 2) min(0) max(10)
    trajplot, xtitle("Age") ytitle("Opposition")
    The trajplot command produces a graph reporting that trajectory 1 is 30.8%, trajectory 2 is 46.1% and trajectory 3 is 23.0% of the sample.

    Now if you run:
    Code:
    tab _traj_Group
    It gives 29.99% for trajectory 1, 47.25% for trajectory 2 and 22.76% for trajectory 3, which differs from the plot. The difference is not large, but it is disturbing (and it is even larger in my own dataset).
    Computing the percentages by hand from the frequencies reported by tab gave, unsurprisingly, the same percentages as tab itself.

    Do you get different percentages too? Does somebody know why? Are calculations on the plot not using the same numbers or formula? What did I miss here?

    Thank you in advance for your help.

    Maud

    PS I use Stata/SE 14.2 for Mac.


  • #2
    Maud, I did get different percentages. However, I think this is expected. In latent class or finite mixture modeling (and this command appears to be a subset of finite mixture modeling), you are estimating the probability that each observation falls into each of the latent classes. We can either keep this probabilistic assignment to classes, or we can assign people to their most likely class or modal class: the class that they had the highest predicted probability of being in.

    I wasn't able to find this detail in their documentation, but the plugin creates multiple variables. There appears to be one variable per class, named _traj_ProbGk, where k is the class number. Summarize those probabilities, and I am very confident their means will line up with what is displayed on the plot. Those are your model-based probabilities.

    The variable you identified would very likely be the modal class indicator. And yes, it is definitely expected that the proportions based on modal class assignment will not be exactly in line with the model-based proportions. Some classification diagnostics for latent profile models in general are based on how much discrepancy exists between the modal class proportions and the model-based ones.
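
    The comparison described above can be sketched as follows. This is a minimal sketch, not the plugin authors' documented procedure: it assumes the plugin leaves the posterior probabilities in _traj_ProbG1 to _traj_ProbG3 and the assigned group in _traj_Group, as the post above suggests. The names maxprob and modal are hypothetical helper variables introduced only for this illustration.

    ```stata
    * Refit the model from post #1
    use http://www.andrew.cmu.edu/user/bjones/traj/data/montreal_sim.dta, clear
    traj, model(cnorm) var(qcp*op) indep(age*) order(1 3 2) min(0) max(10)

    * Model-based shares: the mean of each posterior probability
    * (assumed variable names _traj_ProbG1 ... _traj_ProbG3)
    summarize _traj_ProbG1 _traj_ProbG2 _traj_ProbG3

    * Shares after assigning each person to a single group
    tabulate _traj_Group

    * Check that _traj_Group really is the modal (most likely) group
    egen double maxprob = rowmax(_traj_ProbG1 _traj_ProbG2 _traj_ProbG3)
    generate byte modal = .
    forvalues k = 1/3 {
        replace modal = `k' if _traj_ProbG`k' == maxprob & missing(modal)
    }
    tabulate modal _traj_Group
    ```

    If the reasoning above holds, the means from summarize (multiplied by 100) should be close to the percentages on the plot, while the cross-tabulation should put every observation on the diagonal.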

    One of the basic ones is entropy, which a few of us discussed on this thread. I've generally heard that entropy should not be used as a diagnostic index of fit, i.e. it shouldn't be used to select the number of classes/trajectories, but it can be used to demonstrate how well-separated the classes are. I've heard values of over 0.7 suggested as a 'good' amount of separation. Kathryn Masyn discusses diagnostic issues in her excellent chapter on finite mixture modeling (it's cited in the Stata version 15 SEM manual in the example on latent profile modeling). I believe that this plugin basically does latent profile modeling on a set of longitudinal measurements, using each measurement as an indicator. I think that Stata 15 users interested in re-creating this example could just run a latent profile model from the SEM code.
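
    For illustration, one commonly used relative entropy measure is 1 - sum_i sum_k( -p_ik * ln(p_ik) ) / ( N * ln(K) ), where p_ik is person i's posterior probability of belonging to group k. A rough sketch of that calculation, assuming the three-group model from post #1 has just been run and that the posterior probabilities are stored in _traj_ProbG1 to _traj_ProbG3 (ent_i is a hypothetical helper variable):

    ```stata
    * Per-person entropy contribution: -sum over k of p_ik * ln(p_ik)
    generate double ent_i = 0
    forvalues k = 1/3 {
        * skip exact zeros, where p*ln(p) is taken to be 0
        replace ent_i = ent_i - _traj_ProbG`k' * ln(_traj_ProbG`k') if _traj_ProbG`k' > 0
    }

    * Relative entropy: 1 = perfectly separated groups, 0 = no separation
    quietly summarize ent_i
    display "Relative entropy = " 1 - r(sum) / (r(N) * ln(3))
    ```

    The divisor ln(3) reflects the three groups requested by order(1 3 2); it would need to change with the number of groups.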
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Hi Weiwen, thanks a lot for your answer, I will definitely have a look at your links.



      • #4
        Hi,
        Can anyone please provide some guidance on data preparation for the group-based trajectory model? I could not find any documentation on how to prepare the data, or a detailed explanation of how to run the models. Thanks in advance.



        • #5
          Originally posted by Baker Chowdhury
          Hi,
          Can anyone please provide some guidance on data preparation for the group-based trajectory model? I could not find any documentation on how to prepare the data, or a detailed explanation of how to run the models. Thanks in advance.
          I believe the data need to be shaped wide, which is not the usual way Stata prefers to handle data. Try

          Code:
          help reshape
          for some examples of the command. If you need further assistance, it's better to start a new topic so that it can be found more easily by searching. This helps future users with the same problem.
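
          As a rough sketch of the wide layout mentioned above, the toy example below reshapes a small long-format dataset so that each person occupies one row. The data and the variable names id, wave, y and age are made up for illustration and do not come from the plugin's documentation.

          ```stata
          * Hypothetical long-format data: one row per person per wave
          clear
          input id wave y age
          1 1 2 10
          1 2 3 11
          1 3 5 12
          2 1 0 10
          2 2 1 11
          2 3 1 12
          end

          * One row per person: y1 y2 y3 and age1 age2 age3
          reshape wide y age, i(id) j(wave)
          list
          ```

          After the reshape, the outcome and age variables could be passed to the plugin through wildcards, in the same way as var(qcp*op) and indep(age*) in the first post of this thread.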
