
  • Latent class analysis: Stata vs SAS

    Dear Stata users,

    I hope this post is not off topic (if so, I am sorry). I implemented a latent class analysis using PROC LCA, a SAS procedure. The analysis is based on a questionnaire, addressed to a sample of visual artists, composed of 10 yes-no questions. I did the same analysis with the new latent class analysis feature available in Stata 15 (the -gsem- command). Surprisingly, I obtained different results from the two packages. To understand the reason for this discrepancy, I ran a latent class analysis in SAS using the simple example dataset provided by Stata (example 50g in the manual: https://www.stata.com/manuals/sem.pdf). Again, I obtained different results for the probabilities of the binary items, even though the interpretation of the resulting 2 classes is quite similar.

    What is the reason for this discrepancy? Is it possible that the probabilities reported by the two packages have different interpretations? SAS directly reports the item-response probabilities conditional on latent class membership. Stata reports the estimated coefficients of the multinomial logit and then, through the command -estat lcmean-, the marginal predicted means of each item within each latent class, which I assume are equal to the item-response probabilities calculated by SAS (but they aren't). A minimal sketch of the syntax I used is at the end of this post.

    Can anyone enlighten me on this? Thank you
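
    For reference, here is a minimal sketch of the Stata syntax I used (the item names q1-q10 are placeholders for my actual yes-no variables):

    Code:
    * two-class LCA on 10 binary items (placeholder variable names)
    gsem (q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 <- ), logit lclass(C 2)

    * marginal predicted means, which I take to be the item-response probabilities
    estat lcmean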

  • #2
    I believe that PROC LCA is not a stock SAS procedure but a user-written command courtesy of the Penn State University Methodology Center. That aside, the differing results may stem from different maximization criteria; the next few paragraphs explain more thoroughly.

    Specifically, I believe the answer may lie in different convergence criteria. As you may know, in almost all estimation routines we try to maximize the likelihood function, often by iteratively approximating it using some variant of Newton-Raphson. This implies a convergence criterion: if the value of the log likelihood changes by less than a set amount from iteration to iteration, which is roughly equivalent to the first derivative being essentially zero, the software declares convergence. Stata's criterion is something like a change of less than 1 * 10^-7 (I'll write this as 1e-7). Other software may set this at a different level, or it may use the same criterion.

    Stata also applies a second criterion. I'm very much approximating this explanation in English, because I don't fully understand the underlying math. However, Stata also requires that a quantity based on the gradient and the second derivatives of the log likelihood be less than a certain amount (I think the default is 1e-5). I am unclear whether other software applies this criterion.
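
    If you want to experiment, both criteria can be set explicitly through the standard -maximize- options; as I understand it, they correspond to ltolerance() and nrtolerance(). A hedged sketch, again with placeholder item names:

    Code:
    * sketch: the same model with both convergence criteria set explicitly
    * (ltolerance = change in log likelihood; nrtolerance = the gradient-based criterion;
    *  the values shown are my understanding of the defaults -- see help maximize)
    gsem (q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 <- ), logit lclass(C 2) ///
        ltolerance(1e-7) nrtolerance(1e-5)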

    I think the rationale for the second criterion is to reduce the chance of declaring convergence at a maximum that is local but not global. For many likelihood functions this is irrelevant; for complex problems like latent class analysis, it becomes very relevant. I am not sure whether PROC LCA or its Stata equivalent imposes the second criterion, but I am pretty sure both have an option to apply multiple random start values, maximize the likelihood from each, and then report the highest log likelihood at which convergence was consistent. This may well be PROC LCA's default. It is definitely not Stata's default, but you can make Stata impose a similar routine. By coincidence, this poster ran into what is probably the same issue.

    Likelihood maximization in latent class models is pretty complex because of the phenomenon of local maxima. From my general reading, and from some testing I've done using the PSU Stata plugin, it seems that Stata's maximization criteria may differ from those of many other packages that fit latent class models. In my work, I've used the random-draws option (in post #6 at my link) with 4 or more latent classes. It may be worth using it with as few as 2 or 3 classes.
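
    A hedged sketch of what I mean; if memory serves, the relevant option is startvalues() with random starting values and a draws() suboption, but check -help gsem- for the exact syntax before relying on it:

    Code:
    * sketch (from memory): try 20 sets of random starting values, keep the best solution
    gsem (q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 <- ), logit lclass(C 2) ///
        startvalues(randomid, draws(20) seed(15))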

    To the rest of your questions: I haven't used PROC LCA myself. Semantically, it seems like the SAS item-response probabilities should equal -estat lcmean-; -estat lcprob- is the predicted prevalence of each latent class in Stata. The item-response probabilities reported by each package should have similar interpretations. Do note that the order in which classes are reported from one package to another, or even within the same package if you try to replicate your model, is not guaranteed to be the same. For example, class #1 in SAS could differ from class #1 in Stata. Perhaps you already know this, but it can certainly result in some confusion. Stata has no easy facility to re-order your latent classes; if you want, I can give a solution that has usually worked for me.
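
    As a rough sketch of what I mean (postestimation commands after your -gsem- fit; match classes across packages by comparing the profiles, not the class numbers):

    Code:
    * after the gsem fit above
    estat lcprob    // predicted proportion of the sample in each latent class
    estat lcmean    // marginal predicted mean of each item within each class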
    Be aware that it can be very hard to answer a question without sample data. You can use the -dataex- command for this; type help dataex at the command line.

    When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Thank you so much, Weiwen, for your reply! Very useful and clear. My doubt was whether there is a different interpretation of the estimated parameters (but, as you have confirmed, the SAS item-response probabilities should equal Stata's marginal mean probabilities) or something else. The different convergence criteria are a reasonable explanation.



      • #4
        Andrea Baldin
        I would not expect the item probabilities to be equal, or even comparable, to the observed means for the latent classes in Stata. What I would expect to be comparable are the estimated parameters in the two models (e.g., SAS item probabilities and Stata coefficients). That said, one potential source of differences could be the way the models are parameterized in each package. Stata tends to approach these types of models from the Generalized Linear Latent and Mixed Model parameterization. If you fit an IRT model in Stata, transformations are needed to present the parameter estimates in the same form that an IRT-parameterized model would present them. In the SEM/IRT manual entries, the difference in parameterization is described as an implementation of the slope/intercept parameterization.

        The means that you would use after fitting an LCA/LTA provide some context for understanding what is different about each of the groups, based on each record's estimated class membership.
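
        As a small, purely hypothetical numeric illustration of the transformation point: a logit intercept reported by Stata for an item within a class is on the log-odds scale, so it has to be passed through the inverse logit before it resembles an item probability.

        Code:
        * hypothetical value: a logit intercept of 1.2 for some item in some class
        display invlogit(1.2)    // = exp(1.2)/(1 + exp(1.2)), about .77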

