
  • Ordered logit LCA Model with gsem

    Dear statalist,

    I am trying to run an LCA model with gsem using four variables (three are binary, one is ordered with four categories). None of the variables has a strikingly skewed distribution or any outliers.
    My sample contains 1020 observations.
    When I use the command stated below, Stata displays the error message "initial values not feasible".

    This is my code:

    Code:
    gsem (sit imsty_rc remit partner <-), logit lclass(C 3) vsquish nodvheader noheader nolog
    Is this a problem I can solve with a faster CPU, or does the error message indicate that there are problems with my data? Since I couldn't find problems in the data by checking the distributions and looking for outliers, what are other possible sources of this error?

    Best,
    Franziska

  • #2
    First, you are actually treating all the indicators as binary. This code corrects that, although I'm taking a guess as to which variable is ordinal.

    Code:
    gsem (sit remit partner <- logit) (imsty_rc <- ologit), lclass(C 3) vsquish nodvheader noheader nolog
    Assuming no data errors (which I'd recommend checking), I haven't encountered this error message in the LCA context very frequently. Basically, Stata estimates a set of initial parameter values, then it uses its EM algorithm to iterate to a better solution from there, then it switches to the usual quasi-Newton method for iteration after a set number of EM iterations. Stata apparently couldn't iterate past its chosen initial parameter values. I don't know why this could be. However, I believe this code will override the default start values:

    Code:
    gsem (sit remit partner <- logit) (imsty_rc <- ologit), lclass(C 3) vsquish nodvheader noheader nolog startvalues(randomid, draw(20)) emopts(iterate(10))
    Here, Stata randomly assigns each observation to one class, then I think it iterates to the maximum likelihood solution from the starting average values of each class. It will repeat this process 20 times, then choose the best solution (i.e. after the 10 EM iterations, it will choose the solution with the highest log likelihood; note that by default the EM algorithm runs for 20 iterations unless it converges earlier, but the EM algorithm is relatively slow).

    Give the first set of code a shot without further modification and see if your problem disappears. If not, try adding the next set of code.
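
    A minimal sketch of the data check recommended above, assuming the four variable names from the original post. It only lists missing values and category counts; which patterns count as a problem (for example, an empty or nearly empty category) is a judgment call.

    ```stata
    * Assumption: variable names are taken from the original post
    * Count missing values in each indicator
    misstable summarize sit remit partner imsty_rc

    * One-way tables, including missing values, to spot empty or sparse categories
    foreach v of varlist sit remit partner imsty_rc {
        tabulate `v', missing
    }

    * Cross-tabulate the ordinal indicator against a binary one to look for empty cells
    tabulate imsty_rc sit, missing
    ```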
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Dear Weiwen,

      thank you so much for your help.
      You were right with your first suggestion. However, the code you provided treats "logit" and "ologit" as variables. As I guess that's not what you intended, I fixed the problem with ordinal and binary variables by using "mlogit" as an option.
      Your second suggestion seems to be the solution I was looking for. Stata is running faster now and actually produces a result.

      Now my code is:
      Code:
      gsem (sit remit partner imsty_rc <-),  mlogit lclass(C 3) vsquish nodvheader noheader nolog startvalues(randomid, draw(20)) emopts(iterate(10))
      Last edited by Franziska Spanner; 25 Sep 2018, 02:04.



      • #4
        Franziska,

        My code is wrong because it omitted commas. It should look something like:

        Code:
        gsem (sit remit partner <-, logit) (imsty_rc <-, ologit), lclass(C 3) vsquish nodvheader noheader nolog startvalues(randomid, draw(20)) emopts(iterate(10))
        As you clearly know, if you use -mlogit- on a binary variable, that's the same as specifying -logit-. Nonetheless, the code above is the one that actually lets you use a different model for each variable. You can use any statistical model supported by -gsem-, e.g., -regress- for continuous variables, or -poisson- or -nbreg- for count variables.

        I hope this further assists you in your goal. Do note that as you increase the number of latent classes, it will probably help to increase the number of random start value draws (i.e. increase the 20 in draw(20) above). Other programs conduct as many as 100 draws from random starting values, as far as I know.
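
        A rough sketch of that advice, assuming the variable names from this thread and run from a do-file (the /// line continuation does not work at the interactive prompt). The draw counts and the seed are illustrative values only, and the seed() suboption is written as I understand the documented startvalues() syntax, so check help gsem before relying on it.

        ```stata
        * Sketch only: draw counts and seed are illustrative, not recommendations
        * Two-class model with 20 random starts
        gsem (sit remit partner <-, logit) (imsty_rc <-, ologit), lclass(C 2) ///
            nolog startvalues(randomid, draws(20) seed(12345)) emopts(iterate(10))
        estimates store c2

        * Three-class model with more random starts
        gsem (sit remit partner <-, logit) (imsty_rc <-, ologit), lclass(C 3) ///
            nolog startvalues(randomid, draws(50) seed(12345)) emopts(iterate(10))
        estimates store c3

        * Compare the stored models by log likelihood, AIC and BIC
        estimates stats c2 c3

        * Class membership probabilities and class-specific means for the last model fit
        estat lcprob
        estat lcmean
        ```

        Setting a seed makes the random starts reproducible, so you can tell whether a change in the log likelihood comes from adding draws rather than from a different random assignment.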