  • #16
    Can anyone inform me when one would choose a different calculation method for the starting values?
    The default method is factor, with randomid, randompr, and jitter being alternative methods.
    Moreover, what method would mimic the method used in MPlus?



    • #17
      Originally posted by Martijn Hogerbrugge View Post
      Can anyone inform me when one would choose a different calculation method for the starting values?
      The default method is factor, with randomid, randompr, and jitter being alternative methods.
      Moreover, what method would mimic the method used in MPlus?
      I do not know what the default method does, to be honest.

      With -randomid-, each observation is randomly assigned to one class. With -randompr-, I believe that each observation is randomly assigned a probability of being in each latent class. In the former case, we might say, let's imagine Mrs Jansen belongs to class 1, and Mrs De Jong belongs to class 2, and maximize from there. In the latter case, we might say that Mrs Jansen has a 60% probability of being in class 1, and Mrs De Jong has a 20% probability; let's maximize from there.

      My understanding is that Stata then calculates start values from the means within each class (the class assignments or probabilities being randomly generated, remember). I could be wrong on the specifics.
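A minimal sketch of the two random methods described above, plus a hand-rolled version of the "imagine Mrs Jansen belongs to class 1" idea. The dataset (gsem_lca1, with the items accident, play, insurance, and stock), the seed, the number of draws, and the variable name guess are all assumptions made for illustration and do not come from this thread. The last part only mimics the idea behind -randomid- by passing a random class guess through startvalues(classid); it is not a claim about what Stata does internally.

```stata
* Sketch only: dataset, seed, draws, and the name "guess" are assumptions.
webuse gsem_lca1, clear

* Built-in: each observation is randomly assigned to one class, 5 draws
gsem (accident play insurance stock <-, logit), lclass(C 2) ///
    startvalues(randomid, draws(5) seed(15))
estimates store sv_randomid

* Built-in: each observation gets random class probabilities, 5 draws
gsem (accident play insurance stock <-, logit), lclass(C 2) ///
    startvalues(randompr, draws(5) seed(15))
estimates store sv_randompr

* By hand: guess a class for every observation and start from that guess
set seed 15
generate byte guess = runiformint(1, 2)
gsem (accident play insurance stock <-, logit), lclass(C 2) ///
    startvalues(classid guess)
estimates store sv_byhand

* Compare log likelihoods; class labels may be switched between fits
estimates table sv_randomid sv_randompr sv_byhand, stat(ll)
```

If the three fits end at the same log likelihood, the choice of method probably did not matter for this model; if they differ, that suggests local maxima and argues for more draws.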

      My understanding is that -jitter- randomly generates start values. To be honest, they didn't explain this very well in the documentation.

      I don't use MPlus. I think I've glanced through some parts of their documentation. I would guess that -jitter- is probably how they generate random start values, but I'm not sure; if you can point us to documentation where they describe how they generate start values, that would be helpful.

      I honestly can't say for sure if a specific method is better. I think that you want to make sure that Stata explores a wide range of start values. I would think that any of the methods above would do it. That said, I have never tested this empirically. This might be worth a simulation study, come to think of it.
      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



      • #18
        Maybe the Stata team can provide additional help in their manual or on their support page?
        Information on the start values generated in MPlus can be found here: http://www.statmodel.com/download/Starts.pdf



        • #19
          Originally posted by Martijn Hogerbrugge View Post
          Maybe the Stata team can provide additional help in their manual or on their support page?
          Information on the start values generated in MPlus can be found here: http://www.statmodel.com/download/Starts.pdf
          My training is in applied stats, so I'm not a pure statistician. That said, when I read that page, it seems clear that MPlus is taking some start value and perturbing it with something like a uniform random number. When I read Stata's description of -jitter-, it sounds more like perturbing the start values with a normal random number, although you can specify the variance of that number. So, at first glance, I would guess from the descriptions that MPlus perturbs its start values more widely than Stata does, but I can't prove this. I had some correspondence with Stata before on latent class analysis, and I think they knew that the description of -jitter- was a bit imprecise.
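To make the uniform-versus-normal distinction concrete, here is a toy sketch that perturbs the same hypothetical start vector both ways. The vector b0, the uniform range of plus or minus 1, and the normal standard deviation of 1 are invented for illustration; this is not how either MPlus or Stata is actually implemented, and the real scales used by each program should be taken from their documentation.

```stata
* Toy illustration only: b0 and both perturbation scales are assumptions.
set seed 2024
matrix b0 = (0.5, -1.2, 2.0)    // hypothetical starting values
matrix bU = b0                  // copy to receive uniform perturbations
matrix bN = b0                  // copy to receive normal perturbations

forvalues j = 1/3 {
    scalar u = runiform(-1, 1)  // bounded: never moves more than 1 away
    scalar z = rnormal(0, 1)    // unbounded: occasionally moves far away
    matrix bU[1, `j'] = b0[1, `j'] + u
    matrix bN[1, `j'] = b0[1, `j'] + z
}

matrix list b0
matrix list bU
matrix list bN
```

Which scheme explores "more widely" depends entirely on the scales chosen: a uniform draw on (-a, a) has variance a^2/3, so a wide uniform can easily spread further than a narrow normal.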

          What follows is just me talking idly. Do note that if you are really determined, you could inspect the start values by specifying a couple options:

          Code:
          gsem (x1 x2 x3 <-, logit), lclass(C 2) startvalues(jitter(1 2), draws(1)) emopts(iterate(0)) noestimate
          matrix b = e(b)
          Not tested, but this should tell Stata to a) make only 1 draw, b) not estimate the model, and c) not run any EM iterations either (I think I tried this once, and if you don't specify 0 EM iterations, it will iterate the EM algorithm despite you saying noestimate). You could then run this repeatedly in a loop and store each set of start values, with estimates store or by saving the e(b) matrix.

          Also note that if you are really, really determined, you can feed Stata your own matrix of parameter start values. If you saved the e(b) matrix after the above (the second line of code), you could manipulate it to your heart's content.
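The loop idea above could look something like the following. Like the code it builds on, this is untested. The dataset (gsem_lca1 and its four items), the use of -randomid- rather than -jitter-, the seeds 1 to 5, and the matrix names b1 to b5 and allstarts are all assumptions for illustration. It also assumes that e(b) holds the start values after noestimate, as suggested above.

```stata
* Untested sketch: dataset, seeds, and matrix names are assumptions.
webuse gsem_lca1, clear
capture matrix drop allstarts

forvalues s = 1/5 {
    * one random draw per seed, no EM iterations, no estimation
    quietly gsem (accident play insurance stock <-, logit), lclass(C 2) ///
        startvalues(randomid, draws(1) seed(`s'))                        ///
        emopts(iterate(0)) noestimate
    matrix b`s' = e(b)                            // this draw's start values
    matrix allstarts = nullmat(allstarts) \ b`s'  // stack one row per draw
}

* Inspect how much the start values vary from draw to draw
matrix list allstarts

* Feed one set (optionally edited by hand) back in as starting values
gsem (accident play insurance stock <-, logit), lclass(C 2) from(b3)
```

Because from() matches on the column names of the matrix, it is safer to edit the values in a saved copy of e(b) than to build the matrix from scratch.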



          • #20
            Could anyone with experience in latent class analysis in Stata share possible ways of dealing with a violation of the local independence assumption? Does incorporating a covariance structure for all error variables associated with the observed endogenous variables in your model, like
            Code:
            covstructure(e._OEn, unstructured)
            solve the problem?



            • #21
              Originally posted by Jeff Pitblado (StataCorp) View Post
              The thorough researcher might use these results as starting values in a refit with additional Mplus style model identifying constraints.

              Note that in the following I use estimates store to store the results from both model fits for direct comparison using estimates table.

              Code:
              estimates store nonrtol
              matrix b0 = e(b)
              
              gsem (SmokedBefore13 ///
              DailySmoke ///
              DroveDrunk ///
              DrankBefore13 ///
              BingeDrink ///
              MarijuanaBefore13 ///
              CocaineEver ///
              GlueEver ///
              MethEver ///
              EcstasyEver ///
              SexBefore13 ///
              ManyPartners <- ) ///
              (4: SmokedBefore <- _cons@15) ///
              (4: CocaineEver <- _cons@-15) ///
              (5: CocaineEver <- _cons@-15) ///
              (3: GlueEver <- _cons@-15) ///
              (3: MethEver <- _cons@-15) ///
              (4: MethEver <- _cons@-15) ///
              (5: MethEver <- _cons@-15) ///
              (5: EcstasyEver <- _cons@-15) ///
              (3: ManyPartners <- _cons@15) ///
              , ///
              nocapslatent ///
              logit ///
              lclass(C 5) ///
              em(iter(0)) ///
              nodvheader ///
              from(b0)
              estimates store withCns
              
              estimates table nonrtol withCns, stat(ll)
              Jeff Pitblado (StataCorp), I just wanted to sincerely thank you! Your detailed advice has helped me a lot.

