  • FMM lcprob variables

    Hello experts,

    In FMMs (finite mixture models), the main model(s) can have certain IVs. I can then add variables to lcprob to specify what determines the probability of being in each class. For example, the Stata help document says that total medical expenditure (DV) could be predicted by gender, age, and income. The basic model uses only those IVs. Later, it says that with this specification we are assuming the prior probability of being in each class is the same for all individuals. However, it would make more sense to include the "total number of chronic issues" each person has in the "lcprob" part of the model.

    Now, my question is: what are the criteria for deciding that a variable should be in the main model rather than in the lcprob part of the model? In other words, in the example above, "total number of chronic issues" could instead have been used as one of the IVs in the main model.

    I hope my question is clear.

    Thanks in advance
    Last edited by Iman Kent; 28 Dec 2018, 16:36.

  • #2
    Did I ask my question at a wrong place or something?


    • #3
      Originally posted by Iman Kent
      Did I ask my question at a wrong place or something?
      What do you mean by this?

      In general, nobody is full-time staff at Statalist. We're all independent researchers. We help each other out if and when we can. If the question is clear and somebody knows the answer, you often get a prompt response, but you aren't guaranteed to get one, and nobody is obliged to provide one. On top of that, it's near the end of the year, and people may already be on holiday or otherwise engaged. Bumping your post within 3 hours is not generally considered to be good form whatever the time of the year, because of what I said just before this sentence.

      Your question is actually a good one, and it is fairly clear given that I've read the FMM manual (but note: not everyone here may be familiar with FMMs, as they are a bit of an esoteric model). I don't think there is a clear, unequivocal criterion, and I'll explain why.

      In finite mixture models (and latent class models, which are a subset of FMMs), we are saying that we think that f[E(Y)] = XB (to put things really generally), and that the betas vary across k latent classes. You're describing examples 1a and 1b of the FMM manual. In that example, we are assuming that the relationship of income, age, gender, and total chronic conditions to log medical expenditures varies across latent classes. In 1a, we found 3 latent classes: low, medium, and high spenders.

      In example 1b, we are considering what could predict latent class membership, treating that as a categorical variable (and thus, you are basically fitting a multinomial regression model; the covariates in the lcprob option are the predictors in that model). What could predict if you're a low, medium, or high spender? Well, total chronic conditions is one good guess. I suspect there are others.
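
      One way to write down the two parts is sketched below. The notation is my own, not the manual's: x_i holds the covariates in the main regression, z_i holds a constant plus any covariates in lcprob, K is the number of latent classes, and class 1 is taken as the base outcome of the multinomial logit.

      ```latex
      % Mixture density: class-specific regressions weighted by class probabilities
      f(y_i \mid x_i, z_i) = \sum_{k=1}^{K} \pi_k(z_i)\, f_k\!\left(y_i \mid x_i;\ \beta_k, \sigma_k\right)

      % Class probabilities: multinomial logit in the lcprob covariates
      \pi_k(z_i) = \frac{\exp(z_i'\gamma_k)}{\sum_{j=1}^{K} \exp(z_i'\gamma_j)},
      \qquad \gamma_1 = 0
      ```

      If lcprob contains no covariates, z_i is just the constant, so \pi_k is the same for every individual. A variable in x_i changes the expected outcome within a class; a variable in z_i changes how likely an individual is to belong to each class.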

      You may object that total chronic conditions was already used in predicting log medical expenditures in each latent class. You'd be right, but this is OK. You don't have to include the predictor of class membership in the regression on log medical expenditures, or whatever your dependent variable is. But, it seems that you can. Whatever relationship you're interested in predicting, you can and should use any predictors you can think of.

      When you are looking at predictors for which latent class each observation is in, I would guess that any predictors that make substantive sense are allowable. I guess the only thing I can really add is that you should look at your latent classes, and assess what they are, in a substantive sense. In some cases, I bet you will find that you have covariates that clearly go into the multinomial part but not into the regular regression. In this case, total chronic conditions seems to be a legitimate predictor for both the main regression on medical expenditures, and the class membership regression.
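
      As a rough sketch, the two specifications could be fit and compared along these lines. The dataset URL and the variable names (lmedexp, income, age, totchr, sex) are my assumptions based on the manual's example, so check them against your own copy of the manual before running this.

      ```stata
      * Sketch only: the dataset URL and variable names below are assumptions
      use https://www.stata-press.com/data/r15/mus03sub, clear

      * totchr enters the outcome regression only;
      * class probabilities are the same for everyone
      fmm 3: regress lmedexp income c.age##c.age totchr i.sex
      estimates store main_only

      * totchr also predicts class membership through lcprob()
      fmm 3, lcprob(totchr): regress lmedexp income c.age##c.age totchr i.sex
      estimates store with_lcprob

      * compare the two fits on AIC and BIC
      estimates stats main_only with_lcprob

      * for the model fit last: marginal class probabilities and class means
      estat lcprob
      estat lcmean
      ```

      Information criteria can help judge whether the lcprob covariate improves the fit, but they do not replace the substantive judgment about what the classes represent.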

      I know that this answer is not very specific. For all I know, everything I said above was something you already knew. If so (or even if not), this might underscore why you didn't get an immediate response. There are no clear criteria. You have to use your substantive judgment.
      Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

      When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
