  • Finite Mixture Model (FMM): initial values of mixing proportions and standard deviations of components

    Dear Stata listers,

    There are many options to control the maximization of a finite mixture model (FMM), but I am somewhat lost as to whether it is possible to:
    - control the initial values of the mixing proportions;
    - control the standard deviations of the components?

    Suppose I want to work through the example in the manual:
    Code:
    use http://www.stata-press.com/data/r15/stamp
    fmm 3: regress thickness
    but control the initial values of the mixing proportions of the components, setting them to (.50, .25, .25),
    and the standard deviations of the components, setting them to (3, 4, 4).
    Which option can be used, and how are the parameters then set?
    I suspect the startvalues() option is the one to use, but I am lost as to how to proceed.

    My objective is to be able to cycle through a series of different settings, save the BIC of each run, and then look for the best possible solution.
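
    For illustration, a minimal sketch of the kind of loop I have in mind; here it only varies the number of components, but the start-value settings I want to try would be cycled over in the same way:
    Code:
    use http://www.stata-press.com/data/r15/stamp, clear
    forvalues g = 2/4 {
        quietly fmm `g': regress thickness
        estimates store fmm`g'
    }
    * estimates stats lists the AIC and BIC of each stored model
    estimates stats fmm2 fmm3 fmm4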
    http://publicationslist.org/eric.melse

  • #2
    In general, you can do that using the -from- option. One clarification: when you want to set the initial "standard deviation of components to: (3,4,4)", do you mean the variance of the error terms of the components?

    For the class proportion, you'll have to calculate the appropriate multinomial logistic intercepts (_cons below) to iterate from. My algebra is failing me this morning and we are leaving for a camping trip, so I will leave that to others. For the error variance, you can modify the start values directly.

    Code:
    use http://www.stata-press.com/data/r15/stamp
    fmm 3, emopts(iterate(0)) noestimate: regress thickness
    estimates table
    
    ---------------------------
        Variable |   active    
    -------------+-------------
    1b.Class     |
           _cons |  (omitted)  
    -------------+-------------
    2.Class      |
           _cons |  3.324e-16  
    -------------+-------------
    3.Class      |
           _cons | -.00619197  
    -------------+-------------
    thickness    |
           Class |
              1  |  .07253086  
              2  |  .08088272  
              3  |   .1047764  
    -------------+-------------
    var(e.thic~s)|
           Class |
              1  |  8.298e-06  
              2  |  8.708e-06  
              3  |   .0000946  
    ---------------------------
    
    
    matrix b = e(b)
    matrix list b
    b[1,9]
             1b.Class:       2.Class:       3.Class:     thickness:     thickness:
                    o.                                           1.             2.
                _cons          _cons          _cons          Class          Class
    y1              0      3.324e-16     -.00619197      .07253086      .08088272
    
            thickness:             /:             /:             /:
                    3. var(e.thic~s)  var(e.thic~s)  var(e.thic~s)
                Class        1.Class        2.Class        3.Class
    y1       .1047764      8.298e-06      8.708e-06       .0000946
    
    
    /*Now, modify the matrix entries directly, e.g.*/
    matrix b[1,7] = 3
    matrix b[1,8] = 3
    matrix b[1,9] = 4
    
    fmm 3, from(b): regress thickness
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Originally posted by Weiwen Ng
      ...

      For the class proportion, you'll have to calculate the appropriate multinomial logistic intercepts (_cons below) to iterate from. My algebra is failing me this morning and we are leaving for a camping trip, so I will leave that to others. For the error variance, you can modify the start values directly.

      ...
      With a bit of help from Wolfram Alpha, you can derive the multinomial logistic intercepts for the corresponding probabilities. In a multinomial logistic model,

      P(Y = 1) = 1 / [1 + exp(B2) + exp(B3)]
      P(Y = 2) = exp(B2) / [1 + exp(B2) + exp(B3)]
      P(Y = 3) = exp(B3) / [1 + exp(B2) + exp(B3)]

      Setting P(Y = 1) = .5 and P(Y = 2) = P(Y = 3) = .25 gives exp(B2) = P(Y = 2)/P(Y = 1) = .5, and likewise exp(B3) = .5, so the correct intercepts for betas 2 and 3 are both ln(.5) = -ln(2). You would then modify the matrix as follows:

      Code:
      matrix b[1,2] = -ln(2)
      matrix b[1,3] = -ln(2)
      matrix b[1,7] = sqrt(3)
      matrix b[1,8] = sqrt(3)
      matrix b[1,9] = sqrt(4)
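      More generally, for any target class proportions (with class 1 as the base class), the same formulas give the intercepts directly; a minimal sketch, using the proportions from the original question:
      Code:
      local p1 = .50
      local p2 = .25
      local p3 = .25
      * intercepts for classes 2 and 3, relative to the base class 1
      display "B2 = " ln(`p2'/`p1') "   B3 = " ln(`p3'/`p1')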
      This verifies that the start probabilities were correctly calculated:

      Code:
      fmm 3, from(b) emopts(iterate(0)) noestimate: regress thickness
      estat lcprob, nose
      Latent class marginal probabilities             Number of obs     =        485
      
      --------------------------------------------------------------
                   |     Margin
      -------------+------------------------------------------------
             Class |
                1  |         .5
                2  |        .25
                3  |        .25
      --------------------------------------------------------------
      And the model converges. Here is a comparison with the same model estimated without supplying start values:

      Code:
      quietly fmm 3, from(b): regress thickness
      estimates store startval
      quietly fmm 3: regress thickness
      estimates store default
      estimates table default startval
      
      ----------------------------------------
          Variable |  default      startval   
      -------------+--------------------------
      1b.Class     |
             _cons |  (omitted)    (omitted)  
      -------------+--------------------------
      2.Class      |
             _cons |  .64106963    2.5603988  
      -------------+--------------------------
      3.Class      |
             _cons |  .81015376    3.1767619  
      -------------+--------------------------
      thickness    |
             Class |
                1  |  .07121827    .12497285  
                2  |  .07860159    .10121457  
                3  |   .0988789    .07619875  
      -------------+--------------------------
      var(e.thic~s)|
             Class |
                1  |  1.713e-06    .00001788  
                2  |  5.736e-06    .00008651  
                3  |  .00019667    .00002156  
      ----------------------------------------
      So, in this case, we get different and possibly wrong results. The last iteration of the model with the supplied start values was backed up. Note that -di e(converged)- will still return 1, indicating convergence, but you should double check things in a case like this. That said, this is how one would modify the start values if you have prior knowledge of, or a substantive concern about, what they should be.
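
      Since both sets of results were stored above, a quick way to put the two fits side by side on information criteria (the kind of BIC comparison asked about in #1) is:
      Code:
      estimates stats default startval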



      • #4
        Dear Weiwen,

        Thank you very much for your elaborate and helpful posts.

        After your first reply, I tried another, possibly somewhat arcane, method this weekend to derive the required multinomial logistic intercepts for the corresponding probabilities, so as to be able to assign a weight to each of the FMM components:
        Code:
        clear all
        set obs 100
        gen prob=1
        gen level=1
        replace level=2 if _n>50
        replace level=3 if _n>75
        mlogit level prob, nocons
        * This is about the same as what you provided in your second post,
        * and I am happy to see it confirmed:
        * This verifies that the start probabilities were correctly calculated:
        fmm 3, from(b) emopts(iterate(0)) noestimate: regress thickness
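        For completeness, here is one way the mlogit results could then be carried into the start-value vector b from #2; this is only a sketch, and it assumes matrix b is still in memory and that these lines are run right after the mlogit call (before the fmm call above):
        Code:
        * copy the mlogit coefficients (the implied intercepts) into b
        matrix b[1,2] = [2]_b[prob]
        matrix b[1,3] = [3]_b[prob]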
        Again, I really appreciate your explanation.

        PS I was referring to (the start value of) the standard deviation of the components, and not the variance of the error terms of the components.
        http://publicationslist.org/eric.melse



        • #5
          Originally posted by ericmelse
          ...

          PS I was referring to (the start value of) the standard deviation of the components, and not the variance of the error terms of the components.
          Eric, I am glad my response was helpful!

          The topic of the error variance versus the variance of the components is one I'm still learning about. Because I am not fully versed in the SEM world, I honestly have no idea why it's called the error variance. In this context, though, I don't see any values reported for the variance of the components. I believe the variance of the components is related to the error variance, though I can't explain why. Can anyone shed some light?

