Stumped: CFA not converging

Zach Rodgers

Join Date: Feb 2016

Posts: 11
#1

22 Mar 2018, 04:50

Hello Community:

I'm having issues getting a simple CFA model to converge: sem (LATENT -> observed1 observed2)

Sem gets stuck on a particular log likelihood value "(not concave)". After many iterations it concludes "Warning: convergence not achieved."

observed1 is a binary variable
observed2 is a categorical (ordinal) variable, from 1-5

I've tried standardizing the variables (z-scores), but that makes no difference. Dropping observations with missing values doesn't help either, and neither do the methods in [SEM] intro 12 (https://www.stata.com/manuals13/semintro12.pdf). Only one approach has worked so far, but it is so elaborate that I no longer trust the results: combining the "stand", "difficult", and "init" options.

What could be wrong that is causing it not to converge?

Many Thanks for your assistance!
Nick Cox

Join Date: Mar 2014

Posts: 35724
#2

22 Mar 2018, 04:56

Show us your data as

Code:

tabulate observed1 observed2
tabulate observed1 observed2, missing

Expecting a linear composite to underlie that may be optimistic.

z-scores are just a linear rescaling and won't make the data any more suitable for this than they were originally.
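
To see why standardizing cannot help: a z-score is a positive linear transformation, and the Pearson correlation is unchanged by such transformations. A minimal sketch, assuming observed1 and observed2 are in memory; z1 and z2 are hypothetical names for the standardized copies, not variables from the original post:

```stata
* z1 and z2 are hypothetical names for the standardized copies
egen z1 = std(observed1)
egen z2 = std(observed2)

* both commands should report the same correlation
correlate observed1 observed2
correlate z1 z2
```

If the two correlations agree, standardizing has given the model no new information to work with.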
Zach Rodgers

Join Date: Feb 2016

Posts: 11
#3

22 Mar 2018, 05:39

Thanks for the speedy response! I am trying to replicate a factor that other researchers constructed from data similar to mine.

Here are the data:

. tab observed1 observed2

           |                       observed2
 observed1 |         1          2          3          4          5 |     Total
-----------+-------------------------------------------------------+----------
      0 No |     1,087          9        971      2,349        386 |     4,802
     1 Yes |     1,770         35      1,607      3,665        747 |     7,824
-----------+-------------------------------------------------------+----------
     Total |     2,857         44      2,578      6,014      1,133 |    12,626

. tab observed1 observed2, missing

           |                             observed2
 observed1 |         1          2          3          4          5          . |     Total
-----------+------------------------------------------------------------------+----------
      0 No |     1,087          9        971      2,349        386          9 |     4,811
     1 Yes |     1,770         35      1,607      3,665        747         48 |     7,872
         . |        67          0        195        131          0         29 |       422
-----------+------------------------------------------------------------------+----------
     Total |     2,924         44      2,773      6,145      1,133         86 |    13,105

Last edited by Zach Rodgers; 22 Mar 2018, 05:44.

Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

22 Mar 2018, 06:10

Thanks for sharing the data. In essence, your two variables don't have enough in common to make a composite summary interesting or useful. They are practically uncorrelated. I don't know much at all about structural equation models, but a PCA shows this without fuss and indeed the correlation that is nearly zero predicts the futility of either exercise.

Small tips for anyone watching: pushing a small table of counts through tabi if there are no data in memory has the useful side-effect of leaving behind the equivalent dataset as new data when you finish. Also, Zach's tables are clear enough but a graph (tabplot from the Stata Journal) perhaps makes the lack of correlation (also lack of association) clearer.

For your wider project, whatever it is, both variables could be helpful but there is little or no scope for, and no gain from, searching for a composite, whether it's called latent or is just a pragmatic linear combination. That's my diagnosis.

Code:

. clear 

. tabi 1087 9 971 2349 386 \ 1770 35 1607 3665 747 

           |                          col
       row |         1          2          3          4          5 |     Total
-----------+-------------------------------------------------------+----------
         1 |     1,087          9        971      2,349        386 |     4,802 
         2 |     1,770         35      1,607      3,665        747 |     7,824 
-----------+-------------------------------------------------------+----------
     Total |     2,857         44      2,578      6,014      1,133 |    12,626 

          Pearson chi2(4) =  16.1575   Pr = 0.003

. 
. replace row  = row - 1
(10 real changes made)

. rename (row col) (observed1 observed2)

. expand pop 
(12,616 observations created)

. corr observed* 
(obs=12,626)

             | observ~1 observ~2
-------------+------------------
   observed1 |   1.0000
   observed2 |   0.0026   1.0000


. pca observed*

Principal components/correlation                 Number of obs    =     12,626
                                                 Number of comp.  =          2
                                                 Trace            =          2
    Rotation: (unrotated = principal)            Rho              =     1.0000

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |      1.00265    .00529494             0.5013       0.5013
           Comp2 |      .997353            .             0.4987       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors) 

    ------------------------------------------------
        Variable |    Comp1     Comp2 | Unexplained 
    -------------+--------------------+-------------
       observed1 |   0.7071    0.7071 |           0 
       observed2 |   0.7071   -0.7071 |           0 
    ------------------------------------------------

. tabplot observed* , showval bfcolor(none)

[Image: nothingdoing.png, the tabplot of observed1 against observed2]
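
With only two variables, the PCA output above can be checked by hand. A worked sketch, writing r for the correlation reported by corr (the symbol r is introduced here for illustration): the correlation matrix has eigenvalues 1 ± r, and the eigenvector entries are ±1/√2 ≈ 0.7071 whatever the data.

```latex
R = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},
\qquad
\lambda_{1,2} = 1 \pm r,
\qquad
v_{1,2} = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}

% with r \approx 0.00265 (shown as 0.0026 in the corr output):
\lambda_1 \approx 1.00265, \qquad
\lambda_2 \approx 0.99735, \qquad
\lambda_1 / 2 \approx 0.5013
```

So each component accounts for almost exactly half of the total variance, which is what two uncorrelated variables would give.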

Last edited by Nick Cox; 22 Mar 2018, 06:27.


Nick Cox

Join Date: Mar 2014

Posts: 35724
#5

22 Mar 2018, 10:45

I don't know what these variables are, but there is a hint that grade 2 on observed2 is very unusual. Why would people (?) avoid it or not admit it?
Zach Rodgers

Join Date: Feb 2016

Posts: 11
#6

23 Mar 2018, 04:34

Awesome, Thanks so much for the insight, Nick! Very helpful. I'll see if I can reach the authors of the studies that used my variables as a factor.

I thought that a CFA would still converge with poor or even zero correlation, just with poor fit statistics (RMSEA, etc.). But it sounds like poor correlation can actually prevent convergence entirely?

Is it possible that the authors of the other study put enough other variables in the model that a correlation between observed1 and observed2 emerged? I'm assuming that the correlation is calculated independently of any other variables. If non-zero correlation is a precondition for a CFA model to converge, it's hard for me to imagine how their factor could have been estimated.
Zach Rodgers

Join Date: Feb 2016

Posts: 11
#7

23 Mar 2018, 04:37

P.S. Funny how much more intuitive it is to view tabplot than the raw table outputs.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#8

23 Mar 2018, 04:47

I haven't studied the internals of sem but my guess is that it's using a very general algorithm and you're presenting it with an extreme case. There is almost nothing to find in your example and the command is having a hard time finding it.
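
One way to picture "almost nothing to find" is to write out what a one-factor model implies for two indicators. This is a sketch under assumptions, not a description of sem's internals: both indicators are treated as continuous, the first loading is fixed at 1 (Stata's default anchoring), and the symbols for the second loading, the factor variance, and the two error variances are notation introduced here.

```latex
% Model-implied covariance matrix of (observed1, observed2)
% \lambda_2: second loading, \phi: factor variance,
% \theta_1, \theta_2: error variances (all hypothetical notation)
\Sigma(\lambda_2, \phi, \theta_1, \theta_2) =
\begin{pmatrix}
\phi + \theta_1 & \lambda_2 \phi \\
\lambda_2 \phi  & \lambda_2^2 \phi + \theta_2
\end{pmatrix}
```

The latent variable enters the fit only through the off-diagonal term, which has to reproduce a sample covariance that is almost zero. There are also four free parameters but only three distinct variances and covariances, so many parameter combinations fit equally well. A likelihood surface that flat would be consistent with the repeated "(not concave)" messages.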

I really can't speculate helpfully about another study and the other variables in that study given zero information on either. But in principle it is possible that conditionally on other variables more structure could be found.

My prejudice is that the fancier the models in a sub-field, the less skill and attention are devoted to looking at the data carefully, a step that is wrongly dismissed or even deprecated as too elementary to engage many researchers. Conversely, although you have named nothing, this looks loosely like social survey data with a big sample, so weak relationships are typical, yet the sample size allows significance at conventional levels.

It's salutary that the chi-square test in #4 shows a P-value many researchers would be delighted to see, but the association is very weak. Yet again, if the variables were named, it is possible that a slight difference between groups is exactly what would make sense to researchers and they just want to quantify it.
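
The weakness of the association can be quantified with Cramér's V, which tabulate reports through its V option. A minimal sketch, assuming the data built in #4 are still in memory:

```stata
* chi2 requests the Pearson chi-square test; V adds Cramér's V
tabulate observed1 observed2, chi2 V
```

With the figures shown in #4, V is the square root of 16.1575 / 12,626, or about 0.036, because the smaller table dimension minus one is 1. That is a long way from the upper limit of 1, despite the small P-value.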

Last edited by Nick Cox; 23 Mar 2018, 05:20.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#9

23 Mar 2018, 04:48

#7 Not funny to me! I've devoted a fair fraction of my career to trying to get people to look at their data....
