
  • How to generate scores from a bifactor (bi-factor) model OR how to generate scores from a SEM model?

    I am comparing the scores from a few factor models. So far I have generated scores after an orthogonal and an oblique rotation, and those seem "fine". I recently heard about the bifactor model, which could be a good fit for my dataset.

    Note that I am not familiar with SEM (structural equation modelling) techniques and syntax, and I am also far from comfortable with all the nuances of conducting a CFA (confirmatory factor analysis), so any general advice in those regards is welcome.

    So, my main questions are:

    1) Is the following the correct syntax for a bifactor model?
    2) Are there any restrictions I have to put in place?
    3) After estimating the model, is it possible to generate scores for each factor (similar to those after running the "factor" command)?

    Note that with the current syntax, the model does not converge, either on the example dataset below or on my own data.


    Code:
    use https://www.stata-press.com/data/r18/audiometric, clear
    
    factor lft* rght*, factors(2) pcf
    rotate, ortho
    cap drop fc_pcf_1 fc_pcf_2
    predict fc_pcf_1 fc_pcf_2, b
    
    factor lft* rght*, factors(2)
    rotate, oblique promax(3)
    cap drop fct_obli_1 fct_obli_2
    predict fct_obli_1 fct_obli_2, bart
    
    corr fc_pcf_?
    corr fct_obli_?
    
    sem ( Left -> (lft*) ) ///
        ( Right -> (rght*) ) ///
        ( Bifct -> (lft* rght*) ), var(Left@1 Right@1)

    [Figure: bifactor.png] Diagram of the bifactor model (from: https://www.frontiersin.org/articles...020.01357/full)


    Very broadly, in my use case, I have 12 variables (6 mental and 6 physical) and I aim to check which scores are more predictive of diagnoses, and if a single factor is good enough or if having two factors is worth any added complexity.
    I became aware of the bifactor model through https://onlinelibrary.wiley.com/doi/10.1002/wps.21097 and would like to incorporate it in my analysis, although I don't really agree with the authors' conclusion; that is for another topic.
    Last edited by Marc Rain; 07 Dec 2023, 11:07.

  • #2
    I would strongly suggest that you familiarize yourself with the varieties and complexities of measurement modeling (EFA, CFA) before going much further. This is a difficult area of statistics. A nice introductory paper aimed at applied researchers is here. It was published in Personality Disorders: Theory, Research, and Treatment. There you will find a lot of references to get you started. The bifactor model, in particular, is an area of much controversy and discussion. The work of Steven Reise and colleagues is particularly noteworthy.

    In terms of your questions:

    1. No. Bifactor models specify 0 covariance between the "general" factor and any "specific" factors. The "specific" factors can be allowed to covary, but often researchers specify a diagonal covariance matrix for the latent variables in these models.

    2. Yes. See above. The Stata syntax to do this is the option covstructure(e._LEn _LEx, diagonal) in the sem statement.

    3. Yes. See help sem_predict
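
    Putting points 1 to 3 together, a minimal sketch on the audiometric example data from #1 might look like the following. The latent name Gen and the score names sc_* are illustrative assumptions, and the explicit cov(...@0) constraints are used here only because they spell out the zero covariances; they are intended to have the same effect as the diagonal covstructure() option for this model. Convergence is not guaranteed, and the difficult option may or may not be enough.

    ```stata
    use https://www.stata-press.com/data/r18/audiometric, clear

    * Bifactor sketch: one general factor (Gen, hypothetical name) loading on
    * all items, plus two specific factors loading on their own items only.
    * cov(...@0) fixes every covariance among the latent variables to zero.
    sem (Left  -> lft*)        ///
        (Right -> rght*)       ///
        (Gen   -> lft* rght*), ///
        cov(Left*Right@0 Left*Gen@0 Right*Gen@0) difficult

    * Factor scores, one new variable per latent variable (names are hypothetical)
    predict sc_gen sc_left sc_right, latent(Gen Left Right)
    summarize sc_gen sc_left sc_right
    ```

    By default sem identifies each latent variable by constraining one loading to 1, so no variance constraints are needed in this sketch.
    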



    • #3
      Note that the Watts et al. replication materials are available here and the supporting information is available here.
      http://publicationslist.org/eric.melse



      • #4
        Marc, in addition to Erik's concern about the bifactor model, I would like to refer you to this PsyArXiv preprint:
        Fried, E. I. (2023, July 9). No evidence for the existence of the d (isease) factor. https://doi.org/10.31234/osf.io/47avw
        in which the Brandt et al. World Psychiatry paper that you refer to in #1 is discussed and its argumentation is, frankly, repudiated by Eiko Fried: 'the authors did not discover the d factor: they created it.'
        http://publicationslist.org/eric.melse



        • #5
          Thank you, Erik and Eric, for your helpful suggestions. I'll definitely have to dig a bit in the literature.

          I was able to run the model with the following syntax. Note that it only converges when using the option "difficult".
          Code:
          sem ( Left -> (lft*) ) ///
              ( Right -> (rght*) ) ///
              ( Bifct -> (lft* rght*) ), covstructure(e._LEn _LEx,diagonal) difficult
          I am still exploring the options, but as a quick validation I calculated the average area under the ROC curve after running a simple logit model of selected health diagnoses on each of the health scores (results sorted from lowest to highest):
          Code:
          Mental health diagnoses (migraine depression sleep dementia )
          
          sc_mean_p1           = .615
          sc_mean_mcs_ortho    = .632
          sc_mean_mcs_def      = .651
          sc_mean_mcs_obli     = .671
          sc_mean_mcs_sep      = .701
          sc_mean_mcs_rmean    = .703
          sc_mean_p0           = .726
          
          Physical health diagnoses (diabetes cardio stroke jointpain backpain)
          
          sc_mean_p2           = .542
          sc_mean_pcs_sgl      = .715
          sc_mean_p0           = .720
          sc_mean_pcs_rmean    = .748
          sc_mean_pcs_sep      = .749
          sc_mean_pcs_obli     = .757
          sc_mean_pcs_ortho    = .757
          sc_mean_pcs_def      = .761
          
          * _def: Default method (SF12)
          * _obli: Oblique rotation (promax(2))
          * _ortho: Alternative orthogonal rotation
          * _rmean: Simple row mean
          * _sep: Two single factors loading only on each domain
          * _p1: First specific factor (of bifactor)
          * _p2: Second specific factor (of bifactor)
          * _p0: General factor (of bifactor)
          So it looks like the general factor (p0) works relatively well in both domains, physical and mental, though in the physical domain other scores perform better. The corresponding specific factors (p1, p2) are not very predictive (but I might be doing something wrong here). It is worth noting that the simple row mean of the input variables also performs surprisingly well. Another option that performs well is the pair of separate single factors.
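
          For reference, the validation described above can be sketched roughly as follows. This runs on simulated data so that it is self-contained; the variable names (score_a, score_b, diag1, diag2) are hypothetical stand-ins for the actual scores and diagnoses, and the numbers it prints have nothing to do with the results reported here.

          ```stata
          clear
          set seed 12345
          set obs 1000

          * Simulated stand-ins (hypothetical): two scores and two binary diagnoses
          generate score_a = rnormal()
          generate score_b = rnormal()
          generate byte diag1 = runiform() < invlogit(-1 + 0.8*score_a)
          generate byte diag2 = runiform() < invlogit(-1 + 0.5*score_a + 0.3*score_b)

          * For each score: logit of each diagnosis on the score, then average the AUCs
          foreach s of varlist score_a score_b {
              local total = 0
              local n = 0
              foreach d of varlist diag1 diag2 {
                  quietly logit `d' `s'
                  quietly lroc, nograph      // stores the area under the ROC curve in r(area)
                  local total = `total' + r(area)
                  local ++n
              }
              display "`s'" _col(20) "mean AUC = " %5.3f `total'/`n'
          }
          ```

          Because the AUC from a single-predictor logit depends only on how the score ranks observations, scores on different scales can be compared this way without standardizing them first.
          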

          Having said all that, since I do want to be able to compare the predictiveness of physical and mental health on other socio-economic outcomes, I might leave the bifactor aside and pick a solution that performs relatively well in both domains.

          -------------------------

          Regarding the Brandt et al. World Psychiatry paper, I also don't agree with their conclusion, but the idea of the bifactor model looked promising. In my setting, I was previously just "blindly" using the Mental and Physical Component Summaries based on the SF-12 methodology. As already explored in the literature, there are some issues due to the orthogonal rotation used in this methodology (see Taft et al., 2001, and others). That's why I started looking for an alternative, and that's also why I deeply appreciate your suggestions.


          Refs:
          - Taft et al (2001) Do SF-36 summary component scores accurately summarize subscale scores?



          • #6
            By the way, these are the types of artifacts I was finding with the default SF12 method of generating the scores based on an orthogonal rotation. With the oblique rotation, I don't get those, at least not as strongly. The MCS works well for "predicting" depression, but the PCS shows some distortions at extremes. The same happens, but to a lesser extent, with MCS when predicting a "physical health" diagnosis.

            [Figure: def_arts.png]

            [Figure: obli_arts.png]


            These are simple binscatter plots of "being diagnosed with depressive disorder" on each score, respectively.
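
            For anyone who wants to reproduce this kind of plot, a hand-rolled binned scatter using only built-in commands might look like the sketch below. It uses simulated data with hypothetical variable names (score, depressed) and is not necessarily identical to what a dedicated binscatter command does.

            ```stata
            clear
            set seed 2023
            set obs 2000

            * Simulated stand-ins (hypothetical): a health score and a binary diagnosis
            generate score = rnormal(50, 10)
            generate byte depressed = runiform() < invlogit(-(score - 45)/5)

            * Cut the score into 20 equal-sized bins and average within each bin
            xtile bin = score, nquantiles(20)
            collapse (mean) depressed score, by(bin)

            * One point per bin: share diagnosed against mean score
            twoway scatter depressed score, ///
                ytitle("Share diagnosed") xtitle("Mean score within bin")
            ```

            Note that collapse replaces the data in memory, so it is safer to wrap this in preserve and restore when working with a real dataset.
            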
            Last edited by Marc Rain; 09 Dec 2023, 07:49.
