  • Longitudinal Factor Analysis

    Dear members,

    I am currently writing my master's thesis and attempting to create a composite index variable that measures 'sustainable tourism'. Factor analysis seems like a good method to use, but I'm having difficulty doing this with longitudinal data. My data set consists of 110 countries, 5 observed variables (expected to form 1 factor) and 5 years (non-consecutive: 2008, 2009, 2011, 2013, and 2015). I use Stata version 14.0.

    I have been able to perform factor analysis per year using -factor- (i.e. 5 separate times), but this results in 5 unique sets of factor loadings. As a result, the index scores per country are not comparable across years, since the loadings used to create the index differ each period. I'd prefer to perform factor analysis on the full data set, i.e. on the longitudinal data, because this would incorporate the time element of the data. I have been searching the internet for solutions (the -gllamm- package or dynamic factor analysis by Frederici) but have been unsuccessful so far. So my questions are:

    - Is there a method for this in Stata?
    - If not, is it possible/allowed to use the average of the individual factor loadings?

    I hope I have provided you with the necessary information, but if not, please let me know.
    Thank you in advance for your help, it is greatly appreciated!

    Kind regards,
    Joelle

  • #2
    I don't know the details, but I would have thought -gllamm- would be able to do this, as it is a general framework for (generalized) latent variable models, and the factors and a random coefficient on time (if that is how you choose to deal with the longitudinal nature of the data) are both 'latent' variables.
    However -gsem- should also be able to do this (indeed even -sem- if you can get your data in wide form (balanced)) and it will presumably be quicker, and more intuitive to use. Even simpler, you could have each time point as a fixed effect.
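    The wide-form -sem- route mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the year variable name Year is hypothetical, Panel is taken to be the country identifier, and only two of the five years are used to keep the command short. The labels a2-a5 constrain each indicator's loading to be equal in both years, which is what would make the resulting scores comparable over time.

    ```stata
    * Assumes long data: one row per country-year, country id Panel,
    * and a year variable called Year (hypothetical name).
    keep Panel Year StringEnv EPI HealthCap GovPrior SustInd
    keep if inlist(Year, 2008, 2015)
    reshape wide StringEnv EPI HealthCap GovPrior SustInd, i(Panel) j(Year)

    * One factor per year. A shared label (a2, a3, ...) forces the same
    * loading for that indicator in both years. The loading of the first
    * indicator in each equation is fixed at 1 by default.
    sem (StringEnv2008 EPI2008@a2 HealthCap2008@a3 GovPrior2008@a4 SustInd2008@a5 <- ST2008) ///
        (StringEnv2015 EPI2015@a2 HealthCap2015@a3 GovPrior2015@a4 SustInd2015@a5 <- ST2015)
    ```

    The same pattern extends to all five years by adding one equation per year with the same labels, although with 110 countries a model that size may be hard to estimate.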



    • #3
      Hi Matthew,

      Thanks for your response!

      I've looked into the -gllamm- and -gsem- syntax again, but I'm afraid I don't understand how you get factor loadings from these commands.
      I have tried the following:

      gllamm StringEnv EPI HealthCap GovPrior SustInd, i(Panel)

      gsem (ST <- StringEnv EPI HealthCap GovPrior SustInd)

      I'm probably interpreting the models incorrectly, but they seem to me to have two stages, first finding the latent variable and then using this in regression. I would only need the first step, to create a latent variable and find the corresponding factor loadings.

      Do you know which commands I would need to do this?

      Thanks



      • #4
        The syntax for gsem/sem is pretty straightforward. A single factor measurement model is about the most basic there is - sem (x1 x2 x3 x4 <- X) is literally example 1 in the 46 model examples they give in the manual. The coefficients on the X for x1-x4 are the factor loadings. I suspect you are overthinking how to apply your data. You didn't mention before about using the latent variable in regression. That said, it is possible (and easy) with -sem- to have the latent variable 'in between' some indicators for the latent variable and an outcome variable, which you wish to regress on the latent variable, and anything else you wish.

        The longitudinal aspect definitely introduces a complication, but examples 38g-42g show how to add random effects. It *should* be straightforward, I think. It might be helpful to read through the examples and some of the other sem basics in the manual.
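        As a rough sketch of what adding a random effect might look like for this data (an assumption about how the pieces fit together, not something taken from the manual examples): the data stay in long form, Panel is assumed to be the country identifier, and M1[Panel] is a hypothetical name for a country-level latent variable shared by all years of the same country.

        ```stata
        * Long data: one row per country-year.
        * ST varies from observation to observation; M1[Panel] is constant
        * within a country and absorbs the country-level dependence.
        * By default the first indicator's loading on each latent variable
        * is fixed at 1 for identification.
        gsem (StringEnv EPI HealthCap GovPrior SustInd <- ST M1[Panel])

        * Empirical Bayes prediction of the observation-level factor.
        predict st_hat, latent(ST)
        ```

        Because a single set of loadings on ST is estimated across all years, the predicted scores are on a common scale. Models like this can be slow to converge and may need starting values.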



        • #5
          You can use the predict postestimation command in SEM and GSEM to predict the values of the latent variable.

          Code:
          sem (StringEnv EPI HealthCap GovPrior SustInd <- ST)
          predict st, latent(ST)
          I believe you can then use that latent variable in a regression if you want. But, as Matthew said, if you just want to find the corresponding factor loadings, then the coefficients you get from SEM are the factor loadings. The full SEM manual is rather formidable, true, but these topics are covered in the examples there.
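          A minimal sketch of the whole sequence on the pooled long data, assuming Panel is the country identifier and st_score is a hypothetical name for the new variable. Pooling gives one set of loadings for all years; clustering only adjusts the standard errors for repeated observations on a country and does not model time.

          ```stata
          * Single-factor measurement model on all country-years together.
          sem (StringEnv EPI HealthCap GovPrior SustInd <- ST), vce(cluster Panel)

          * Replay the results with standardized coefficients, which are
          * the standardized factor loadings.
          sem, standardized

          * Store the predicted factor score for each country-year.
          predict st_score, latent(ST)
          ```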
          Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

          When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



          • #6
            Hi Matthew and Weiwen,

            Now I understand it much better! I didn't realize the coefficients were already the factor loadings.
            And I will study the examples you mention with regard to the time aspect.

            Thanks a lot for your help!

            Joelle



            • #7
              All the best. This is quite an ambitious master's project! Don't feel bad if you're unable to incorporate the effect of time, as you would in a latent growth curve model, but I think you're on the right track.