Longitudinal Factor Analysis

Joelle Duff

Join Date: Feb 2017

Posts: 3
#1

14 Feb 2017, 03:33

Dear members,

I am currently writing my master's thesis and attempting to create a composite index variable that measures 'sustainable tourism'. Factor analysis seems like a good method to use, but I'm having difficulty doing this with longitudinal data. My data set consists of 110 countries, 5 observable variables (expected to create 1 factor) and 5 years (non-consecutive: 2008, 2009, 2011, 2013, and 2015). I use Stata version 14.0.

I have been able to perform factor analysis per year using -factor- (i.e. 5 separate times) but this results in 5 unique sets of factor loadings. As a result, the index scores per country are not comparable over multiple years, since the loadings used to create the index differ each period. I'd prefer to perform factor analysis on the full data set, i.e. on longitudinal data, because this would incorporate the time element of the data. I have been searching the internet for solutions (-gllamm- package or dynamic factor analysis by Frederici) but have been unsuccessful so far. So my questions are:

- Is there a method for this in Stata?
- If not, is it possible/allowed to use the average of the individual factor loadings?

I hope I have provided you with the necessary information, but if not, please let me know.
Thank you in advance for your help, it is greatly appreciated!

Kind regards,
Joelle
Tags: factor analysis, Latent Variable, loading factors, longitudinal, panel data
Matthew Burnell

Join Date: Feb 2017

Posts: 19
#2

14 Feb 2017, 04:07

I don't know about the details, but I would have thought gllamm would be able to do this as it is a general framework for (generalized) latent models - and the factors and a random coefficient of time (if that is how you choose to deal with the longitudinal nature) are both 'latent' variables.
However -gsem- should also be able to do this (indeed even -sem- if you can get your data in wide form (balanced)) and it will presumably be quicker, and more intuitive to use. Even simpler, you could have each time point as a fixed effect.
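A minimal sketch of the "time point as a fixed effect" idea, assuming the data are in long form (one row per country-year) and that a variable named Year exists; Year is a hypothetical name, while Panel and the five indicators are taken from later posts in this thread. It is one possible reading of the suggestion, not a definitive specification: each indicator gets its own year intercepts, while a single set of loadings is estimated across all years.

```stata
* Assumption: long-format data with a hypothetical variable Year
* holding 2008, 2009, 2011, 2013, 2015, and Panel identifying the country.

* One latent factor ST with loadings shared across all years;
* i.Year adds year-specific intercepts to every indicator.
* vce(cluster Panel) is an optional adjustment for repeated
* observations on the same country.
gsem (StringEnv EPI HealthCap GovPrior SustInd <- ST i.Year), vce(cluster Panel)

* Predicted factor scores (empirical Bayes means), one per country-year
predict st_index, latent(ST)
```

Because the loadings are common to every year, the resulting scores are on one scale across periods, which is what the per-year -factor- runs could not provide.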
Joelle Duff

Join Date: Feb 2017

Posts: 3
#3

14 Feb 2017, 07:18

Hi Matthew,

Thanks for your response!

I've looked into the -gllamm- and -gsem- syntax again, but I'm afraid I don't understand how you get factor loadings from these commands.
I have tried the following:

gllamm StringEnv EPI HealthCap GovPrior SustInd, i(Panel)

gsem (ST <- StringEnv EPI HealthCap GovPrior SustInd)

I'm probably interpreting the models incorrectly, but they seem to me to have two stages, first finding the latent variable and then using this in regression. I would only need the first step, to create a latent variable and find the corresponding factor loadings.

Do you know which commands I would need to do this?

Thanks
Matthew Burnell

Join Date: Feb 2017

Posts: 19
#4

14 Feb 2017, 09:36

The syntax for gsem/sem is pretty straightforward. A single factor measurement model is about the most basic there is - sem (x1 x2 x3 x4 <- X) is literally example 1 in the 46 model examples they give in the manual. The coefficients on the X for x1-x4 are the factor loadings. I suspect you are overthinking how to apply your data. You didn't mention before about using the latent variable in regression. That said, it is possible (and easy) with -sem- to have the latent variable 'in between' some indicators for the latent variable and an outcome variable, which you wish to regress on the latent variable, and anything else you wish.
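As an illustration only, here is how that pattern might look with the variable names used earlier in this thread, pooling all country-years and ignoring the time dimension. The indicators go on the left of the arrow and the latent variable (capitalized name) on the right.

```stata
* Pooled single-factor measurement model: the coefficients reported
* on ST for each indicator are the factor loadings.
* By default the first loading is constrained to 1 for identification.
sem (StringEnv EPI HealthCap GovPrior SustInd <- ST)

* Redisplay the same results with standardized coefficients,
* which are easier to compare with -factor- output.
sem, standardized
```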

The longitudinal aspect definitely introduces a complication, but examples 38g-42g show how to add random effects. It *should* be straightforward, I think. It might be helpful to read through the examples and some of the other sem basics in the manual.
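A rough sketch of what adding a country-level random effect could look like in -gsem-, assuming long-format data with Panel as the country identifier (as in the -gllamm- call earlier in the thread). The latent name C is hypothetical, and this two-level specification is only one of several possibilities; with 110 countries and 5 waves it may be slow or fail to converge.

```stata
* Assumption: long-format data, Panel identifies the country.
* ST is a factor that varies across country-years;
* C[Panel] is a second latent variable that is constant within country,
* capturing persistent country-level differences in the indicators.
gsem (StringEnv EPI HealthCap GovPrior SustInd <- ST C[Panel])
```

The coefficients on ST remain the factor loadings, estimated once for all years.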
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#5

14 Feb 2017, 19:58

You can use the predict postestimation command in SEM and GSEM to predict the values of the latent variable.

Code:

* Measurement model: indicators on the left, latent variable on the right
sem (StringEnv EPI HealthCap GovPrior SustInd <- ST)
predict st, latent(ST)

I believe you can then use that latent variable in a regression if you want. But, as Matthew said, if you just want to find the corresponding factor loadings, then the coefficients you get from SEM are the factor loadings. The full SEM manual is rather formidable, true, but these topics are covered in the examples there.

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Joelle Duff

Join Date: Feb 2017

Posts: 3
#6

15 Feb 2017, 03:23

Hi Matthew and Weiwen,

Now I understand it much better! I didn't realize the coefficients were already the factor loadings.
And I will study the examples you mention, with regards to the time aspect.

Thanks a lot for your help!

Joelle
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#7

15 Feb 2017, 08:26

All the best. This is quite an ambitious master's project! Don't feel bad if you're unable to incorporate the effect of time, like if this were a latent growth curve model, but I think you're on the right track.
