Dear Statalist users,
I am using Stata 15 and working with two separate datasets with about 1,000 observations. The data come from surveys conducted in two different countries, and most questions are exactly the same, with the same response options.
My goal is to compare a latent construct that is measured using about 9 variables across the two countries. I use structural equation modelling to estimate the latent variables separately in each country, and my target variable is the second-order variable.
I use the predict command to get a score, then rescale that variable to a 0-1 scale and compare the means of the two independent samples.
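For concreteness, the rescaling step can be sketched as below. This assumes a min-max rescaling and that GE already exists from predict after sem; GE01 is only an illustrative name, not part of my actual do-file.

```stata
* Sketch only: assumes GE was created by -predict- after -sem-
* GE01 is a hypothetical name for the rescaled score
summarize GE, meanonly
generate double GE01 = (GE - r(min)) / (r(max) - r(min))
summarize GE01
```

Note that r(min) and r(max) come from whichever dataset is in memory, so running this separately for each country uses that country's own minimum and maximum as the endpoints.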
However, in another post on factor scores after SEM Clyde Schechter commented that "the actual values of the factor scores are really not meaningful at all." I am hoping their mean values mean something and can be compared across different samples.
My question is whether I should run separate sems for each country or append the observations from the second country and run a single sem model to predict one latent variable and run a t-test by country.
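To make the second option concrete, here is a minimal sketch of the pooled approach. The file names (country1, country2), the country indicator, and the name GE_hat are hypothetical placeholders; the paths are the same as in my model.

```stata
* Sketch only: file names and the country variable are hypothetical
use country1, clear
generate byte country = 1
append using country2
replace country = 2 if missing(country)

* Same measurement structure, fitted once on the combined data
sem (CoD -> Ath Sm Pur)  ///
    (CaD -> Pop RA Clas) ///
    (PG  -> AA AS RI)    ///
    (GE  -> CoD CaD PG), ///
    difficult latent(CoD CaD PG GE) nocapslatent standardized

* Predict only the second-order factor, then compare means by country
predict GE_hat, latent(GE)
ttest GE_hat, by(country)
```

In this version the scores for both countries come from one set of estimated loadings, whereas with separate sems each country's scores come from its own model.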
Mathematically, what would be the most accurate way to compare the means? My hypothesis is that the mean of the second-order variable (GE) is significantly higher in country 1 than in country 2. When I rescale the predicted score to a 0-1 scale, I do see support for my hypothesis, but on the original scale, where the mean is set to 0 (standardized), no difference is observed, and I think this is precisely because the mean is set to 0. So, is it OK to use the rescaled version of the predicted second-order variable to test for a difference in means? Once again, I ran separate sems for each country to predict the scores. Would you suggest instead running a single sem on the combined dataset?
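To spell out the arithmetic behind my concern (assuming a min-max rescaling, and writing $s_i$ for the predicted score in one country, with sample mean $\bar{s}$, minimum $s_{\min}$, and maximum $s_{\max}$; this notation is mine):

```latex
r_i = \frac{s_i - s_{\min}}{s_{\max} - s_{\min}},
\qquad
\bar{r} = \frac{\bar{s} - s_{\min}}{s_{\max} - s_{\min}},
\qquad
\bar{s} = 0 \;\Longrightarrow\; \bar{r} = \frac{-\,s_{\min}}{s_{\max} - s_{\min}}.
```

If I have this right, then whenever the predicted score has mean 0 within a country, the rescaled mean in that country is determined entirely by the smallest and largest predicted scores in that sample.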
I am placing some data examples from each country with all the variables used in the SEM and my commands.
Thanks for your help.
DATA EXAMPLE (COUNTRY 1)
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(Pop Sm Pur Ath Clas RI AA AS RA)
2 2 1 2 3 2 1 1 1
3 1 2 3 3 1 1 1 4
3 2 1 3 2 1 1 1 3
3 2 1 3 3 3 4 2 4
2 1 3 2 3 1 1 1 3
3 2 2 3 2 3 3 1 2
2 3 2 4 3 3 2 1 3
3 3 2 3 2 2 3 2 4
3 3 3 3 3 2 4 2 4
3 1 1 2 3 2 1 1 4
3 3 1 3 3 3 1 1 4
4 2 3 4 4 3 2 2 4
3 2 2 2 3 3 2 2 3
1 1 1 4 4 1 1 1 4
3 3 3 4 2 2 1 1 4
1 1 1 2 1 3 1 1 4
2 1 1 4 3 4 1 2 3
3 4 2 4 3 3 3 2 4
2 4 2 4 3 3 1 2 3
2 3 1 3 3 4 1 1 4
4 1 3 2 4 4 2 1 4
4 3 3 4 3 3 2 1 4
3 3 2 2 3 2 2 2 2
3 2 3 3 3 3 2 2 3
2 2 1 3 2 3 2 1 3
3 3 3 4 4 3 3 3 4
2 1 2 1 4 1 2 1 4
3 3 2 3 4 1 4 2 1
2 3 4 3 3 4 3 3 3
3 1 2 3 4 2 2 2 3
2 1 3 3 3 4 2 1 2
3 4 4 4 4 4 3 2 2
3 4 3 4 3 4 4 2 4
3 4 2 2 3 3 3 4 1
3 1 2 3 3 3 2 1 3
3 2 1 4 3 4 1 1 4
4 1 1 1 1 4 1 1 3
3 3 2 2 2 3 2 1 4
2 1 1 1 4 2 1 2 4
3 2 4 3 3 4 1 2 3
2 2 3 3 3 4 2 1 4
3 4 1 4 4 3 1 2 4
3 1 1 3 3 1 1 1 4
4 3 3 3 4 2 2 2 3
3 3 2 4 3 2 3 2 4
3 1 1 2 3 2 2 1 3
2 1 2 3 1 3 2 1 3
3 1 1 3 3 3 1 1 4
2 1 1 2 4 4 1 1 4
2 2 4 4 2 4 3 1 3
3 1 2 1 3 3 2 1 3
4 3 4 1 3 1 1 1 3
3 2 2 3 3 3 2 1 3
3 2 2 3 3 3 2 1 3
3 1 2 3 3 3 2 1 4
2 2 2 4 3 2 1 1 4
3 3 4 4 4 3 3 2 3
3 1 2 3 3 4 1 3 4
3 2 2 2 3 3 2 1 3
3 3 2 3 3 3 2 2 3
4 2 2 3 3 4 2 1 3
2 1 1 3 3 1 2 1 3
4 2 2 4 4 4 1 2 4
3 2 2 4 3 3 2 2 3
2 1 1 2 3 3 4 2 3
3 2 2 3 3 3 2 1 3
3 1 2 3 4 3 3 3 3
2 1 2 3 3 3 1 1 4
1 1 1 2 1 1 1 1 1
3 1 1 3 3 4 2 1 4
1 1 1 2 1 2 1 1 1
2 2 2 2 3 2 2 1 4
3 1 1 2 2 2 2 1 3
4 1 1 3 4 4 4 1 4
4 4 2 4 3 2 4 3 4
2 3 2 4 1 3 1 1 4
2 2 2 3 3 2 3 1 3
3 2 1 3 4 4 2 3 2
3 2 2 3 3 4 2 2 3
1 1 1 3 3 2 2 1 1
3 4 2 3 3 3 2 2 3
3 1 1 3 3 2 2 2 3
3 3 4 4 3 4 4 2 4
3 1 1 3 3 1 2 2 3
3 3 3 3 3 3 3 2 4
2 2 2 2 4 3 2 2 3
1 1 1 1 2 2 1 1 2
3 2 2 3 3 2 2 1 3
3 2 2 3 3 3 2 1 3
2 2 2 2 3 2 2 2 4
3 3 2 4 3 3 2 2 4
1 1 1 1 1 2 2 2 2
3 1 2 3 3 3 2 1 3
3 3 2 3 4 4 2 1 4
3 4 1 4 1 3 1 1 2
3 1 1 3 4 3 1 1 3
2 2 1 4 3 2 4 1 4
2 2 3 4 4 2 3 2 4
3 2 1 3 2 3 2 2 4
3 4 1 4 4 3 2 3 3
end
DATA EXAMPLE (COUNTRY 2)
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(Pop Sm Pur Ath Clas RI AA AS RA)
2 1 1 1 2 2 1 1 3
2 1 1 2 1 2 3 1 1
3 1 4 2 4 1 4 1 4
2 2 2 3 3 2 2 1 3
3 1 2 3 3 4 1 1 3
3 1 4 4 3 3 2 1 4
3 2 4 3 3 3 4 1 4
3 2 2 3 3 3 2 1 3
2 3 4 4 2 4 2 1 4
2 3 3 3 3 3 3 2 2
3 1 . 3 4 3 4 2 3
1 1 1 3 3 4 2 1 3
2 2 3 2 2 2 1 1 3
3 1 4 2 4 2 4 1 2
4 3 1 3 2 1 1 1 4
3 3 3 3 3 2 2 2 3
3 2 4 3 3 3 4 1 3
2 3 1 3 3 2 2 1 2
1 1 1 1 1 3 2 1 2
3 3 2 2 3 2 2 2 4
3 1 2 3 3 3 1 1 3
3 4 3 4 3 4 2 2 4
3 3 2 3 3 4 2 3 4
3 3 1 3 2 2 1 2 4
3 2 3 4 1 4 1 1 4
1 1 1 1 1 1 1 1 3
3 1 3 3 3 4 2 1 3
3 2 3 2 3 2 1 2 4
1 1 2 3 2 3 2 2 3
4 2 1 3 4 4 1 1 4
3 1 1 1 1 2 1 1 4
3 1 2 3 2 2 1 1 4
1 1 1 3 1 1 1 1 3
3 3 3 4 4 3 4 2 4
1 1 1 2 3 3 2 1 1
4 2 2 3 3 3 4 1 4
3 1 1 4 3 1 2 1 3
2 1 1 2 3 3 2 1 4
3 2 1 3 3 4 4 2 2
1 3 4 3 3 2 2 2 2
3 2 3 2 3 4 2 1 4
3 1 2 4 4 2 2 1 4
3 1 2 1 3 2 1 2 3
3 2 2 2 3 2 1 1 3
3 1 2 4 3 2 2 1 1
3 2 2 3 3 3 3 2 4
3 2 4 3 3 4 4 2 3
3 1 2 3 4 2 2 1 3
3 1 3 3 3 3 2 1 3
2 3 2 3 4 2 3 2 3
3 1 2 3 3 3 2 1 3
2 1 1 3 1 4 4 1 1
3 2 2 4 1 4 4 4 4
3 1 2 3 2 1 2 1 4
3 1 2 3 3 3 3 1 2
2 1 3 3 2 2 4 2 4
1 1 2 3 3 3 2 1 3
3 1 1 3 3 1 2 2 3
3 2 2 3 4 2 2 1 3
3 3 4 3 3 4 1 1 4
3 4 4 2 3 4 4 4 4
4 1 1 1 3 2 4 1 4
4 4 2 4 4 4 4 2 4
3 1 2 2 2 2 2 2 3
1 3 2 4 3 4 4 1 1
1 1 1 3 1 4 1 1 4
2 2 4 2 2 4 3 1 4
3 3 2 3 3 3 2 1 3
3 1 1 1 3 3 3 1 1
2 1 2 4 1 4 4 2 4
3 3 3 3 3 3 4 3 3
3 3 4 3 3 3 4 3 3
3 2 2 3 3 3 3 2 4
3 1 1 3 3 3 2 1 3
3 1 2 3 3 3 2 1 3
3 1 2 2 3 2 1 1 3
1 3 2 3 1 1 1 1 1
3 3 2 3 2 2 1 1 3
1 1 1 3 1 1 1 1 3
3 1 1 2 3 1 2 1 3
3 1 2 1 3 2 4 1 3
3 3 3 3 1 2 2 1 4
1 1 1 1 3 1 1 1 4
2 1 1 3 3 4 2 3 3
3 3 4 4 3 4 4 2 4
3 2 1 3 1 2 4 2 4
3 2 3 1 1 2 4 2 2
3 1 1 4 3 2 2 1 4
3 1 3 3 3 3 2 1 3
4 1 1 1 3 4 1 1 4
3 1 2 2 2 4 1 1 1
3 3 3 3 1 3 3 1 3
3 3 3 4 1 4 2 1 4
3 1 1 3 2 2 2 1 4
3 2 3 3 3 4 2 1 3
1 1 1 3 1 1 1 1 3
3 1 1 1 2 3 1 1 3
3 1 2 2 2 2 2 1 4
4 1 3 3 4 4 2 1 4
3 1 1 3 1 2 2 1 3
end
Code:
sem (CoD -> Ath Sm Pur) ///
    (CaD -> Pop RA Clas) ///
    (PG -> AA AS RI) ///
    (GE -> CoD CaD PG) ///
    , difficult latent(CoD CaD PG GE) nocapslatent standardized
predict CoD CaD PG GE, latent