Converting EFA and Cluster Analysis Data to Multi-Dimensional Scaling in Stata

Jose Tubilleja

Join Date: Aug 2016

Posts: 13
#1

01 Sep 2016, 01:49

I originally posted this in the Sandbox and was advised to post it here. I appreciate any help. Here is my first posting:

I am doing research on factors of student success using a 40-question survey. The survey is composed of 5-point Likert scale questions. I will be using the same dataset for Exploratory Factor Analysis with factor rotation (EFA), K-Cluster Analysis (CA), and Multi-dimensional Scaling (MDS). I have already completed the design in Stata for the EFA and CA, and have successfully done a practice run with dummy data.

I am now having difficulty converting the same dataset (of factors) for use with MDS. Basically, I want to start by creating a 40x40 correlation coefficient matrix and then square its entries. I know the "correlate" / "pwcorr" commands and how to square values separately, but I can't find the right combination of commands to do both and then pass the result on to the "mds" command for the MDS solution and map. Is this the proper way, or am I going in the wrong direction?

Thank you so much for any help.

Jose
Jose Tubilleja

Join Date: Aug 2016

Posts: 13
#2

01 Sep 2016, 01:55

I tried the following approach and it worked, but it is a long-winded process. I am still missing the Stata command to square the values in the correlation coefficient matrix, so that I do not have to export the matrix to Excel, square the values there, and use the revised Excel file as the new dataset for the multidimensional scaling. Please see below:

1. corr item1-item40

2. Select the entire matrix, right-click, and choose "Copy Table".

3. Paste into Excel, square all the values there, and save as a new dataset.

4. import excel using c:\newDataset.xlsx

5. mds item1-item40, id(item) method(modern) dim(2) nolog

I got my MDS solution and map, but I'd rather have a Stata command for steps 2 - 4 instead of the roundabout way with Excel.
I'd appreciate help in that regard.

Jose
Comment

Oded Mcdossi

Join Date: Jun 2014
Posts: 577
#3

01 Sep 2016, 01:56

I think you should work along the lines of the matrix dissimilarity command followed by mdsmat. I hope you'll find the following example helpful.

Code:

//Create fake data
clear*
set obs 100
forval i=1/40 {
    g var`i'= floor((5)*runiform() + 1)
}

//BEGIN HERE

//Create a correlation matrix

mat dissim correlation=var*, variables correlation //create correlation matrix

//square the matrix (element by element)

mat sq_correlation = correlation 
forval i = 1/`= rowsof(correlation)' { 
    forval j = 1/`= colsof(correlation)' { 
        mat sq_correlation[`i', `j'] = (correlation[`i', `j'])^2
    }
} 

// Use the matrix as input for MDS

mdsmat sq_correlation, method(modern) s2d(oneminus) nolog
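
As a variation on the example above, the element-by-element loop can be replaced by a single call to Stata's hadamard() matrix function, which returns the element-wise product of two matrices. This is only a sketch on the same kind of fake data; the matrix names R and R2 and the seed are arbitrary choices for illustration, not part of the original example.

```stata
* Sketch: same fake data as above, but squaring without a loop
clear
set seed 12345                      // arbitrary seed, only for reproducibility
set obs 100
forvalues i = 1/40 {
    generate var`i' = floor(5*runiform() + 1)   // fake 5-point Likert items
}

* 40x40 correlation matrix among the variables
matrix dissimilarity R = var*, variables correlation

* Element-wise square: hadamard(R, R)[i,j] = R[i,j]^2
matrix R2 = hadamard(R, R)

* Treat squared correlations as similarities; s2d(oneminus) uses 1 - similarity
mdsmat R2, method(modern) s2d(oneminus) nolog
```

The result should match the looped version, since both compute the same squared matrix before calling mdsmat.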

Comment

Jose Tubilleja

Join Date: Aug 2016

Posts: 13
#4

01 Sep 2016, 02:36

Thank you, Oded! It makes perfect sense. I'll try it with both your fake data and my dataset, and will report the result here.
Comment
Jose Tubilleja

Join Date: Aug 2016

Posts: 13
#5

01 Sep 2016, 10:36

It worked! I just imported my Excel dataset, changed the variable list from "var*" to "item*", and voila!
Thank you so much.
Comment

Jose Tubilleja

Join Date: Aug 2016
Posts: 13
#6

01 Sep 2016, 10:57

Here is the complete code:

Code:

* Import dataset from Excel
import excel using c:\temp\DatasetCA.xlsx, firstrow sheet(Sheet1) clear

* Create a correlation matrix
mat dissim correlation=item*, variables correlation

* Square the matrix (element by element)
mat sq_correlation = correlation
forval i = 1/`= rowsof(correlation)' {
    forval j = 1/`= colsof(correlation)' {
        mat sq_correlation[`i', `j'] = (correlation[`i', `j'])^2
    }
}

* Use the matrix as input for MDS
mdsmat sq_correlation, method(modern) s2d(oneminus) nolog

Comment

Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#7

02 Sep 2016, 11:04

Are you aware that Stata has a set of mds procedures that work quite happily off the original data?
Comment
Jose Tubilleja

Join Date: Aug 2016

Posts: 13
#8

04 Sep 2016, 21:23

Thanks, Phil. I am vaguely aware of that, but I've lost track of the procedure. I get stuck after I make the correlation matrix. Could you kindly refresh my memory and show me the code?

Thank you!

Jose
Comment

Jose Tubilleja

Join Date: Aug 2016
Posts: 13
#9

11 Sep 2016, 00:44

Based on Phil's suggestion, I changed tack and went for MDS straight from the original data. Here's the code. It worked, but I'd appreciate any comments. Thanks.

Code:

* STEP 1
* Load data from Excel.
import excel using c:\temp\DatasetCA.xlsx, firstrow sheet(Sheet1) clear

* STEP 2
* Perform MDS.
mds item1-item40, id(case) method(modern) dim(2) nolog

* STEP 3
* Review stress and increase dim() if necessary until
*  a reasonable solution is found.  Usually dim(2) or dim(3) is enough.

* STEP 4
* Rerun MDS with correct dim().

* STEP 5
* Create Shepard diagram to visualize model fit.
mdsshepard, msize(vsmall) scheme(s1color) name(JRTshepard, replace)

* STEP 6
* Create formatted MDS Map.
mdsconfig, msize(vsmall) scheme(s1color) name(JTmds, replace)

* STEP 7
* Save dimension scores -- but change number to match number of dims.
predict dim1-dim2, config

* STEP 8
* List dimension scores for cases
list case dim*, noobs sep(0)
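
Steps 3 and 4 above (reviewing stress and rerunning with a different dim()) could be sketched as a loop over candidate dimensions. This is a minimal sketch on fake data, not the real survey: the variable names case and item1-item40 mirror the code above, the seed and sample size are arbitrary, and it assumes that after method(modern) the loss criterion is stored in e(critval), as documented for mds.

```stata
* Sketch: compare the loss criterion (stress) across dim(1) to dim(4)
clear
set seed 2016                       // arbitrary seed, only for reproducibility
set obs 60                          // fake sample of 60 cases
generate case = _n                  // unique case identifier for id()
forvalues i = 1/40 {
    generate item`i' = floor(5*runiform() + 1)  // fake 5-point Likert items
}

forvalues d = 1/4 {
    * noplot suppresses the configuration plot on each pass
    quietly mds item1-item40, id(case) method(modern) dim(`d') nolog noplot
    display "dim(`d'): loss criterion = " %9.4f e(critval)
}
```

With random fake data the stress values carry no substantive meaning; on real data one would look for the point where adding a dimension stops reducing stress appreciably, then rerun mds with that dim().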

Comment

Oded Mcdossi

Join Date: Jun 2014

Posts: 577
#10

11 Sep 2016, 02:33

Dear Jose Tubilleja, could you please be clearer about the kind of comments you wish to get?
Note that the model proposed in #2 is totally different from the model in #9. The model in #2 is an MDS of the variables, whereas the model in #9 is an MDS of the observations with respect to the variables. The MDS in #2 is very similar to factor analysis, but uses an MDS approach.
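
To make the difference concrete, here is a small sketch on fake data that runs both kinds of MDS and counts the points in each resulting configuration. The names case, item1-item40, R, R2, Yobs, and Yvar are illustrative assumptions, and the data are random, so only the dimensions of the results are meaningful.

```stata
* Sketch: MDS of observations versus MDS of variables on the same data
clear
set seed 2016                       // arbitrary seed, only for reproducibility
set obs 100
generate case = _n
forvalues i = 1/40 {
    generate item`i' = floor(5*runiform() + 1)  // fake 5-point Likert items
}

* MDS of observations: one point per case
quietly mds item1-item40, id(case) method(modern) dim(2) nolog noplot
matrix Yobs = e(Y)                  // approximating configuration
display "points in observation map: " rowsof(Yobs)

* MDS of variables: one point per item, from squared correlations
matrix dissimilarity R = item*, variables correlation
matrix R2 = hadamard(R, R)
quietly mdsmat R2, method(modern) s2d(oneminus) dim(2) nolog noplot
matrix Yvar = e(Y)
display "points in variable map: " rowsof(Yvar)
```

The first map places cases relative to one another, while the second places survey items, which is why only the second is comparable to a factor analysis of the items.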
Comment
