  • Converting EFA and Cluster Analysis Data to Multi-Dimensional Scaling in Stata

    I originally posted this in the Sandbox and was advised to post it here. I appreciate any help. Here is my first posting:

    I am doing research on factors of student success using a 40-question survey. The survey is composed of 5-point Likert scale questions. I will be using the same dataset for Exploratory Factor Analysis with factor rotation (EFA), K-Cluster Analysis (CA), and Multi-dimensional Scaling (MDS). I have already completed the design in Stata for the EFA and CA, and have successfully done a practice run with dummy data.

    I am now having difficulty converting the same dataset (of factors) for use with MDS. Basically, I want to start by creating a 40×40 correlation coefficient matrix and then square the results. I know the "correlate" / "pwcorr" commands and how to square values separately, but I can't find the right combination of commands to do that and then pass the result to the "mds" command for the MDS solution and map. Is this the proper way, or am I going in the wrong direction?

    Thank you so much for any help.

    Jose

  • #2
    I tried the following approach and it worked, but it is a long process. I'm still missing the Stata command to square the values in the correlation coefficient matrix. With it, I would not have to export the matrix to Excel, square the values there, and use the revised Excel file as the new dataset for the multidimensional scaling. Please see below (steps 1, 4, and 5 are Stata commands):
    1. corr item1-item40
    2. select the entire matrix, right-click and "Copy Table"
    3. Paste into Excel and square all the values there, save as a new dataset
    4. import excel using c:\newDataset.xlsx
    5. mds item1-item40, id(item) method(modern) dim(2) nolog
    I got my MDS solution and map, but I'd rather have a Stata command for steps 2 - 4, instead of the roundabout way with Excel.
    I'd appreciate help in that regard.

    Jose


    • #3
      I think you should go along the lines of the matrix dissimilarity command and mdsmat. I hope you'll find the following example helpful.

      Code:
      //Create fake data
      clear*
      set obs 100
      forval i=1/40 {
          g var`i'= floor((5)*runiform() + 1)
      }
      
      //BEGIN HERE
      
      //Create a correlation matrix
      
      mat dissim correlation=var*, variables correlation //create correlation matrix
      
      //square the matrix (element by element)
      
      mat sq_correlation = correlation 
      forval i = 1/`= rowsof(correlation)' { 
          forval j = 1/`= colsof(correlation)' { 
              mat sq_correlation[`i', `j'] = (correlation[`i', `j'])^2
          }
      } 
      
      // Use the matrix as input for MDS
      
      mdsmat sq_correlation, method(modern) s2d(oneminus) nolog
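
      A possible shortcut for the element-by-element loop above, offered as a sketch rather than a tested replacement: Stata's hadamard() matrix function returns the element-wise product of two matrices, so multiplying the correlation matrix by itself should give the squared correlations in one line. The small fake dataset, the seed, and the use of only five variables are assumptions made to keep the example short.

      ```stata
      * Assumed toy data: 100 observations, five 5-point items (not the real survey)
      clear
      set seed 12345
      set obs 100
      forvalues i = 1/5 {
          generate var`i' = floor(5*runiform() + 1)
      }

      * Correlation (similarity) matrix between variables, as in the example above
      matrix dissimilarity correlation = var*, variables correlation

      * Element-wise square without an explicit loop
      matrix sq_correlation = hadamard(correlation, correlation)

      * Inspect the result; the diagonal should be 1
      matrix list sq_correlation
      ```

      The resulting sq_correlation matrix can then be passed to mdsmat with s2d(oneminus) exactly as in the last line of the example above.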


      • #4
        Thank you, Oded! It makes perfect sense. I'll try it both using your fake data and my dataset, and will report here the result.


        • #5
          It worked! I just imported my Excel dataset, changed the variable list from "var*" to "item*", and voilà!
          Thank you so much.


          • #6
            Here is the complete code:
            Code:
            * Import dataset from Excel
            import excel using c:\temp\DatasetCA.xlsx, firstrow sheet(Sheet1) clear

            * Create a correlation matrix
            mat dissim correlation=item*, variables correlation

            * Square the matrix (element by element)
            mat sq_correlation = correlation
            forval i = 1/`= rowsof(correlation)' {
                forval j = 1/`= colsof(correlation)' {
                    mat sq_correlation[`i', `j'] = (correlation[`i', `j'])^2
                }
            }

            * Use the matrix as input for MDS
            mdsmat sq_correlation, method(modern) s2d(oneminus) nolog


            • #7
              Are you aware that Stata has a set of mds procedures that work quite happily off the original data?


              • #8
                Thanks, Phil. I am vaguely aware of that, but I've lost the procedure. I am stumped after I make the correlation matrix. Could you kindly refresh my memory and show me the code?

                Thank you!

                Jose


                • #9
                  Based on Phil's suggestion, I changed tack and went for MDS straight from the original data. Here's the code. It worked, but I'd appreciate any comments. Thanks.

                  Code:
                  * STEP 1
                  * Load data from Excel.
                  import excel using c:\temp\DatasetCA.xlsx, firstrow sheet(Sheet1) clear
                  
                  * STEP 2
                  * Perform MDS.
                  mds item1-item40, id(case) method(modern) dim(2) nolog
                  
                  * STEP 3
                  * Review stress and increase dim() if necessary until
                  *  a reasonable solution is found.  Usually dim(2) or dim(3) is enough.
                  
                  * STEP 4
                  * Rerun MDS with correct dim().
                  
                  * STEP 5
                  * Create Shepard diagram to visualize model fit.
                  mdsshepard, msize(vsmall) scheme(s1color) name(JRTshepard, replace)
                  
                  * STEP 6
                  * Create formatted MDS Map.
                  mdsconfig, msize(vsmall) scheme(s1color) name(JTmds, replace)
                  
                  * STEP 7
                  * Save dimension scores -- but change number to match number of dims.
                  predict dim1-dim2, config
                  
                  * STEP 8
                  * List dimension scores for cases
                  list case dim*, noobs sep(0)
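
                  Steps 3 and 4 above have no code attached. One way they might be carried out, as a rough sketch under stated assumptions, is to rerun mds over a few candidate values of dim() and compare the loss (stress) value that each run reports in its output header. The fake data, the seed, the six items, and the range 2 to 4 are illustrative assumptions, not part of the original workflow.

                  ```stata
                  * Assumed toy data: 30 cases, six 5-point items (not the real survey)
                  clear
                  set seed 12345
                  set obs 30
                  generate case = _n
                  forvalues i = 1/6 {
                      generate item`i' = floor(5*runiform() + 1)
                  }

                  * Fit modern MDS for several dimensionalities and read the
                  * reported loss value for each run from the output header
                  forvalues d = 2/4 {
                      display as text _n "Solution with dim(`d')"
                      mds item1-item6, id(case) method(modern) dim(`d') nolog
                  }
                  ```

                  Once a dimensionality looks reasonable, the final mds call would use that value in dim(), and the predict line in step 7 would need a matching number of new variables.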


                  • #10
                    Dear Jose Tubilleja, could you please be clearer about the kind of comments you wish to get?
                    Note that the model proposed in #2 is totally different from the model in #9. The model in #2 is an MDS for variables, and the one in #9 is an MDS for observations with regard to variables. The MDS in #2 is very similar to factor analysis, but with an MDS approach.
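
                    To make the distinction concrete, here is a minimal sketch that runs both kinds of MDS on the same data. It assumes a small fake dataset with hypothetical names (case, item1-item6) and a fixed seed; it is only meant to show what ends up as a point on each map.

                    ```stata
                    * Assumed toy data: 30 cases, six 5-point items (not the real survey)
                    clear
                    set seed 12345
                    set obs 30
                    generate case = _n
                    forvalues i = 1/6 {
                        generate item`i' = floor(5*runiform() + 1)
                    }

                    * (a) MDS for variables: each point on the map is an item.
                    *     Similarities are squared correlations between items.
                    matrix dissimilarity R = item*, variables correlation
                    matrix R2 = hadamard(R, R)
                    mdsmat R2, method(modern) s2d(oneminus) dim(2) nolog

                    * (b) MDS for observations: each point on the map is a case.
                    *     Dissimilarities are computed between cases across the items.
                    mds item1-item6, id(case) method(modern) dim(2) nolog
                    ```

                    In (a) the configuration has six rows, one per item, while in (b) it has thirty rows, one per case, so the two maps answer different questions.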
