
  • Problems when running optimal k-means cluster solution program

    Dear Team,

    After reading the excellent "Stata tip 110: How to get the optimal k-means cluster solution, Stata Journal (2012) 12, Number 2, pp. 347-351" by Anna Makles, I copied and pasted the code from the paper. The Stata do-file is:

    Code:
    use physed, clear
    local list1 " flexibility speed strength "
    foreach v of varlist `list1´ {
    egen z_`v´ = std(`v´)
    local list2 "z_flexibility z_speed z_strength"
    forvalues k = 1(1)20 {
    cluster kmeans `list2´, k(`k´) start(random(123)) name(cs`k´)
    }
    * WSS matrix
    matrix WSS = J(20,5,.)
    matrix colnames WSS = k WSS log(WSS) eta-squared PRE
    * WSS for each clustering
    forvalues k = 1(1)20 {
    scalar ws`k´ = 0
    foreach v of varlist `list2´ {
    quietly anova `v´ cs`k´
    scalar ws`k´ = ws`k´ + e(rss)
    }
    matrix WSS[`k´, 1] = `k´
    matrix WSS[`k´, 2] = ws`k´
    matrix WSS[`k´, 3] = log(ws`k´)
    matrix WSS[`k´, 4] = 1 - ws`k´/WSS[1,2]
    matrix WSS[`k´, 5] = (WSS[`k´-1,2] - ws`k´)/WSS[`k´-1,2]
    }
    matrix list WSS
    local squared = char(178)
    _matplot WSS, columns(2 1) connect(l) xlabel(#10) name(plot1, replace) nodraw noname
    _matplot WSS, columns(3 1) connect(l) xlabel(#10) name(plot2, replace) nodraw noname
    _matplot WSS, columns(4 1) connect(l) xlabel(#10) name(plot3, replace) nodraw noname ytitle({&eta}`squared´)
    _matplot WSS, columns(5 1) connect(l) xlabel(#10) name(plot4, replace) nodraw noname
    graph combine plot1 plot2 plot3 plot4, name(plot1to4, replace)
    But I get neither a table nor a graph; the system crashes and I have to restart the computer. I repeated the run after replacing " ` " and " ´ " with " ' ", and with another dataset, with the same results. I am using Stata 15.0.

    Any idea about what is happening?

    Thank you very much.
    Jorge

  • #2
    I started reading your code and got as far as

    Code:
    use physed, clear
    local list1 " flexibility speed strength "
    foreach v of varlist `list1´ {
         egen z_`v´ = std(`v´)
    local list2 "z_flexibility z_speed z_strength"
    where I added indents.

    That looks wrong to me; I think you missed out a line closing the loop.

    Code:
    use physed, clear
    local list1 " flexibility speed strength "
    foreach v of varlist `list1´ {
         egen z_`v´ = std(`v´)
    }
    local list2 "z_flexibility z_speed z_strength"
    At the same time, I am not clear that this is behind your problem. However, your attempted fix of replacing left single quotes with standard single quotes is indeed quite wrong.

    Without your dataset, we can't try to run this code ourselves.



    • #3
      Thank you Nick. This is the result with the change:

      Code:
      use physed, clear

      local list1 " flexibility speed strength "

      . foreach v of varlist `list1´ {
        2. 
      . egen z_`v´ = std(`v´)
        3. 
      . }
      invalid name
      r(198);
      On the other hand the dataset was obtained from http://www.stata-press.com/data/r15/physed.dta



      • #4
        I didn't notice earlier but you have the wrong quote characters there.


        Code:
        use physed, clear
        local list1 " flexibility speed strength "
        foreach v of varlist `list1' {
             egen z_`v' = std(`v')
        }
        local list2 "z_flexibility z_speed z_strength"
        The closing quote characters show as upright quotes here:

        Code:
        . di char(39)
        '
        
        . di uchar(39)
        '



        • #5
          Jorge, let me assure you that you did nothing "wrong" - or rather, you did something that's been a continual source of posts like yours to Statalist.

          You are yet another person caught by StataCorp's editorial habit of using — in their PDFs and in the Stata Journal — the typographical left and right "single quotes"
          Code:
          ‘v’
          rather than the "accent grave" and "apostrophe" characters
          Code:
          `v'
          users must type for Stata to recognize them.

          Doing that enhances the typographical elegance of the publications while increasing the difficulty of copying and pasting code from PDF publications into Stata. This editorial habit reflects the days of Stata 1.0 when copying and pasting from documentation meant using real paste, but to me is less defensible in the era of paperless documentation.

          Luckily the output of the help command does not suffer from this problem.
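
          To make the difference between these characters concrete, here is a minimal sketch that builds a macro reference closed by the wrong character and then repairs it. It assumes Stata 14 or later, because uchar() and usubinstr() are Unicode functions; the scalar names bad and fixed are only illustrative.

          ```stata
          * Characters Stata expects around a macro name:
          * char(96) is the grave accent, char(39) is the apostrophe
          display char(96) "v" char(39)

          * A reference as it may arrive after copy and paste,
          * closed by an acute accent (U+00B4) instead of an apostrophe
          scalar bad = char(96) + "v" + uchar(180)

          * Replace the lookalike closing characters with the apostrophe
          scalar fixed = usubinstr(bad, uchar(180), char(39), .)
          scalar fixed = usubinstr(fixed, uchar(8217), char(39), .)
          * Replace the typographic left quote with the grave accent
          scalar fixed = usubinstr(fixed, uchar(8216), char(96), .)

          display fixed
          ```

          The strings are held in scalars rather than local macros so that Stata does not try to expand the quote characters while the example runs.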
          Last edited by William Lisowski; 31 Jan 2019, 10:20.



          • #6
            Dear Nick and William, now it works using `v'.
            Thank you very much for your help.



            • #7
              Dear all, this is my first time posting a question (I have applied many of the tips you have given here). First, I would like to say that I am a huge fan of Mr. Nick Cox (I have learned a lot from his comments).
              I am posting my question in this topic because it relates to the topic created by Mr. Jorge Ruiz Moreno on problems when running the optimal k-means cluster solution program.
              I am trying to use the same commands from the paper by Dr. Anna Makles, "Stata tip 110: How to get the optimal k-means cluster solution, Stata Journal (2012) 12, Number 2, pp. 347-351".
              As in Mr. Jorge's case, after applying the contributions of Mr. Nick Cox and Mr. William Lisowski, the do-file runs to the last line; however, before the graphs are drawn, the following error appears:
              "unexpected end of file"
              r(612)
              Could you help me solve this problem?

              The commands I ran were as follows:

              . use http://www.stata-press.com/data/r12/physed, clear

              . local list1 " flex speed strength "

              . foreach v of varlist `list1´ {
              2. egen z_`v' = std(`v')
              3. local list2 "flex speed strength"
              4. forvalues k = 1(1)20 {
              5. cluster kmeans `list2´, k(`k') start(random(123)) name(cs`k')
              6. }
              7. * WSS matrix
              . matrix WSS = J(20,5,.)
              8. matrix colnames WSS = k WSS log(WSS) eta-squared PRE
              9. * WSS for each clustering
              . forvalues k = 1(1)20 {
              10. scalar ws`k' = 0
              11. foreach v of varlist `list2´ {
              12. quietly anova `v' cs`k'
              13. scalar ws`k' = ws`k' + e(rss)
              14. }
              15. matrix WSS[`k', 1] = `k'
              16. matrix WSS[`k', 2] = ws`k'
              17. matrix WSS[`k', 3] = log(ws`k')
              18. matrix WSS[`k', 4] = 1 - ws`k'/WSS[1,2]
              19. matrix WSS[`k', 5] = (WSS[`k'-1,2] - ws`k')/WSS[`k'-1,2]
              20. }
              21. matrix list WSS
              22. local squared = char(178)
              23. _matplot WSS, columns(2 1) connect(l) xlabel(#10) name(plot1, replace) nodraw noname
              24. _matplot WSS, columns(3 1) connect(l) xlabel(#10) name(plot2, replace) nodraw noname
              25. _matplot WSS, columns(4 1) connect(l) xlabel(#10) name(plot3, replace) nodraw noname ytitle({&eta}`squared')
              26. _matplot WSS, columns(5 1) connect(l) xlabel(#10) name(plot4, replace) nodraw noname
              27. graph combine plot1 plot2 plot3 plot4, name(plot1to4, replace)
              28.
              unexpected end of file
              r(612);

              end of do-file

              I am using Stata 13.0
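
              Following the advice in #2 and #4, here is a minimal sketch of how the opening of this do-file might look once the first loop is closed and every macro reference uses a grave accent and an apostrophe. In the log above the continuation numbers run from 2 to 28, which suggests Stata is still inside the first foreach block when the file ends; that would be consistent with r(612). The full variable names and the z_ prefix in list2 are taken from the do-file in #1 and are an assumption about what was intended here.

              ```stata
              use http://www.stata-press.com/data/r12/physed, clear
              local list1 "flexibility speed strength"

              * Standardize each variable; the brace on its own line closes the loop
              foreach v of varlist `list1' {
                  egen z_`v' = std(`v')
              }

              * Cluster on the standardized variables created above
              local list2 "z_flexibility z_speed z_strength"
              forvalues k = 1(1)20 {
                  cluster kmeans `list2', k(`k') start(random(123)) name(cs`k')
              }
              ```

              The remaining lines of the do-file (the WSS matrix and the _matplot calls) would follow unchanged after this point.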

