  • Cluster wardslinkage: memory problem

    Hello,

    I'm working with a data set of 240,000 observations (U.S. banks). I am trying to use Ward's linkage clustering on one variable, but I receive this message: insufficient memory for ClusterMatrix r(950). I increased the memory available on my Mac, but I get the same message, and I got it again when I tried the same method on a computer at my university. I am thinking of running Ward's linkage clustering on subsamples, for example 30,000 observations at a time, and merging the results at the end. What do you think?

    Thanks

  • #2
    If you want to classify on one variable alone, cluster analysis as usually conceived is overkill, as well as being utterly impracticable for your data set size.
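
As a rough illustration of why the data set size is the obstacle (not part of the original reply): hierarchical clustering generally works from the dissimilarities between every pair of observations. The Python sketch below estimates the storage that implies. It assumes 8-byte values and that only the lower triangle of the matrix is kept; how Stata actually lays out its cluster matrix is not stated in this thread, and `pairwise_bytes` is a hypothetical helper name.

```python
def pairwise_bytes(n_obs: int, bytes_per_value: int = 8) -> int:
    """Bytes needed for the lower triangle of an n x n dissimilarity matrix.

    Assumption: one stored value per pair of observations, 8 bytes each.
    """
    n_pairs = n_obs * (n_obs - 1) // 2
    return n_pairs * bytes_per_value


# Compare the proposed 30,000-observation subsample with the full data set.
for n in (30_000, 240_000):
    print(f"{n:>7} observations: {pairwise_bytes(n) / 1e9:,.1f} GB")
```

Under these assumptions the full data set needs roughly 230 GB while a 30,000-observation subsample needs under 4 GB. That suggests why adding memory on a desktop machine did not help: the requirement grows with the square of the number of observations.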

    As you've described it, your problem is in effect looking for breaks in the distribution, and if they're real they will be obvious on a histogram or quantile plot.
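
A numeric complement to looking at a histogram or quantile plot is to sort the variable and inspect the widest gaps between neighbouring values. The sketch below is one assumed way to do that in Python, not the method proposed in the reply. `largest_gaps` is a hypothetical helper and the sample values are made up for illustration.

```python
def largest_gaps(values, k=3):
    """Return the k widest gaps between neighbouring sorted values.

    Each result is a tuple (gap_width, lower_value, upper_value),
    ordered from the widest gap downwards.
    """
    ordered = sorted(values)
    gaps = [(hi - lo, lo, hi) for lo, hi in zip(ordered, ordered[1:])]
    return sorted(gaps, reverse=True)[:k]


# Made-up illustrative values, not the bank data from the question.
sample = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1, 9.8, 10.0, 1.3]
for width, lo, hi in largest_gaps(sample, k=2):
    print(f"gap of {width:.1f} between {lo} and {hi}")
```

Sorting needs memory proportional to the number of observations rather than its square, so this scales to 240,000 values without difficulty. With densely spaced data the widest gaps may still be tiny, in which case there are probably no natural breaks to find.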

    Otherwise the discussion at https://stats.stackexchange.com/ques...ets-e-g-income may be of interest or use.


    • #3
      Thank you, Nick, for your reply.
