
  • Cluster

    Hi All,

    I am trying to create clusters in Stata 16. I tried doing this on two variables, one at a time. Stata issued the error message "insufficient memory for ClusterMatrix".

    My study is on income-based adolescent obesity. I tried to cluster by geo, which is rural and urban, and then by gender. I also left it to Stata to choose the variables, and I keep getting the same error.

    My command is:

    Code:
     cluster wardslinkage geo, measure(L2)
    Thanks for your assistance

    Nthato

  • #2
    Nthato:
    Can't you use the -group()- function from -egen- to create different clusters of observations?
    Kind regards,
    Carlo
    (Stata 19.0)
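
    A minimal sketch of that suggestion, on made-up data. The variable names geo and gender follow #1, but their 0/1 coding and the new variable name grp are assumptions for illustration only.

    ```stata
    * Toy data: 8 observations, geo and gender both coded 0/1 (assumed coding)
    clear
    set obs 8
    generate byte geo = mod(_n, 2)
    generate byte gender = _n > 4

    * One group identifier per distinct combination of geo and gender
    egen grp = group(geo gender), label

    * Check how many observations fall in each group
    tabulate grp
    ```

    This only labels the combinations of categories that are already in the data; it does not run a cluster analysis.
    
    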



    • #3
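      cluster for cluster analysis depends on comparing every observation with every other observation. That limits the size of problem that can be tackled, because Stata holds a matrix of those pairwise comparisons along the way and the matrix grows roughly with the square of the number of observations.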
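
      To see how quickly that grows, here is a back-of-envelope sketch. The sample size of 100,000 and the figure of 8 bytes per stored value are assumptions for illustration; how Stata actually stores the matrix internally may differ.

      ```stata
      * Number of distinct pairs among n observations is n(n - 1)/2
      * ASSUMPTION: n = 100000 and 8 bytes per stored dissimilarity
      local n = 100000
      local pairs = `n' * (`n' - 1) / 2

      display "pairs of observations: " %15.0fc `pairs'
      display "approx GB at 8 bytes each: " %9.1f (`pairs' * 8 / 1e9)
      ```

      With a dataset in memory, replacing the assumed n with _N gives the count for your own data.
      
      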

      Cluster analysis is possible with one variable, but you might as well look to see whether there are breaks in a histogram. That's usually futile: if you see breaks in a small sample, they are often just sampling quirks, while if you see (genuine) breaks in a large sample, they should be things you know about anyway, or side effects of outliers or other long-tailed behaviour.

      If geo is just an indicator for urban or rural, your clusters are already defined as the two distinct values of the variable.

      It seems to me that you just want to compare two frequency distributions here. It's not a cluster analysis problem. Depending on how many data points you have, what you know about, and who will read your report, you might need (for example) histograms, density estimations, dot plots, or quantile plots.
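
      A rough sketch of such comparisons using official commands only. The outcome name bmi, the 0 = rural, 1 = urban coding of geo, and the simulated values are all assumptions standing in for the real data.

      ```stata
      * Toy data: hypothetical outcome bmi; geo coded 0 = rural, 1 = urban (assumed)
      clear
      set seed 12345
      set obs 200
      generate byte geo = _n > 100
      generate bmi = rnormal(22 + geo, 3)

      * Histograms side by side, one panel per value of geo
      histogram bmi, by(geo)

      * Overlaid kernel density estimates
      twoway (kdensity bmi if geo == 0) (kdensity bmi if geo == 1), ///
          legend(order(1 "rural" 2 "urban"))

      * Dot plot of the outcome by group
      dotplot bmi, over(geo)
      ```

      Which of these is most readable will depend on the sample size and the audience, as noted above.
      
      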
      Last edited by Nick Cox; 02 Jun 2022, 03:29.



      • #4
        Thank you Carlo and Nick for your responses.

        I needed clusters to run conindex, as a cluster variable is one of the variables it requires. Nick, you are correct: I was also wondering how to cluster when I've already grouped my variables. Thanks for clarifying why it's not working. I tried kmeans and it worked. I hope it will not cause a problem when I run conindex.

        Thanks again gentlemen.
        Nthato



        • #5
          Sorry, but I don't know anything about conindex, which I guess to be a community-contributed command (from where? please see FAQ Advice #12). So I can't follow #4 beyond guessing -- once again -- that classification can't seriously help with your data. Obesity is a measured variable. If you get classes at all out of such a variable it's because there are a few individuals who have very high values -- or just possibly very low values. Conversely, if you have categorical predictors already, that's fine but there is nothing to classify.

