  • JACCARD similarity measure

    Hi, I am working on a similarity measure across different asset classifications. I have close to 12 asset classes for more than 50 banks, covering every quarter from 2000Q1 to 2019Q4 for each asset class. I would like to calculate the similarity among the asset classes using the Jaccard measure, but I do not know how. Can anyone offer a clue or a guide? My data set has been reshaped wide: BANKID identifies each bank, and each asset-class variable is suffixed with its quarter.
    BANKID  cash_bal2000Q1  securities2000Q1  fedfnd_revrepo2000Q1  loan_lease_hfs2000Q1
    100003  0.0523267       0.3282848         0.0035845             0.0051423
    100134  0.023778        0.2791913         0.0038878             0
    100135  0.0230704       0.2506895         0.00654               0.000572
    100144  0.1323925       0.0914672         0.0345783             0
    100154  0.0534938       0.0709148         0.0026255             0.0053682
    100161  0.0506434       0.2294578         0.0018176             0.0080809
    100165  0.0256106       0.2885543         0.0335942             0.0220141
    100173  0.0333732       0.2799982         0.0094496             0
    100184  0.0518307       0.2053013         0.0196109             0.0006451
    100185  0.0379655       0.3561531         0.004141              0.0001412
    100196  0.0873589       0.2289798         0.0161368             0.0026473

  • #2
    If you type
    Code:
    hsearch jaccard
    you will get several links, one of which is:
    Code:
    help measure_option
    There are also some relevant user-written commands, but the approach above finds them only if you already have them installed. Here is a way to find one (old) command:
    Code:
    search similari


    • #3
      The output of search jaccard points us to
      Code:
      [MV]    cluster programming utilities . Cluster-analysis programming utilities
              (help cluster programming)
      
      [MV]    matrix dissimilarity  . . Compute similarity or dissimilarity measures
              (help matrix dissimilarity)
      
      [MV]    measure_option  . . . Option for similarity and dissimilarity measures
              (help measure_option)
      
      [P]     matrix dissimilarity  . . Compute similarity or dissimilarity measures
              (help matrix dissimilarity)
      Looking into these, we see that among Stata's matrix commands described by the output of help matrix
      Code:
      help matrix dissimilarity
      describes a command that will compute a dissimilarity matrix from Stata variables, and the output of
      Code:
      help measure_option
      tells us the Jaccard measure is available as an option.

      This seems a fruitful place to start. In particular, it will probably benefit you to click the links at the top of each help file to read the PDF containing the full documentation, and if you are not already familiar with Stata's matrix commands (note: not Mata commands), you will certainly want to read that documentation as well.
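
      As a minimal sketch of how the two help files fit together: assuming you are willing to recode each asset share to a 0/1 indicator of whether the bank holds any of that asset (an assumption, not something stated in the question), the following computes Jaccard similarities between banks for 2000Q1. The has_ prefix and the matrix name J are made up for illustration.

      ```stata
      * Sketch only: the has_* indicators and the matrix J are hypothetical names.
      * Recode each 2000Q1 asset share to 1 if the bank holds any of that asset, else 0.
      foreach v of varlist cash_bal2000Q1 securities2000Q1 fedfnd_revrepo2000Q1 loan_lease_hfs2000Q1 {
          generate byte has_`v' = (`v' > 0) if !missing(`v')
      }

      * Jaccard similarity between banks (observations), computed over the indicators.
      * allbinary makes the command stop with an error if any value is not 0, 1, or missing.
      matrix dissimilarity J = has_*, Jaccard allbinary

      matrix list J
      ```

      help matrix dissimilarity also documents a variables option for comparing variables rather than observations, which may be closer to comparing asset classes across banks.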
      Last edited by William Lisowski; 25 May 2020, 10:03.


      • #4
        The Jaccard measure as I understand it is based on counts of similar and dissimilar categories. How could it apply here to measured values?
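
        For reference, a sketch of the binary definition in the usual 2x2 notation (the symbols a, b, c, d are introduced here for illustration): for two units compared over a set of 0/1 attributes, let a count the attributes present in both, b and c those present in only one of the two, and d those absent from both.

        ```latex
        J = \frac{a}{a + b + c}
        ```

        Joint absences d do not enter, so the measure depends only on which attributes are present, not on how large a measured value is.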
