
  • Calculating and appending the check digit of a SEDOL number

    Hello everyone, I am trying to calculate the check digit (the 7th character) of a SEDOL number from an existing 6-character (alphanumeric or numeric) SEDOL code. Does anyone have a .do file for this in Stata? I found implementations in many other languages, but not in Stata: http://rosettacode.org/wiki/SEDOLs or https://en.wikipedia.org/wiki/SEDOL
    Examples of codes (all securities listed on LSE)
    769663
    B00CRV
    B00DF1
    B00FPT


    Thank you in advance,
    A.

  • #2
    Code:
    clear*
    input str6 sedol
    769663
    B00CRV
    B00DF1
    B00FPT
    
    end
    
    local alphabet = c(ALPHA)
    local alphabet: subinstr local alphabet " " "", all
    matrix weights = (1, 3, 1, 7, 3, 9, 1)
    
    // CLEAN THE SEDOL VARIABLE
    replace sedol = upper(trim(itrim(sedol)))
    assert strlen(sedol) == 6
    
    gen checksum = 0
    gen this_char = ""
    gen alpha_location = .
    gen to_add = .
    forvalues j = 1/6 {
          replace this_char = substr(sedol, `j', 1)
          replace alpha_location = strpos("`alphabet'", this_char) 
          replace to_add = .
          replace to_add = weights[1, `j']*(alpha_location+9) if alpha_location > 0
          replace to_add = weights[1, `j']*real(this_char) if inrange(this_char, "0", "9")
          replace checksum = checksum + to_add
    }
    replace checksum = mod(checksum, 10)
    gen byte check_digit = 10 - checksum
    list, noobs clean
    This is a little tricky because Stata does not allow the free interchange between characters and numbers that is found in, for example, C and its descendants.

    Anyway, I think I got all the bugs out of this.

    Notes:
    1. If the variable sedol contains invalid characters (i.e., anything other than letters and digits), the result is a missing value.
    2. The sequence of -replace to_add = - statements inside the loop can be condensed to a single statement using nested -cond()- calls, but I found it completely unreadable that way.
    3. Evidently, you can drop the interim variables checksum, this_char, alpha_location, and to_add after the code runs.
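
For anyone curious what the condensed form mentioned in note 2 might look like, here is a minimal sketch. It is not the code from the post above, only one possible way to fold the two -replace to_add = - statements into nested -cond()- calls, and it stops at the weighted sum (before the -mod()- step). It assumes the same weights and the same letter-to-number mapping (A = 10, ..., Z = 35) as the code above, and that sedol is already cleaned to 6 characters.

```stata
clear
input str6 sedol
769663
B00CRV
end

local alphabet = c(ALPHA)
local alphabet: subinstr local alphabet " " "", all
matrix weights = (1, 3, 1, 7, 3, 9, 1)

gen checksum = 0
forvalues j = 1/6 {
    // letter -> alphabet position + 9; digit -> its value; anything else -> missing
    replace checksum = checksum + weights[1, `j'] * ///
        cond(strpos("`alphabet'", substr(sedol, `j', 1)) > 0, ///
             strpos("`alphabet'", substr(sedol, `j', 1)) + 9, ///
             cond(inrange(substr(sedol, `j', 1), "0", "9"), ///
                  real(substr(sedol, `j', 1)), .))
}
list, noobs clean
```

By hand, the weighted sums should come out as 121 for 769663 and 455 for B00CRV. This version needs no interim variables, so there is nothing to drop afterwards, but the nesting does illustrate why the longer form is easier to read.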




    • #3
      Question was cross-posted at http://stackoverflow.com/questions/3...ls-check-digit

      Please see the FAQ for our explicit policy on cross-posting, which is that you should tell us about it.

      8. May I cross-post to other forums?

      People posting on Statalist may also post the same question on other listservers or in web forums. There is absolutely no rule against doing that.
      But if you do post elsewhere, we ask that you provide cross-references in URL form to searchable archives. That way, people interested in your question can quickly check what has been said elsewhere and avoid posting similar comments. Being open about cross-posting saves everyone time.
      If your question was answered well elsewhere, please post a cross-reference to that answer on Statalist.
      I've cross-referenced this thread on Stack Overflow.



      • #4
        Dear Clyde Schechter, thank you for your code to calculate the SEDOL check digit. It seems to work, but there are instances where the check digit comes out as 10, which cannot be right, because the check digit has to be in the range 0-9. For example:
        sedol=B0J6N1 checksum=0 this_char=1 alpha_location=0 to_add=9 check_digit=10
        or
        sedol=000204 checksum=0 this_char=4 alpha_location=0 to_add=36 check_digit=10

        How do I fix this issue?

        Thank you in advance,
        A.



        • #5
          You will also perhaps have noticed that there are no cases where the check digit is 0.

          Try adding the third line below to Clyde's excellent code. If adding 10 to the checksum gives a multiple of 10, then so will adding 0.
          Code:
          replace checksum = mod(checksum, 10)
          gen byte check_digit = 10 - checksum
          replace check_digit = 0 if check_digit==10
          Alternatively,
          Code:
          replace checksum = mod(checksum, 10)
          gen byte check_digit = mod((10 - checksum),10)
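
          To see the fix on the two problem cases from #4, here is a small self-contained sketch that combines the loop from #2 with the -mod()- version above. The variable name sedol7 is my own choice for illustration; it holds the 6-character code with the check digit appended, as asked in the thread title.

```stata
clear
input str6 sedol
B0J6N1
000204
769663
end

local alphabet = c(ALPHA)
local alphabet: subinstr local alphabet " " "", all
matrix weights = (1, 3, 1, 7, 3, 9, 1)

gen checksum = 0
gen this_char = ""
gen to_add = .
forvalues j = 1/6 {
    replace this_char = substr(sedol, `j', 1)
    replace to_add = .
    // letters map to 10..35, digits to their own value
    replace to_add = weights[1, `j'] * (strpos("`alphabet'", this_char) + 9) ///
        if strpos("`alphabet'", this_char) > 0
    replace to_add = weights[1, `j'] * real(this_char) if inrange(this_char, "0", "9")
    replace checksum = checksum + to_add
}
// the outer mod() maps a result of 10 back to 0
gen byte check_digit = mod(10 - mod(checksum, 10), 10)
// sedol7 is a hypothetical name for the full 7-character code
gen str7 sedol7 = sedol + string(check_digit)
drop this_char to_add
list, noobs clean
```

          The weighted sums for B0J6N1 and 000204 are 150 and 50, both multiples of 10, so the check digit should be 0 for each; for 769663 the sum is 121 and the check digit should be 9. Note that if sedol contains an invalid character, check_digit is missing and string() would append ".", so it may be worth checking for missing values before building sedol7.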
          Last edited by William Lisowski; 06 Nov 2015, 06:09.



          • #6
            Dear Clyde Schechter,
            with the change to
            gen byte check_digit = mod((10 - checksum),10)
            the code works better and now returns check digits 0 to 9 instead of 1 to 10.
            Thank you for your earlier help,

            A.



            • #7
              Yes, and thanks to William Lisowski for fixing that error.
