  • how to chop off the last 2 or 3 digits

    I have a variable named CUSIP whose values are 6, 8 or 9 characters long (alphanumeric or numeric). I want to chop off the last 2 or 3 characters from the values that are 8 or 9 characters long, so that every value is 6 characters long.

  • #2
    Try
    Code:
    gen cusip_short = substr(CUSIP, 1, 6)



    • #3
      Originally posted by Jesse Wursten
      Try
      Code:
      gen cusip_short = substr(CUSIP, 1, 6)
      It does not work.



      • #4
        What error message do you get? Is your variable actually a string?
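
        A minimal sketch of how the storage type could be checked, assuming the variable is named exactly CUSIP as written in #1 (Stata variable names are case sensitive, so the name must match exactly):

        ```stata
        * Show the storage type (str# or strL for strings; byte/int/long/float/double for numbers)
        describe CUSIP

        * Exits with an error if CUSIP is not a string variable
        confirm string variable CUSIP
        ```

        If the variable turns out to be numeric, substr() cannot be applied to it directly.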



        • #5
          There is no error message, but when I check the values they are still 8 or 9 characters long. Here is my data. The first 58 values are 6 characters long, so I generated a dataex example with 75 observations:
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str17 cusip int fyear
          "360206"    1990
          "360206"    1991
          "360206"    1992
          "360206"    1993
          "360206"    1994
          "360206"    1995
          "360206"    1996
          "360206"    1997
          "360206"    1998
          "360206"    1999
          "360206"    2000
          "360206"    2001
          "360206"    2002
          "360206"    2003
          "360206"    2004
          "360206"    2005
          "360206"    2006
          "361105"    1988
          "361105"    1989
          "361105"    1990
          "361105"    1991
          "361105"    1992
          "361105"    1993
          "361105"    1994
          "361105"    1995
          "361105"    1996
          "361105"    1997
          "361105"    1998
          "361105"    1999
          "361105"    2000
          "361105"    2001
          "361105"    2002
          "361105"    2003
          "361105"    2004
          "361105"    2005
          "375204"    1997
          "375204"    1998
          "375204"    1999
          "375204"    2000
          "375204"    2001
          "375204"    2002
          "375204"    2003
          "375204"    2004
          "375204"    2005
          "375204"    2006
          "781104"    1989
          "781104"    1990
          "781104"    1991
          "781104"    1992
          "781104"    1993
          "781104"    1994
          "782102"    1992
          "782102"    1993
          "782102"    1994
          "782102"    1995
          "782102"    1996
          "782102"    1997
          "00086T103" 1995
          "00086T103" 1996
          "00086T103" 1997
          "00086T103" 1998
          "00086T103" 1999
          "00086T103" 2000
          "00086T103" 2001
          "00086T103" 2002
          "00086T103" 2003
          "00086T103" 2004
          "00086T103" 2005
          "00086T103" 2006
          "872309"    1989
          "872309"    1990
          "872309"    1991
          "872309"    1992
          "872309"    1993
          "872309"    1994
          end
          Last edited by Jahan zee; 31 May 2017, 03:34.



          • #6
            Code:
            clear
            input long whatever
            123456
            1234567
            12345678
            end
            
            gen whatev = ///
            cond(whatever < 1e6, whatever, cond(whatever  < 1e7, floor(whatever/10), floor(whatever/100)))
            
            list
            
                 +-------------------+
                 | whatever   whatev |
                 |-------------------|
              1. |   123456   123456 |
              2. |  1234567   123456 |
              3. | 12345678   123456 |
                 +-------------------+
            
            .
            or

            Code:
            gen WHATEV = real(substr(string(whatever, "%9.0f"), 1, 6))
            Last edited by Nick Cox; 31 May 2017, 04:12.



            • #7
              Did you look at the cusip_short variable? If the command did not give an error message, I do not see how the new variable could be longer than 6 characters. You can use replace CUSIP = ... instead of gen cusip_short = ... if you want to overwrite the original variable.
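
              A sketch of the in-place version, assuming the variable is named cusip (lower case) as in the data from #5. The assert line is an added check, not part of the suggestion in #2:

              ```stata
              * Overwrite the original variable with its first 6 characters
              replace cusip = substr(cusip, 1, 6)

              * Stops with an error if any value is still longer than 6 characters
              assert strlen(cusip) <= 6
              ```

              Overwriting discards the trailing characters for good, so keeping a copy of the original may be safer.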

              Nick Cox: The original variable is already a string, as it contains letters, e.g. "00086T103".



              • #8
                Jesse: Quite. I didn't see #5 when I was posting #6. I can't see why your solution in #2 is not bang on.



                • #9
                  There are two possibilities here.

                  1) In posts #1 and #2 the variable name was CUSIP, but in the sample data in post #5 the variable name is cusip. However, applying the code from #2 to the data from #5 generates an error message, but perhaps this was overlooked.

                  2) More likely, after applying the code from #2 to the data, the original poster looked at the existing variable cusip rather than the new variable cusip_short generated by the code.
                  Code:
                  . gen cusip_short = substr(cusip, 1, 6)
                  
                  . list in 58/69, clean
                  
                             cusip   fyear   cusip_~t  
                   58.   00086T103    1995     00086T  
                   59.   00086T103    1996     00086T  
                   60.   00086T103    1997     00086T  
                   61.   00086T103    1998     00086T  
                   62.   00086T103    1999     00086T  
                   63.   00086T103    2000     00086T  
                   64.   00086T103    2001     00086T  
                   65.   00086T103    2002     00086T  
                   66.   00086T103    2003     00086T  
                   67.   00086T103    2004     00086T  
                   68.   00086T103    2005     00086T  
                   69.   00086T103    2006     00086T
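
                  Both possibilities can be checked directly. The sketch below assumes the data from #5 are in memory and that cusip_short has already been generated as above; len_orig and len_short are hypothetical helper names introduced only for this check:

                  ```stata
                  * Possibility 1: variable names are case sensitive, so CUSIP and cusip differ
                  capture confirm variable CUSIP
                  display _rc        // nonzero if no variable has that exact upper-case name

                  * Possibility 2: compare the lengths of the original and the new variable
                  gen len_orig  = strlen(cusip)
                  gen len_short = strlen(cusip_short)
                  tabulate len_orig len_short
                  ```

                  The cross-tabulation shows the original lengths in the rows and the lengths of the new variable in the columns, which makes it clear which of the two variables is being inspected.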



                  • #10
                    Originally posted by William Lisowski
                    There are two possibilities here.

                    1) In posts #1 and #2 the variable name was CUSIP, but in the sample data in post #5 the variable name is cusip. However, applying the code from #2 to the data from #5 generates an error message, but perhaps this was overlooked.

                    2) More likely, after applying the code from #2 to the data, the original poster looked at the existing variable cusip rather than the new variable cusip_short generated by the code.

                    Thanks, I understand the problem.
