  • Extracting the first two digits of a number in the data

    Hello everyone,
    I am trying to generate industry dummies based on the first two digits of the industry classification code in my data. For example:

    sic_2
    5080
    5080
    5080
    5080
    5080
    3743
    3661
    3661
    3661
    3661
    3661
    3211
    2834

    I would like to generate a new variable that holds the first two digits of each sic_2 code above, e.g. 50, 37, 36, and so on.
    I tried several times in Excel, but the merge did not work for me.

    Many thanks and regards for your support,
    Faisal

  • #2
    I assume that sic_2 is a string variable that can contain leading zeros.
    Code:
    gen group = substr(sic_2,1,2)


    • #3
      Note that to use sic_2 to generate industry dummy variables, you may want to -encode- the string variable and then use factor variable notation (see -help fvvarlist-).
      Last edited by Brendan Cox; 22 Oct 2015, 12:07.
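
A minimal sketch of the -encode- route suggested in #3, assuming sic_2 is stored as a string. The toy data, the outcome y, and the names sic2 and sic2_id are made up for illustration and do not come from the original poster's dataset.

```stata
* Sketch only: toy data, y, sic2 and sic2_id are hypothetical
clear
input str4 sic_2 y
"5080" 1.2
"5080" 0.9
"3743" 2.1
"3661" 1.7
"3661" 1.5
"0123" 0.4
end

gen sic2 = substr(sic_2, 1, 2)   // two-character string: "50", "37", "36", "01"
encode sic2, gen(sic2_id)        // integer codes 1, 2, ... labelled with the strings
regress y i.sic2_id              // i. expands sic2_id into industry indicators
```

Because -encode- attaches the two-character strings as value labels, the output keeps the leading zero for codes such as "01".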


      • #4
        no reason to go through two steps; just edit the code in #2:
        Code:
         
         gen group = real(substr(sic_2,1,2))
        and then use factor variable notation as per #3
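
If explicit 0/1 dummy variables are wanted rather than factor-variable notation, one possible alternative (not proposed in the thread) is -tabulate- with the generate() option. The sketch below assumes sic_2 is a string; the toy data and the stub name ind_ are hypothetical.

```stata
* Sketch only: toy data and the stub ind_ are hypothetical
clear
input str4 sic_2
"5080"
"5080"
"3743"
"3661"
"0123"
end

gen group = real(substr(sic_2, 1, 2))   // numeric: 50, 50, 37, 36, 1
format group %02.0f                     // display 1 as 01
tabulate group, generate(ind_)          // ind_1 ... ind_4, one 0/1 variable per distinct value
```

The indicators are numbered in sorted order of group, so here ind_4 marks the observations with group equal to 50.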


        • #5
          Originally posted by Rich Goldstein View Post
          no reason to go through two steps; just edit the code in #2:
          Code:
          gen group = real(substr(sic_2,1,2))
          and then use factor variable notation as per #3
          Quite right. The only reason to use string variables and -encode- is that an SIC code would more usually appear as, e.g., "01" rather than "1" in the data and output.


          • #6
            Applying an appropriate format will display a SIC code stored as a numeric variable with leading zeroes as appropriate.
            Code:
            . set obs 1
            obs was 0, now 1
            
            . gen group = real(substr("0123",1,2))
            
            . format group %02.0f
            
            . list
            
                 +-------+
                 | group |
                 |-------|
              1. |    01 |
                 +-------+


            • #7
              Originally posted by William Lisowski View Post
              Applying an appropriate format will display a SIC code stored as a numeric variable with leading zeroes as appropriate.
              Code:
              . set obs 1
              obs was 0, now 1
              
              . gen group = real(substr("0123",1,2))
              
              . format group %02.0f
              
              . list
              
                   +-------+
                   | group |
                   |-------|
                1. |    01 |
                   +-------+
              Ah, good point. Hadn't thought to do that.


              • #8
                Unfortunately, it did not work for me.
                What I need is to keep only the first two digits of the sic_2 variable.
                For example, if the value of sic_2 is 5050, I only want to keep the first two digits, so it becomes 50.

                I really appreciate your cooperation


                • #9
                  The following table shows the new variable I made in Excel:

                  year  cusip_en   sic_2  two_digit of sic_2
                  2006  361105     5080   50
                  2007  361105     5080   50
                  2008  361105     5080   50
                  2009  361105     5080   50
                  2010  361105     5080   50
                  2006  00081T108  2780   27
                  2007  00081T108  2780   27
                  2008  00081T108  2780   27
                  2009  00081T108  2780   27
                  2010  00081T108  2780   27
                  2004  886309     3661   36
                  2005  886309     3661   36
                  2007  886309     3661   36
                  2008  886309     3661   36
                  2010  886309     3661   36
                  1992  915306     7380   73
                  1993  915306     7380   73


                  • #10
                    Please see FAQ Advice esp. #12: "did not work" is not an informative report. But we can try guesswork. If your variable is numeric then

                    Code:
                    gen mysic_2 = floor(sic_2/100)
                    is one technique.
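
Since the thread never establishes whether sic_2 is string or numeric, a hedged sketch that handles both cases may help. It assumes four-digit codes, with a three-digit numeric value such as 123 standing for SIC 0123; the toy data and the name sic2 are hypothetical.

```stata
* Sketch only: toy data and the name sic2 are hypothetical
clear
input sic_2
5080
3743
3661
2834
123
end

capture confirm string variable sic_2
if _rc == 0 {
    * string codes: first two characters, converted to a number
    gen sic2 = real(substr(sic_2, 1, 2))
}
else {
    * numeric four-digit codes: drop the last two digits
    gen sic2 = floor(sic_2/100)
}
format sic2 %02.0f   // display 1 as 01
list
```

Running -describe sic_2- first shows the storage type and indicates which branch applies to the real data.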


                    • #11
                      Originally posted by Faisal Abdullah View Post
                      unfortunately, it did not work for me.
                      From the FAQ:

                      Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!

                      [...]

                      Never say just that something “doesn't work” or “didn't work”, but explain precisely in what sense you didn't get what you wanted.
                      Several list members proposed solutions that do what you asked for. Which commands did you try and what was the result?
