
  • Fama French 12 industry out of sic codes

    Hey guys,

    I would like to create a variable for the Fama French 12 industry classifications (ffinds) with values from 1 to 12 representing the respective industry. I do have SIC codes in my data.
    According to Fama and French, each industry covers several ranges of SIC codes. Unfortunately, the SIC codes are stored as strings in my data, so I can't use something like:
    gen ffinds = 1 if ((sic<=0299 & sic>=0100) | (sic>=0700 & sic<=0799) | (sic>=0910 & sic<=0919) | sic==2048)

    Does anybody have an idea how to solve this problem?

    I already tried to recast the variable sic, but this didn't really work. E.g. if the sic code was originally 3089 the value after recasting was 107.

    Thanks in advance.

  • #2
    When you say you tried recasting it, what command did you use? The "destring" command seems like it ought to do exactly what you want.
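
    A minimal sketch of that approach, assuming sic is a genuine string variable that holds only digits (sic_num is a hypothetical name for the new variable):

    Code:
    * convert the string SIC codes to a new numeric variable, keeping the original
    destring sic, generate(sic_num)
    * numeric ranges then work as in the original attempt
    gen ffinds = 1 if inrange(sic_num, 100, 299) | inrange(sic_num, 700, 799) | inrange(sic_num, 910, 919) | sic_num == 2048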

    • #3
      "Unfortunately SIC codes are string in my data ... I already tried to recast the variable sic, but this didn't really work. E.g. if the sic code was originally 3089 the value after recasting was 107."

      How so? As far as I know, recast does not convert between numeric and string types (in either direction).

      What you need is destring, but I am definitely more interested in what you've done to turn "3089" into 107.

      For the SIC-to-FF conversion, recode is a better fit once sic is numeric:
      Code:
      recode sic (100/299 700/799 910/919 2048 = 1 "Label for first category") (....=2 "Label for second category"), generate(ff)
      Best, Sergiy Radyakin

      • #4
        Thanks for your answers. I tried the commands recast and destring. Both showed the value 107. When you browse the data, you see 3089 for the destringed SIC code, but as soon as you select the cell containing 3089, the value 107 is revealed.

        • #5
          It sounds like your sic variable contains data that was converted from string to numeric using encode. If that's the case, the simplest fix is to restore the original string values using decode and then map the SIC codes to Fama-French codes.
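
          A rough sketch of how one might check and undo this, assuming sic really is a numeric variable with value labels attached by encode (sic_str is a hypothetical variable name):

          Code:
          * a value label name in the describe output suggests the variable was encoded
          describe sic
          * show the underlying integer codes (e.g. 107) instead of the labels (e.g. 3089)
          list sic in 1/5, nolabel
          * restore the label text as a new string variable
          decode sic, generate(sic_str)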

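          Note that SIC codes are hierarchical and are usually stored as strings. You do not need to convert them to numeric to map them to Fama-French codes. You can use any relational operator to compare strings, even over ranges. This works because strings are compared character by character, so four-character, zero-padded codes sort in the same order as their numeric values. Here are some (fake) examples: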

          Code:
          clear
          input str4 sic
          0115  
          1081  
          2399  
          2999  
          3324  
          3679  
          4899  
          5048  
          5181  
          6062  
          7291  
          7374  
          7389  
          7699  
          7941  
          7997  
          8049  
          8093  
          8243  
          8322 
          end
          
          gen ffinds = .
          replace ffinds = 1 if inrange(sic,"0100","1999")
          replace ffinds = 2 if inrange(sic,"2700","3799")
          replace ffinds = 3 if inlist(sic,"7699","7941")
          replace ffinds = 4 if sic == "8093"
          replace ffinds = 5 if sic > "8093"
