  • Uniforming string variable with loop

    Hi everyone! I ran into the following problem while doing some data cleaning.
    I have a list of 30k+ observations that looks something like this:
    application_id firm_name
    23 APPLE
    24 APPLE
    24 GOOGLE
    26 APPLE INC.
    27 APPLE & CO.
    32 APPLE INC. (USA)
    72 GOOGLE CORP.
    75 GOOGLE CORP.

    I would like to write code that standardizes the different firm names, ideally to the shortest variant (e.g. "APPLE").

    My code is:

    levelsof firm_name, local(names)

    foreach n in `names' {
    gen presence = strpos(firm_name, `n') > 0
    replace firm_name = `n' if presence==1
    drop presence
    }

    But I am getting an "invalid name" type of error.

    Many thanks for the help!

  • #2
    I'm not sure exactly what you want, but it appears that just extracting the first word of firm_name would work. Here is some possible code (untested, since you did not follow the advice in the FAQ and use -dataex-):
    Code:
    * keep only the first whitespace-delimited word of each firm name
    gen str firm_namew = word(firm_name, 1)
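
    As for the loop in #1, the error probably arises because `n' is substituted without quotes, so Stata reads APPLE as a name rather than as a string. Below is a minimal sketch of a quoted version, assuming firm_name is an ordinary string variable. The example data and the variable firm_clean are hypothetical, added only for illustration. Note that substring matching can over-merge: any name that happens to be contained in another (say "APP" inside "APPLE") would absorb it.

```stata
* Hypothetical example data, modelled on the listing in #1
clear
input int application_id str20 firm_name
23 "APPLE"
24 "APPLE"
24 "GOOGLE"
26 "APPLE INC."
27 "APPLE & CO."
32 "APPLE INC. (USA)"
72 "GOOGLE CORP."
75 "GOOGLE CORP."
end

* Work on a copy (hypothetical name) so the original stays intact
gen firm_clean = firm_name

* levelsof returns the distinct names in sorted order, each in compound quotes
levelsof firm_name, local(names)

* -foreach ... of local- keeps multi-word names together as one element;
* the compound quotes around `n' pass it to strpos() as a string literal
foreach n of local names {
    replace firm_clean = `"`n'"' if strpos(firm_clean, `"`n'"') > 0
}
```

    Because the names are processed in sorted order, "APPLE" comes before "APPLE INC." and so the shorter form wins here, but that relies on the short name being a leading substring of the longer ones.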
