  • How to change the case for special characters

    Hello

    I have a variable containing the municipality of each data point, with every name written in uppercase letters (e.g., SAN VICENTE DEL CAGUÁN). I want them capitalized (e.g., San Vicente del Caguán). I tried the strproper() function, but it is not working quite as I hoped. First, it does not handle special characters such as á: they are left in uppercase, *and* so is the next letter (e.g., CaguÁN). Second, it also capitalizes "del", "de", and similar words, which I was hoping to leave in lowercase. It's a long list of municipalities. Is there any way to fix these glitches more efficiently than correcting every name with special characters or "de" by hand?

    Best,
    Maria

  • #2
    Originally posted by Mariia Chavez
    I have a variable with the municipalities of each datapoint where each municipality is written in uppercase letters (e.g., SAN VICENTE DEL CAGUÁN). I want them to be capitalized (e.g., San Vicente del Caguán).
    If you can wait a short while, someone will post an elegant regular expression solution.

    But if not, then you can try something like the following.
    Code:
    clear *
    
    input str200 muni_name
    "SAN VICENTE DEL CAGUÁN"
    end
    
    *
    * Begin here
    *
    generate str tmn = ustrtitle(muni_name)
    foreach lc in "Del " "De " {
        replace tmn = usubinstr(tmn, "`lc'", ustrlower("`lc'"), .)
    }
    list, noobs



    • #3
      Thank you! This was perfect



      • #4
        You're welcome. I'm glad that it worked out well for you.



        • #5
          This is similar to #2, but avoids the loop:

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input strL muni_name
          "SAN VICENTE DEL CAGUÁN" 
          "VALLE DE BRAVO"          
          "VILLA DEL ROSARIO"       
          "SAN MARTÍN DE LOS ANDES"
          "SANTIAGO DEL ESTERO"     
          "PEÑAS DE SAN PEDRO"     
          "CASTILLA DEL PINO"       
          "SAN PEDRO DE MACORÍS"   
          end
          
          
          gen wanted = ustrregexra(ustrtitle(muni_name, "es"), "\bD(e|el)\b", "d$1")
          Res.:

          Code:
          . l, sep(0)
          
               +---------------------------------------------------+
               |               muni_name                    wanted |
               |---------------------------------------------------|
            1. |  SAN VICENTE DEL CAGUÁN    San Vicente del Caguán |
            2. |          VALLE DE BRAVO            Valle de Bravo |
            3. |       VILLA DEL ROSARIO         Villa del Rosario |
            4. | SAN MARTÍN DE LOS ANDES   San Martín de Los Andes |
            5. |     SANTIAGO DEL ESTERO       Santiago del Estero |
            6. |      PEÑAS DE SAN PEDRO        Peñas de San Pedro |
            7. |       CASTILLA DEL PINO         Castilla del Pino |
            8. |    SAN PEDRO DE MACORÍS      San Pedro de Macorís |
               +---------------------------------------------------+
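
          Note that row 4 keeps "Los" capitalized, because the pattern above only targets "De" and "Del". If articles such as "la", "las", and "los" should also be lowercased, something like the following might work. This is only a sketch under assumptions: the list of particles is a guess at what is wanted, the three example names are made up for illustration, and the lookbehind (?<=\s) is used so that a leading word (e.g., "La" in "La Paz") stays capitalized.

          ```stata
          clear
          input str40 muni_name
          "SAN MARTÍN DE LOS ANDES"
          "LA PAZ"
          "SAN JOSÉ DE LAS MATAS"
          end

          * Title-case using Spanish locale rules
          gen wanted = ustrtitle(muni_name, "es")

          * Lowercase "De"/"Del" only when preceded by whitespace
          replace wanted = ustrregexra(wanted, "(?<=\s)D(e|el)\b", "d$1")

          * Same idea for the articles "La", "Las", "Los" (assumed list; extend as needed)
          replace wanted = ustrregexra(wanted, "(?<=\s)L(a|as|os)\b", "l$1")

          list, noobs
          ```

          Each particle family gets its own call because the replacement only swaps the first letter and reuses the rest of the match through $1.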
