  • remove special characters from string

    Dears,

    I know this topic has been explained in other posts, but I cannot solve my problem.
    I have more than 1000 string names in one variable, some of which contain special characters that Stata displays like this: Sp�ldzielnia, Sch�negger (they appear as a rectangle).
    I tried to replace this character with the following code, but I get an error message.

    Code:
    replace var= subinstr(var, `=char(160)', "", .)
    � invalid name
    r(198);
    Could you please help me to find a solution?

    thanks

    Federica

  • #2
    Try:
    Code:
    replace var= subinstr(var, "`=char(160)'", "", .)

    • #3
      No, is not working :-(

      • #4
        No, is not working :-(
        No, is not helping! "is not working" could mean lots of different things. It isn't possible to troubleshoot imaginary problems.

        To get help post an example of your data (use -dataex-), the exact code you ran (copy/pasted from your log file or the Results window without any editing), any output and error messages Stata gave you (also copy/pasted without editing), the state of the example data after the code was run (using -dataex- again) and, if it isn't obvious, an explanation of how what you got isn't what you want.

        • #5
          Yes, I know you are right! :-)
          I "solved" the problem by going back to the original Excel spreadsheet, copying the variable again, and using the following command to convert upper case to lower case:

          Code:
          gen new_var=substr(var, 1, 1)+ lower(substr(var, 2, .))
          However, I have not solved the issue completely. What I get with this command is the following:
          LÁcteos terra de melide
          Hermanos lÓpez
          Leche rÍo

          while I would also like the special characters to be converted to lower case and each starting letter to be upper case, like the following:
          Lácteos Terra de Melide
          Hermanos lópez
          Leche río

          Nevertheless, I would like to understand why I was not successful in achieving the results I wanted with the previous command. I copy the original data below. The command I used to change the string is the one above (I have also tried the one you suggested).
          Uni�ns agrarias
          J�venes Agricultores
          Cooperativa Mop�n

          • #6
            Hi Federica,

            my recommendation would be: Don't copy-and-paste from Excel spreadsheets; use -import excel- instead (if you've already done so: great!). It features an option -locale("locale")- which enables Stata to import the source data in the correct encoding straight away. This will make end-of-pipe conversion unnecessary.

            All you need to find out is the character encoding of your source Excel spreadsheet. This depends on the locale settings of the computer that originally created the spreadsheet -- you could try 'cp1252' as a start.

            See -help import excel- for details.
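
            As a minimal sketch of that import, assuming a hypothetical workbook called firms.xlsx with variable names in its first row (the file name is invented for illustration, and 'cp1252' is only the starting guess mentioned above, not a confirmed encoding for your data):

            Code:
            * Hypothetical file name; replace with the path to your own workbook
            * locale() tells Stata which locale/encoding the workbook uses (cp1252 is a guess)
            import excel using "firms.xlsx", firstrow locale("cp1252") clear
            * Inspect the first few rows to check whether the accented letters display correctly
            list in 1/5
            If the names still show the rectangle, the guess was probably wrong and another locale is worth trying.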

            Regards
            Bela
