
  • Proper display of foreign language (traditional Chinese) using Stata 14

    Hi folks, I am working with a dataset that contains traditional Chinese characters. Below is the syntax I used to get Stata to display the data properly:

    Code:
    unicode encoding set "GB18030"
    Code:
    unicode translate data.dta, invalid(mark) transutf8
    However, the data still contain unrecognizable characters, and I don't know why. Others seem to have gotten Chinese to display properly with this syntax, but that is not the case for me. Any clues? Thanks.

  • #2
    clear
    unicode analyze ##.dta
    unicode encoding set gb18030
    unicode retranslate ##.dta, invalid(mark) transutf8
    use ##, clear



    • #3
      GB18030 is the encoding normally used for simplified Chinese. For traditional Chinese, try "windows-950-2000" instead. Because the dataset has already been translated with the wrong encoding, you must first restore it to its original form.

      Code:
      clear
      unicode restore data.dta
      Then you may run:

      Code:
      clear
      unicode encoding set windows-950-2000
      unicode translate data.dta
      Reading -help unicode_translate- carefully will help.
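
      For reference, the restore-then-translate sequence can be sketched end to end. This is only a sketch, assuming the file is named data.dta as in the original post, that the backup Stata made during the first translation is still in place, and that the source text really is Big5 (windows-950-2000) encoded. The -unicode analyze- check and the invalid(mark) option are carried over from the earlier posts rather than required by this reply.

      ```stata
      clear
      * Undo the earlier GB18030 translation using the backup Stata kept
      unicode restore data.dta
      * Check whether the restored file still needs translation
      unicode analyze data.dta
      * Declare the source encoding (assumed here to be Big5 / code page 950)
      unicode encoding set windows-950-2000
      * Translate to UTF-8; invalid(mark) flags any characters that cannot be converted
      unicode translate data.dta, invalid(mark)
      * Load the translated dataset and inspect the strings
      use data.dta, clear
      ```

      If characters are still flagged as invalid after this, the source encoding is probably something other than windows-950-2000, and the same restore step can be repeated before trying another encoding.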
