I use Stata 14 and am trying to create a CSV file containing strings with accented characters that can be opened easily by users of Excel. Take the data below.
If this CSV file is opened in Excel from File - Open or by double-clicking on the file, the accented characters are not displayed correctly but look like this:
To display the characters correctly in Excel, it is necessary to import the data as a delimited text file and to specify UTF-8 encoding. I tried converting the CSV file with the commands below.
Stata tells me that the file is already in UTF-8 format and does nothing.
What am I doing wrong and how can I create a CSV file that can be opened in Excel without going through the file import dialog?
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str53 NOTE
"Encuesta de Caracterización Socioeconómica Nacional"
"Encuesta de Hogares de Propósitos Múltiples"
end
export delimited using "test.csv", delim(",") replace
Code:
Encuesta de Caracterización Socioeconómica Nacional
Encuesta de Hogares de Propósitos Múltiples
Code:
clear
unicode analyze "test.csv"
unicode encoding set "latin1"
unicode translate "test.csv"
Code:
. clear
. unicode analyze "test.csv"
(Directory ./bak.stunicode created; please do not delete)
File summary (before starting):
1 file(s) specified
1 file(s) to be examined ...
File test.csv (text file)
3 lines in file
1 lines ASCII
2 lines UTF-8
File does not need translation, except ...
The file appears to be UTF-8 already. Sometimes files that still need translating
can look like UTF-8. See lines 2 and 3. A total of 2 lines out of 3 appear to be
UTF8.
--------------------------------------------------------------------------------------
File summary:
all files okay
. unicode encoding set "latin1"
(default encoding now latin1)
. unicode translate "test.csv"
(using latin1 encoding)
File summary (before starting):
1 file(s) specified
1 file(s) already known to be UTF8 in previous runs
0 file(s) to be examined ...
(nothing to do)
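For context, unicode translate converts files from an extended ASCII encoding to UTF-8, so it has nothing to do on a file that export delimited has already written as UTF-8. One possible workaround, offered only as a sketch, is to go in the opposite direction with unicode convertfile and write a second copy of the file in the encoding Excel assumes when a CSV file is double-clicked. The sketch below assumes that encoding is Windows-1252 (typical for Western European and American Windows installations, but not guaranteed), and the output file name test_excel.csv is hypothetical.

```stata
* Sketch only: assumes Excel on the target machines reads CSV files as Windows-1252
clear
input str53 NOTE
"Encuesta de Caracterización Socioeconómica Nacional"
"Encuesta de Hogares de Propósitos Múltiples"
end

* Stata 14 writes this file as UTF-8
export delimited using "test.csv", delim(",") replace

* Convert the UTF-8 file to Windows-1252; the source encoding defaults to UTF-8
* "test_excel.csv" is a hypothetical name for the converted copy
unicode convertfile "test.csv" "test_excel.csv", dstencoding(Windows-1252) replace
```

Characters that do not exist in Windows-1252 cannot be represented in the converted file, so this approach would only suit data limited to Western European accented characters.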
