My task is to append many separate .dta files to create a panel dataset. For convenience, I used the Chinese characters in the first row as the variable names. What frustrates me is that, after appending, many visually indistinguishable variable names coexist in the panel dataset.

Most likely there are invisible spaces around one of the variable names in each pair, but the --trim-- command did not work in my case. This thread https://www.stata.com/statalist/arch.../msg00891.html gave me some intuition, but I cannot quite figure out how to read (and use) the results from the --charlist-- command.
Finally, I have attached a data sample and hope someone can check it and give me some advice. The sample contains a firm id and a year id, plus three pairs of visually indistinguishable variable names. I have also attached several of the .dta files from before the append. My Stata version is Stata/MP 14.
https://www.dropbox.com/s/0pqef9hgta...ample.rar?dl=0
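One way to test the invisible-character guess might be to print each variable name next to its Unicode code points. This is only a minimal sketch: it assumes the appended panel is already in memory and that the Unicode string function ustrtohex(), introduced in Stata 14, is available. It has not been run against the attached sample.

```stata
* Sketch only: assumes the appended dataset is already loaded in memory.
* For every variable, show the name followed by its code points in
* escaped hex form, so that look-alike names can be compared by eye.
foreach v of varlist _all {
    display "`v': " ustrtohex("`v'")
}
```

If two names that look identical on screen produce different hex sequences, the characters that differ would show which one needs to be renamed.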
