Colleagues,
I'm interested in renaming variables with their variable label values, as discussed here on Statalist. The problem I'm currently facing is that names created this way won't be syntactically correct, because the variable labels contain spaces and other illegal characters. Is there a neat way of producing syntactically correct variable names from a given string, broadly along the lines of the R make.names function, which shortens a string and removes illegal characters so that it can be used as a variable name?
As an example, I have the following string: Aged 24 and under, claiming for over 12 months_Sep-85_Total. I'm not particularly fussy about the actual variable name as long as it contains the information that is crucial for me (age, over period, date, total identifier).
I know that I can address this problem by manipulating the strings in a loop, but ideally I would do it through a single function, as I will have to import numerous sets of variables and I am not keen to modify the code to specify how many characters to remove for which strings. Along similar lines, it occurs to me that Stata should have something like make.names, as it attempts to generate syntactically correct and meaningful variable names when importing data from foreign files.
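To make the request concrete, here is a rough Python sketch of the behaviour I have in mind. It is only an illustration, not a Stata solution: the function name make_name is hypothetical, and I am assuming the usual Stata naming rules (at most 32 characters; letters, digits and underscores only; no leading digit), restricted to ASCII for simplicity.

```python
import re

MAX_LEN = 32  # assumed limit: Stata variable names hold at most 32 characters


def make_name(label, max_len=MAX_LEN):
    """Hypothetical helper: turn an arbitrary label into a name-like string."""
    # Replace each run of characters that are not ASCII letters, digits
    # or underscores with a single underscore.
    name = re.sub(r"[^A-Za-z0-9_]+", "_", label)
    # Collapse repeated underscores and drop leading/trailing ones.
    name = re.sub(r"_+", "_", name).strip("_")
    # A name may not be empty or start with a digit, so prefix an underscore.
    if not name or name[0].isdigit():
        name = "_" + name
    # Plain truncation to the maximum length (no abbreviation logic).
    return name[:max_len]


label = "Aged 24 and under, claiming for over 12 months_Sep-85_Total"
print(make_name(label))
```

With plain truncation like this, the trailing date and total parts of my example string are cut off, and different labels can collapse to the same name, which is exactly why I am hoping for something smarter than hand-rolled string manipulation.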