Dear Statalist users,
Since I imported my data from Excel into Stata, all variables have been named A, B, C (and so on), following the usual Excel convention, while the "real" (as in meaningful) variable names appear in the first observation. I have found a useful add-on (renvars) as well as some code originally written by Nick Cox in another thread:
renvars , map(word(@[1], 1))
that allows me to use the first word in the first observation as a new variable name.
However, I have two problems.
First, in my Excel data some column headers contain characters that are not allowed in Stata variable names (such as . / -). So if I run the code above, I get the error message "/ invalid name". I found the add-on cleanchars, which removes all such characters from the whole data set. However, I would like them to be removed only in the first observation (the row that holds the headers). How could I do this?
Second, some of the headers start with the same word, for example "firm name" and "firm number of employees". If I run the code above on these, I get the error message "new variable name entered more than once". How can I address this issue? Is there an option in renvars, or any other command or add-on, that allows me to use all words of the header as the variable name, so that each variable is uniquely identified?
Ideally, I am looking for an option that solves both problems by running the command "by force", i.e. deleting characters that are not allowed in names and adding identifiers (e.g. numbers: firm1, firm2, ...) if a variable name occurs more than once.
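To make the intended behaviour concrete, here is a rough sketch of the kind of loop I have in mind. It is not taken from renvars or from the other thread. It assumes that every variable was imported as a string, that no header is empty, and that no cleaned header coincides with one of the existing names A, B, C, ...; the local macro names (used, base, new, i) are my own. Note that strtoname() replaces illegal characters with underscores rather than deleting them, and that the sketch leaves the first occurrence of a name without a number (firm, firm2, ...) instead of producing firm1, firm2.

```stata
* Sketch only: assumes all variables are strings and that
* observation 1 holds the header text for each column.
local used
foreach v of varlist _all {
    * take the first word of the header; strtoname() turns
    * characters such as . / - into underscores
    local base = strtoname(word(`v'[1], 1))
    local new `base'
    local i = 1
    * append a counter until the candidate name is not yet used
    while `: list new in used' {
        local ++i
        local new `base'`i'
    }
    local used `used' `new'
    rename `v' `new'
}
* the header row is no longer needed as data
drop in 1
```

To use all words of the header instead of only the first, the line defining base could presumably be changed to strtoname(trim(`v'[1])), keeping in mind that variable names are limited to 32 characters, so an appended counter may not fit for long headers.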
Many thanks,
Milan Quentel