Advanced methods to rename variables

Milan Quentel

Join Date: Nov 2016

Posts: 52
#1

Advanced methods to rename variables

01 Nov 2016, 02:30

Dear Statalist User,

since I have imported my data from excel to Stata, all variables got named A, B, C (and so on) after the usual excel convention while the "real" (as in meaningful) variable names appear in the first observation. I have found a useful addon (renvars) as well as some code originally written by Nick Cox in another thread:

renvars , map(word(@[1], 1))

that allows me to use the first word in the first observation as a new variable name.

However, I have two problems.

First, in my excel data some column headers contain characters that are not allowed as variable names in Stata (such as ./-). So if I run the code above, I get the error message "/ invalid name". I found the addon cleanchars that removes all such characters in the data set. However, I would like them to be removed only in the first column. How could I do this?

Secondly, some of the names do start with the same name. For example, the two variables "firm name" and "firm number of employees", so if I run the code as above, I will get the error message "new variable name entered more than once". How can I adress the issue? Is there an option renvars or any other command/addon that allows me to use all words as variable name, so each variable is uniquely identified?

Ideally, I am looking for an option that solves both problems by running the command "by force", i.e. deleting characters when they are not possible as names and adding identifiers (eg. numbers: firm1, firm2, ...) if variable names occur more than once.

Many thanks,
Milan Quentel

Last edited by Milan Quentel; 01 Nov 2016, 02:32. Reason: Added tags
Tags: cleanchars, import from excel, rename variables, renvars
daniel klein

Join Date: Mar 2014

Posts: 3827
#2

01 Nov 2016, 02:54

Try

Code:

renvars , map(strtoname(word(@[1], 1)))

Best
Daniel
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35468
#3

01 Nov 2016, 02:57

Thanks for mentioning renvars but it was jointly written with Jeroen Weesie and should be cited as published in the Stata Journal (FAQ Advice #12).

Best solution: I would read the data in again with import excel, firstrow which has the functionality for this problem. Importing metadata as data is best avoided at source. This is a case of Don't do that then! http://www.catb.org/jargon/html/D/Do...hat-then-.html

Otherwise (second best solution at best):

Code:

renvars , map(strtoname(word(@[1], 1)))

would solve part of your problem but it can do nothing about duplicate names. The problem of duplicate names obtained in this way is undoubtedly programmable, but Stata has done it already in import excel and it's likely to be a waste of anybody's time to try to do it for themselves.
1 like
Comment
Milan Quentel

Join Date: Nov 2016

Posts: 52
#4

01 Nov 2016, 03:15

Hi Daniel, Hi Nick, and everyone else who is reading along,

thanks for your answers. Your suggestion, Daniel, is a good way to adress the first part of the problem. This being said, I did not now about the import excel, firstrow option and it is a straightforward way to deal with both problems at the same time. So, thank you Nick.

[As for correct citations: Sorry I missed that advice. Will cite correctly in my next post.]
Comment

Announcement

Advanced methods to rename variables

Comment

Comment

Comment