Hello everyone!
I have a large dataset, in long format, that I would like to reshape to wide format. However, some of my variable names are too long (but less than 32 characters long, so no errors when importing to Stata), and when I try the reshape command, I get an error that "thisvariableislongbutlessthan32_1" invalid variable name r(198);.
Is there a way around this? Unfortunately, it is not possible to rename the variables before importing the dataset, I use REDCap and if I change the variable name there, I lose all data that was associated with that variable.
Is there a way to change variable names for multiple variables at once? My dataset has 1423 variables and I found that 104 of them have 30 to 32 characters, so changing names one by one would consume a lot of time. It would be really helpful if I could do something like drop the last 4 or 5 characters of variables with more than 30 characters, for example (or the first 5 characters, anything really, as long as there is a pattern). Also, the variables names have some level of information, so I would like to avoid renaming them to "var1", "var2", "var3", etc., if possible (for example, one variable is "viral_infection_anatomopatholog": it would be fine as "viral_infection_anatomopat", and much more informative than "var97").
Of course, if there is another way to solve this problem (without renaming variables), that would be very welcome too!
Below, an artificial dataset to illustrate my problem:
. reshape wide variablewithtoomanycharacters_01- variablewithtoomanycharacters_05, i(id) j(_j) string
(note: j = _1 _2 _3 _4 _5)
variablewithtoomanycharacters_01_1 invalid variable name
r(198);
Thank you!!
Leo
I have a large dataset, in long format, that I would like to reshape to wide format. However, some of my variable names are too long (but less than 32 characters long, so no errors when importing to Stata), and when I try the reshape command, I get an error that "thisvariableislongbutlessthan32_1" invalid variable name r(198);.
Is there a way around this? Unfortunately, it is not possible to rename the variables before importing the dataset, I use REDCap and if I change the variable name there, I lose all data that was associated with that variable.
Is there a way to change variable names for multiple variables at once? My dataset has 1423 variables and I found that 104 of them have 30 to 32 characters, so changing names one by one would consume a lot of time. It would be really helpful if I could do something like drop the last 4 or 5 characters of variables with more than 30 characters, for example (or the first 5 characters, anything really, as long as there is a pattern). Also, the variables names have some level of information, so I would like to avoid renaming them to "var1", "var2", "var3", etc., if possible (for example, one variable is "viral_infection_anatomopatholog": it would be fine as "viral_infection_anatomopat", and much more informative than "var97").
Of course, if there is another way to solve this problem (without renaming variables), that would be very welcome too!
Below, an artificial dataset to illustrate my problem:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float id str10 _j float(variablewithtoomanycharacters_01 variablewithtoomanycharacters_02 variablewithtoomanycharacters_03 variablewithtoomanycharacters_04 variablewithtoomanycharacters_05) 1 "_1" 1 1 1 1 1 1 "_2" 1 1 1 1 1 1 "_3" 1 1 1 1 1 1 "_4" 1 1 1 1 1 2 "_1" 1 1 1 1 1 2 "_2" 1 1 1 1 1 2 "_3" 1 1 1 1 1 3 "_1" 1 1 1 1 1 3 "_2" 1 1 1 1 1 3 "_3" 1 1 1 1 1 3 "_4" 1 1 1 1 1 3 "_5" 1 1 1 1 1 end
(note: j = _1 _2 _3 _4 _5)
variablewithtoomanycharacters_01_1 invalid variable name
r(198);
Thank you!!
Leo
Comment