I am using encode to convert string variables into numeric, e.g. "chemistry" might get encoded to 1, "physics" to 2, etc. The problem is that there are 17 variables that use the same string codes but not all of them contain all 100 categories, e.g. for var2, if no one was in chemistry then "physics" could get encoded as 1 instead of 2.
Is there any easy way to get consistent encoding across variables? I can think of harder ways, e.g. a recode command where I recode 100 values, but I wonder if there isn't something simpler.
It will be even harder, of course, if the vars sometimes have different categories, e.g. "physics" appears in var2 but not var1. So, I suppose you would want an encoding based on all the categories in all 17 vars. I guess I could get all the categories in a file, encode it, and then merge, but this too seems tedious. I think I would have to repeat the process 17 times.
This seems like a common enough problem that someone would have written a routine for it. But maybe not.
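One possibility, offered as a minimal sketch rather than a definitive answer: encode accepts a label() option, and when the named value label already exists it reuses the existing mappings and only adds codes for strings it has not seen before. Looping over the variables with one shared label should therefore keep "physics" on the same number everywhere. The toy data, the variable names var1 and var2, the _num suffix, and the label name subject are all made up for illustration.

```stata
* Hypothetical toy data standing in for the 17 string variables
clear
input str10 var1 str10 var2
"chemistry" "physics"
"physics"   "biology"
"biology"   "physics"
end

* Encode every variable against the same value label ("subject" is an
* arbitrary name). The first call creates the label; later calls reuse
* its existing codes and append any categories not yet in it.
foreach v of varlist var1 var2 {
    encode `v', generate(`v'_num) label(subject)
}

* Inspect the shared mapping
label list subject
```

With this approach the codes are alphabetical only within the first variable encoded; categories that first appear in later variables are appended after the existing codes. If a fixed order across all variables matters, the label would have to be defined up front (for example with label define) before the loop, and the noextend option of encode can then be used to flag any string that is not already in the label.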