  • How to rename variables with a separate header file?

    I am trying to merge a .txt header file into my dataset (which has over 114 variables). Is there a way to mass-rename variables based on another file?

    This is the dataset
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(var1 var2 var3)
    1 2 3
    2 3 4
    end
    This is the header file
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str4 varcode str6 varname
    "1" "Age"   
    "2" "Race"  
    "3" "Gender"
    end
    This is the desired file
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(Age Race Gender)
    1 2 3
    2 3 4
    end

  • #2
    Since order presumably matters, you want to loop over observations (the only time I'll say this is a good idea!) and build a macro. That is, loop over the rows of the header dataset and construct a local macro (say, names) that contains Age Race Gender.

    then in the first frame/dataset you just do
    Code:
    rename * (`names')
    This is my initial reaction, but presumably Nick Cox or daniel klein have better ideas, assuming I'm understanding you well.
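
A minimal sketch of that loop, using the two example datasets from #1. Two things here are my assumptions rather than part of the suggestion above: the header rows are taken to be in the same order as the variables, and the final step lists the current variables with ds and uses the grouped form of rename instead of the wildcard line shown above.

```stata
* Header data from #1 (varcode and varname)
clear
input str4 varcode str6 varname
"1" "Age"
"2" "Race"
"3" "Gender"
end

* Loop over the observations and collect the new names in row order
local names
forvalues i = 1/`=_N' {
    local names `names' `=varname[`i']'
}

* Main data from #1 (local macros survive -clear-)
clear
input float(var1 var2 var3)
1 2 3
2 3 4
end

* Rename the whole variable list at once; both lists must have equal length
quietly ds
rename (`r(varlist)') (`names')
describe
```

With real files, the two input blocks would be replaced by use commands for the header and the main dataset.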


    • #3
      Coincidentally, I was working on a program last night for this very purpose: replacing outlandishly long variable names in a validation dataset without having to spell out the original name in each rename. See my example program below, which uses the auto dataset, with a sample call at the end of the code block. The part I was struggling with was making sure I don't miss a new name halfway through the new list; I could not see how to make the program detect where I forgot to supply one.
      But I will admit Jared Greathouse's one-liner does the trick in far fewer lines of code.

      Code:
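      sysuse auto, clear
      
      capture program drop renamer
      program define renamer
          version 18
          syntax namelist
          * current variable names, in dataset order
          quietly ds
          local oldnames `r(varlist)'
          local w : word count `oldnames'
          local nc : word count `namelist'
          * stop with an error unless there is exactly one new name per variable
          if `w' != `nc' {
              display as error "`nc' new names supplied for `w' variables"
              exit 198
          }
          * rename position by position
          forvalues i = 1/`w' {
              local old : word `i' of `oldnames'
              local new : word `i' of `namelist'
              quietly rename `old' `new'
          }
      end
      
      renamer m p mp rep head tr wei len tur dis gr for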


      • #4
        It wouldn't be a one-liner technically, since you'd need to do a loop that's at least 4 lines, I think, but yeah, I think my approach may be cleaner, even though I generally don't like to loop over observations.


        • #5
          Is the header file already in Stata as a dta file? If not, then based on the information in #1, I'd say you want something like this:

          Code:
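          use "dataset"
          
          tempname fh
          
          * open the plain-text header file for reading
          file open `fh' using "header.txt" , read
          
          file read `fh' line
          
          * r(eof) becomes 1 once file read reaches the end of the file
          while ( !r(eof) ) {
              
              * first token is the variable number, the rest is the new name
              gettoken number name : line
              
              rename var`number' `name'
              
              file read `fh' line
          }
          
          file close `fh'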
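
To try the loop above without any files on disk, here is a self-contained sketch. It assumes each line of the header file holds a number and a name separated by a space, with no quotes, which the thread does not confirm. The temporary file stands in for header.txt, and the blank-line check is my addition.

```stata
* Write a small header file to a temporary location (stand-in for header.txt)
tempfile header
tempname out
file open `out' using "`header'", write text replace
file write `out' "1 Age" _n "2 Race" _n "3 Gender" _n
file close `out'

* Main data from #1
clear
input float(var1 var2 var3)
1 2 3
2 3 4
end

* Read the header file line by line and rename as we go
tempname fh
file open `fh' using "`header'", read text
file read `fh' line
while r(eof) == 0 {
    * first token is the variable number, the remainder is the new name
    gettoken number name : line
    local name = strtrim(`"`name'"')
    * skip blank or incomplete lines rather than failing on them
    if "`number'" != "" & "`name'" != "" {
        rename var`number' `name'
    }
    file read `fh' line
}
file close `fh'
describe
```

Because each line carries its own variable number, the rows of the header file do not need to be in the same order as the variables.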


          • #6
            You are right, daniel klein. I realized later that my solution applies only to renaming the existing variables of a single dta file and is not a solution for what the OP asked. That said, your reply is quite instructive, with things I did not know, including r(eof). My homework for the day.


            • #7
              Awkward but not outrageous. Use merge 1:1 _n to combine the two datasets.

              Then it's a loop

              Code:
              forval j = 1/3 { 
                   rename var`j' `=varname[`j']' 
              }
              with your own number of variables in place of 3.
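
A fuller sketch of the merge route, again with the example data from #1. The temporary file, the use of varcode to pick the target variable, and the cleanup at the end are my assumptions, not part of the suggestion above; the main data must not already contain variables called varcode or varname.

```stata
* Header data from #1, saved to a temporary file
clear
input str4 varcode str6 varname
"1" "Age"
"2" "Race"
"3" "Gender"
end
local k = _N
tempfile header
save `header'

* Main data from #1
clear
input float(var1 var2 var3)
1 2 3
2 3 4
end
local n = _N

* Put the header rows alongside the data, observation by observation
merge 1:1 _n using `header', nogenerate

* Row j of the header says which var# gets which name
forvalues j = 1/`k' {
    rename var`=varcode[`j']' `=varname[`j']'
}

* Remove the header columns and any observations the header added
drop varcode varname
keep in 1/`n'
describe
```

The keep in step matters whenever the header has more rows than the data has observations, as in this example, because merge 1:1 _n then adds otherwise empty observations.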
