
  • Exporting variable values rather than labels

    Hello

    I intend to use a dataset multiple times. After encoding all my string variables, I exported my dataset to Excel. The exported file contains labels instead of values, i.e. "male" and "female" rather than 0 and 1.
    The problem is that each time I use this dataset, I have to encode all the variables again. But if I export only the values, I lose the labels attached to them.
    What would be an appropriate way to preserve both values and labels?

    I am new to Stata, so any help would be highly appreciated.

    Thank you

  • #2
    Why do you (think you) need to switch between Stata and Excel repeatedly?


    • #3
      There is one big dataset, and for each analysis I want to keep only the variables I am interested in and drop the rest.
      So I intend to import the big dataset, run analysis 1, and then repeat this for all subsequent analyses.
      Maybe there is a better way of doing this that I am not aware of?


      • #4
        I also tried adding numlabels before exporting the dataset, so that the data is exported as "1.male". But when I import it again, I cannot destring those variables because the numbers are followed by text. How can I tell Stata to ignore all the text that follows the numbers?
        Last edited by Razia Aliani; 19 Oct 2020, 05:09.


        • #5
          Originally posted by Razia Aliani
          I tried adding numlabels . . . so the data is exported as "1.male". But when I import it again, I can not destring those variables . . . How can I command Stata to ignore all the text followed by the numbers?
          I'm with Daniel on this, but after importing the worksheet you can do something like the following for each variable whose values carry a numeric prefix.
          Code:
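          // keep only the digits before the first "." so multi-digit codes also survive
          replace sex = substr(sex, 1, strpos(sex, ".") - 1) if strpos(sex, ".") > 0
          // convert the remaining numeric strings to a numeric variable
          destring sex, replace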
          Among alternatives are these two: you can save the value labels separately in their own file and re-label the values during import (you'd automate these steps in an .ADO file program); you can save the value labels in another worksheet and use odbc load, exec() to join them during import using SQL.
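
          A minimal sketch of the first alternative, not a definitive implementation. It assumes the auto dataset shipped with Stata as a stand-in for the real data; the file names origin_labels.do and auto_values.xlsx are made up for illustration. -label save- writes the label definitions to a do-file, and the nolabel option of -export excel- writes the numeric values rather than the labels.
          Code:
          // stand-in for the real data: foreign carries the value label "origin"
          sysuse auto, clear

          // write the -label define- commands for that label to a do-file
          label save origin using origin_labels.do, replace

          // export the underlying values (0/1), not the label text
          export excel using auto_values.xlsx, firstrow(variables) nolabel replace

          // later: re-import the worksheet, then restore and re-attach the labels
          import excel using auto_values.xlsx, firstrow clear
          do origin_labels.do
          label values foreign origin

          Note that -label save- stores only the label definitions, so the -label values- step that attaches each label to its variable has to be repeated after every import.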


          • #6
            Thanks a lot Joseph, your code is working just fine.

            On a different note, if I encode my variables of interest each time I import my dataset using the following command, I seem to lose the original order of the variables:

            foreach var of varlist Region - KM_10yr_NR {
                encode `var', gen(`var'01)
                drop `var'
            }

            Is there a possibility that I could retain the original order by adding some code to the command above?


            • #7
              Originally posted by Razia Aliani
              Thanks a lot Joseph, your code is working just fine.

              On a different note, if I encode my variables of interest each time I import my dataset using the following command, I seem to lose the original order of the variables:

              foreach var of varlist Region - KM_10yr_NR {
                  encode `var', gen(`var'01)
                  drop `var'
              }

              Is there a possibility that I could retain the original order by adding some code to the command above?
              I found a package "sencode" that serves this purpose. Thank you so much Daniel and Joseph for your responses.
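
              For readers without that package, here is a hedged sketch of one way to keep the original order using only built-in commands. It assumes every variable in the range Region - KM_10yr_NR is a string variable, as in the loop above; the only change is an -order- step that moves each new variable next to the one it replaces before the original is dropped.
              Code:
              foreach var of varlist Region - KM_10yr_NR {
                  // create the encoded copy (it is appended at the end of the dataset)
                  encode `var', generate(`var'01)
                  // move the copy directly after the original variable
                  order `var'01, after(`var')
                  // drop the original string variable; the copy keeps its position
                  drop `var'
              }

              The variable list is expanded once when the loop starts, so reordering inside the loop does not change which variables are visited.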


              • #8
                I have one more response.

                Originally posted by Razia Aliani
                There is one big dataset and I want to keep only the variables I am interested in for each analysis and drop the rest of the variables.
                First, there is probably no reason to keep only the variables of interest for each of the analyses. Second, even if you really want that, importing the data repeatedly is both slow and cumbersome. Instead, import the dataset once, then create the subsets of the dataset in Stata. One approach would be something along these lines:

                Code:
                // start fresh
                clear
                
                // import the data (once)
                import ...
                
                // data preparation (once)
                encode ...
                
                // now back up the original dataset
                
                    // either temporarily
                preserve
                
                    // ... or permanently
                save my_complete_dataset
                
                // now create the first subset, eliminating unnecessary variables
                drop ...
                
                    // optionally, save the subset to disk
                save my_subset_1
                
                // run your analyses
                regress ...
                
                // bring back the full dataset
                
                    // ... from disk
                use my_complete_dataset , clear
                
                    // ... or from -preserve-
                restore
                
                // repeat for the second, third, ... subset
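
                A related sketch, under stated assumptions: once the complete dataset has been saved in Stata format, -use- can load just the variables needed for a given analysis, which avoids the drop step altogether. The auto dataset and the variables price and mpg are stand-ins for illustration only; my_complete_dataset is the file name used above.
                Code:
                // stand-in for the imported and prepared data
                sysuse auto, clear
                save my_complete_dataset, replace

                // load only the variables needed for the first analysis
                use price mpg using my_complete_dataset, clear
                regress price mpg

                // load a different subset for the next analysis
                use price foreign using my_complete_dataset, clear
                regress price i.foreign

                Because the .dta file stores value labels together with the values, the encoded variables come back labelled and no re-encoding is needed.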


                • #9
                  Thanks loads Daniel! Your response is extremely helpful
