
  • File name with variable label from two variables

    What is the code to export Stata data into separate files, with each file name built from the labels of two variables?

  • #2
    Hi Nanda, I don't understand what you are asking. You will need to provide much more detail. Please read through the FAQ at the top of the page, paying particular attention to section 10. Would you please provide an example dataset with -dataex-? What are the subsets of data you would like to export? Can you tell us which variable labels should be combined based on your example data?



    • #3
      I want to export the data to files whose names combine the labels of dist and psu (for example, 101_1).

      dist psu sex age
      101 1 "male" 25
      101 2 "female" 32
      101 3 "male" 36
      101 4 "female" 38
      101 1 "male" 18
      101 2 "female" 15
      102 1 "male" 53
      102 2 "female" 16
      102 3 "male" 29
      102 1 "female" 39
      102 2 "male" 48
      102 3 "female" 17
      102 4 "male" 26
      103 1 "female" 31
      103 2 "male" 59
      103 1 "female" 33
      103 2 "male" 15
      106 1 "female" 10
      106 2 "male" 21
      106 3 "female" 55
      106 4 "male" 59
      106 5 "female" 25
      106 6 "male" 26

      Last edited by Nanda Lal Sapkota; 15 Jul 2022, 21:29.



      • #4
        Why not this?

        Code:
        save "101_1.dta"
        Do you need a different file for every pair of value labels of dist and psu?



        • #5
          Hi Daniel, I am sorry for the incomplete query. I actually want a loop that exports the files 101_1.dta, 101_2.dta, 102_1.dta, ... for every combination of dist and psu. Thank you.



          • #6
            Can you list all of the labels of dist and all of the labels of psu? Do the labels fall within a certain range?



            • #7
              The labels of dist are "WB BR UP MP KR UR" and those of psu are "01 02 03 04 05 06".



              • #8
                It is a bit confusing that you say the values of dist are "WB BR..." whereas the values you showed in #3 are numbers. Similarly, I'm not sure what to make of the "01 02..." when the example data in #3 shows psu simply taking on the values 1, 2, ... I'll take the alphabetic ones and the zero-filled ones as what you really have. (Note: if what you actually have is a value-labeled numeric variable, the code below will not work.)
                Code:
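                local dists WB BR UP MP KR UR
                
                * Preserve once, outside the loops; a second -preserve- while the data
                * are already preserved would stop with an "already preserved" error.
                preserve
                foreach d of local dists {
                    forvalues p = 1/6 {
                        * Zero-pad psu to two digits: 1 -> 01
                        local pp: display %02.0f `p'
                        keep if dist == "`d'" & psu == "`pp'"
                        if _N > 0 {
                            save `d'_`pp', replace
                        }
                        * Bring back the full data and keep the preserved copy
                        restore, preserve
                    }
                }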
                Notes:

                1. The little dance with -if _N > 0- is there because I assume that some dist may not occur with every psu, and that where there are no such observations, you don't want to save an empty file.

                2. As you can see, I have made some important assumptions about the nature of your actual data, and I am not comfortable that they are correct. And the code will not run if my assumptions about your data are wrong. But your tableau in #3 simply doesn't convey the necessary information. It is for that reason that all Forum members are supposed to read the Forum FAQ before posting, and to pay attention to the advice given there about the helpful ways to show example data, namely the -dataex- command. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data. So if this code gives you error messages, or produces bizarre results, it is likely because your data are not as I have imagined them to be. The solution is to post back using the -dataex- command to give a fully informative data example.
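
                Since the loop above depends on whether dist is a string or a value-labeled numeric variable, it may help to check that before running it. The following is only a minimal sketch; it assumes nothing beyond variables named dist and psu being in memory, and the messages it prints are purely illustrative.

                ```stata
                * Sketch only: assumes variables named dist and psu exist in memory.
                describe dist psu

                * -confirm- exits with an error if dist is not a string; -capture- traps it.
                capture confirm string variable dist
                if _rc == 0 {
                    display "dist is a string variable: compare it with quoted values"
                }
                else {
                    * Name of the value label attached to dist (empty if there is none)
                    local lbl : value label dist
                    display "dist is numeric; attached value label: `lbl'"
                }

                * What the %02.0f display format does to a single-digit psu:
                local pp : display %02.0f 3
                display "`pp'"
                * displays 03
                ```

                If dist turns out to be numeric, comparisons such as dist == "WB" fail with a type mismatch, which is why the loop above would not run on such data.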



                • #9
                  Hi Clyde, the full dataset is actually large, and -preserve- and -restore- are making Stata stop responding. The example dataset from -dataex- is below; I am using Stata 15.
                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input int dist byte psu int age byte sex
                  101 1 25 3
                  101 1  . .
                  303 3 45 1
                  303 3 40 2
                  303 3 20 2
                  101 1 70 1
                  101 1 60 2
                  204 2 55 1
                  204 2 40 2
                  204 2 66 1
                  705 1 58 1
                  705 1 45 2
                  409 9 67 2
                  409 9 65 1
                  409 9 43 1
                  409 9 30 2
                  409 9 27 1
                  409 9 19 2
                  409 9 20 1
                  409 9 18 2
                  409 9  2 2
                  705 4 25 2
                  705 4 25 1
                  409 1 42 1
                  409 1 36 2
                  409 1 20 1
                  409 1 15 2
                  409 1 39 2
                  409 1 45 1
                  409 1 25 1
                  409 1 20 2
                  409 1 20 1
                  409 1 32 1
                  409 1 26 2
                  409 1  6 1
                  409 1  2 1
                  409 1 45 1
                  409 1 43 2
                  409 1 22 1
                  409 1 20 1
                  409 1 16 1
                  409 1 63 2
                  409 1 69 1
                  409 1 41 1
                  409 1 39 2
                  409 1 76 1
                  409 1 20 1
                  409 1 17 1
                  409 1 10 2
                  409 1 57 1
                  409 1 52 2
                  409 1 20 2
                  409 1 66 1
                  409 1 63 1
                  409 1 57 2
                  409 1 32 1
                  409 1 18 2
                  409 1 55 1
                  409 1 49 2
                  409 1 25 1
                  409 1 21 2
                  409 1  2 2
                  409 1 57 1
                  409 1 52 2
                  409 1 10 2
                  409 1 76 2
                  409 1 50 1
                  409 1 67 1
                  409 1 65 2
                  409 1 97 2
                  409 2 33 2
                  409 2 37 1
                  409 2 60 2
                  409 2 16 1
                  409 2 14 1
                  409 2  7 2
                  409 2 53 1
                  409 2 50 2
                  409 2 25 1
                  409 2 22 2
                  409 2 80 2
                  409 2 84 1
                  409 2 21 1
                  409 2 48 2
                  409 2 49 1
                  409 2 20 1
                  409 2 14 1
                  409 2 56 2
                  409 2 21 2
                  409 2 82 2
                  409 2 46 1
                  409 2 49 2
                  409 2 27 2
                  409 2 22 2
                  409 2  5 1
                  409 2 31 2
                  409 2 15 1
                  409 2 11 1
                  409 2  7 1
                  409 2 56 2
                  end
                  label values dist dist
                  label def dist 101 "WB", modify
                  label def dist 204 "UP", modify
                  label def dist 303 "RS", modify
                  label def dist 409 "SH", modify
                  label def dist 705 "MP", modify
                  label values sex sex
                  label def sex 1 "Male", modify
                  label def sex 2 "Female", modify
                  label def sex 3 "LGBT", modify
                  Last edited by Nanda Lal Sapkota; 16 Jul 2022, 00:06.



                  • #10
                    OK, this will work better with a large file:

                    Code:
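                    capture program drop one_file
                    program define one_file
                        * -runby- calls this once per dist-psu group, with only that
                        * group's observations in memory.
                        local dist = dist[1]
                        * Replace the numeric code with its value label, e.g. 409 -> SH
                        local dist: label (dist) `dist'
                        local psu = psu[1]
                        * Zero-pad psu to two digits: 1 -> 01
                        local psu: display %02.0f `psu'
                        save `dist'_`psu', replace
                        exit
                    end
                    
                    runby one_file, by(dist psu) status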
                    -runby- is written by Robert Picard and me. It is available from SSC. It will also, because of the -status- option, give you a periodic update of its progress through the data and estimated time remaining.
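
                    To see what file name the program would build for one group before running it on the full data, the two macro steps can be tried interactively. This is only a sketch; it assumes the -dataex- example from #9 is in memory, and the values 409 and 1 are picked from that example purely for illustration.

                    ```stata
                    * One-time install of -runby- from SSC (skip if already installed).
                    ssc install runby

                    * Sketch: preview the file name for dist == 409, psu == 1,
                    * assuming the -dataex- example from #9 is in memory.
                    local d : label (dist) 409
                    local p : display %02.0f 1
                    display "`d'_`p'.dta"
                    * displays SH_01.dta
                    ```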



                    • #11
                      You could also do it by repeatedly loading the full dataset.

                      Code:
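                      * Number the dist-psu combinations 1, 2, ... and save a master copy
                      egen group= group(dist psu)
                      save data, replace
                      qui sum group
                      forval i=1/`r(max)'{
                          * Reload the full dataset for each group
                          use data, clear
                          levelsof dist if group==`i', local(d)
                          * Numeric code -> value label
                          local d: label (dist) `d'
                          levelsof psu if group==`i', local(p)
                          * Zero-pad psu to two digits
                          local p: display %02.0f `p'
                          keep if group==`i'
                          drop group
                          save `d'_`p', replace
                      }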

                      ADDED IN EDIT: You need the extra lines (highlighted) from Clyde's code to use label names and add leading zeros to the filenames.
                      Last edited by Andrew Musau; 16 Jul 2022, 01:13.
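
                      As a rough check that the split is complete, the number of distinct dist-psu pairs can be compared with the number of files written. This is a sketch under stated assumptions: data.dta is the master copy saved in the code above, the split files sit in the current working directory, and the variable name first and the macros expected and found are made up for illustration.

                      ```stata
                      * Sketch only: assumes data.dta was saved as in the code above and that
                      * the split files were written to the current working directory.
                      use data, clear

                      * Mark one observation per dist-psu pair and count the marks
                      egen byte first = tag(dist psu)
                      count if first
                      local expected = r(N)

                      * List files matching NAME_NN.dta; any unrelated file with an
                      * underscore in its name would be counted as well.
                      local files : dir "." files "*_*.dta"
                      local found : word count `files'
                      display "expected `expected' files, found `found'"
                      ```

                      If the two numbers differ, other .dta files with an underscore in their names may be present in the folder, or some groups may not have been saved.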
