  • renaming several variables in loop to reshape by category

    I have several variables that I want to rename according to their category. For example, S2_Q1P through S2_Q7P belong to the plot category, and I want to rename them in sequence so that the result is S2_Q1P_1 through S2_Q7P_9. I use the following code and get this message:

    1 new variable name invalid
    You attempted to rename S2_Q1K to 1. That is an invalid Stata variable name.



    forval i=1/9 {
    rename S2_Q1K `i' S2_Q1P `i' S2_Q1D `i'
    rename S2_Q2AK`i' S2_Q2AP`i' S2_Q2AD`i'
    rename S2_Q2BK`i' S2_Q2BP`i' S2_Q2BD`i'
    rename S2_Q3AK`i' S2_Q3AP`i' S2_Q3AD`i'
    rename S2_Q3BK `i' S2_Q3BP `i' S2_Q3BD `i'
    rename S2_Q4K `i' S2_Q4P `i' S2_Q4D `i'
    rename S2_Q5K `i' S2_Q5P `i' S2_Q5D `i'
    rename S2_Q6K `i' S2_Q6P `i' S2_Q6D `i'
    rename S2_Q7K `i' S2_Q7P `i' S2_Q7D `i'
    }

    I am renaming them in order to reshape so that corresponding questions align with the category name. Should I use foreach v in varlist instead, and if so, how would I rename all of them in the simplest way possible? Or is it not necessary for me to rename at all? I am very confused: earlier I had used forval i=1/9 to rename S2_Q2_01 and it worked at that time. Thank you for all the help, I really appreciate it.
    Last edited by Naika Sangroo; 30 May 2020, 10:19.

  • #2
    Just to update: following advice I had received from William Lisowski in an earlier thread, I used the code below, and it worked to rename the variables.

    Code:
    rename S2_Q2D S2_Q2AD
    
    foreach l in K P D {
        rename S2_Q1`l' S2_Q1`l'_1    
        rename S2_Q2A`l' S2_Q2A`l'_2
        rename S2_Q2B`l' S2_Q2B`l'_3
        rename S2_Q3A`l' S2_Q3A`l'_4
        rename S2_Q3B`l' S2_Q3B`l'_5
        rename S2_Q4`l' S2_Q4`l'_6
        rename S2_Q5`l' S2_Q5`l'_7
        rename S2_Q6`l' S2_Q6`l'_8
        rename S2_Q7`l' S2_Q7`l'_9
    }

    Code:
    reshape long S2_Q1K_1 S2_Q2AK_2 S2_Q2BK_3 S2_Q3AK_4 S2_Q3BK_5 S2_Q4K_6 S2_Q5K_7 S2_Q6K_8 S2_Q7K_9, i(NEWID) j(kitchen)
    But this does not work. Should I rename the variables to follow a pattern like S2_Q1_1, which worked for me in earlier work? Reshaping is very complicated. Thank you in advance for your help.


    • #3
      It is not necessary to rename your variables. Here is example code run on one observation of invented data, using what I think your original variable names are.
      Code:
      // generate pretend data
      clear all
      set obs 1
      generate NEWID = 101
      local n 10
      foreach l in K P D {
          generate S2_Q1`l'  = 1 + `n'
          generate S2_Q2A`l' = 2 + `n'
          generate S2_Q2B`l' = 3 + `n'
          generate S2_Q3A`l' = 4 + `n'
          generate S2_Q3B`l' = 5 + `n'
          generate S2_Q4`l'  = 6 + `n'
          generate S2_Q5`l'  = 7 + `n'
          generate S2_Q6`l'  = 8 + `n'
          generate S2_Q7`l'  = 9 + `n'
          local n = `n' + 10
      }
      
      // these are the variables we start with
      ds
      // reshape using identifiers Q1 Q2A Q2B ... Q6 Q7 as kitchen
      reshape long S2_@K S2_@P S2_@D, i(NEWID) j(kitchen) string
      // these are the variables we end with
      ds
      // here's our one observation reshaped into 9
      list, separator(0)
      Code:
      . // these are the variables we start with
      . ds
      NEWID    S2_Q2BK  S2_Q4K   S2_Q7K   S2_Q2BP  S2_Q4P   S2_Q7P   S2_Q2BD  S2_Q4D   S2_Q7D
      S2_Q1K   S2_Q3AK  S2_Q5K   S2_Q1P   S2_Q3AP  S2_Q5P   S2_Q1D   S2_Q3AD  S2_Q5D
      S2_Q2AK  S2_Q3BK  S2_Q6K   S2_Q2AP  S2_Q3BP  S2_Q6P   S2_Q2AD  S2_Q3BD  S2_Q6D
      
      . // reshape using identifiers Q1 Q2A Q2B ... Q6 Q7 as kitchen
      . reshape long S2_@K S2_@P S2_@D, i(NEWID) j(kitchen) string
      (note: j = Q1 Q2A Q2B Q3A Q3B Q4 Q5 Q6 Q7)
      
      Data                               wide   ->   long
      -----------------------------------------------------------------------------
      Number of obs.                        1   ->       9
      Number of variables                  28   ->       5
      j variable (9 values)                     ->   kitchen
      xij variables:
                    S2_Q1K S2_Q2AK ... S2_Q7K   ->   S2_K
                    S2_Q1P S2_Q2AP ... S2_Q7P   ->   S2_P
                    S2_Q1D S2_Q2AD ... S2_Q7D   ->   S2_D
      -----------------------------------------------------------------------------
      
      . // these are the variables we end with
      . ds
      NEWID    kitchen  S2_K     S2_P     S2_D
      
      . // here's our one observation reshaped into 9
      . list, separator(0)
      
           +--------------------------------------+
           | NEWID   kitchen   S2_K   S2_P   S2_D |
           |--------------------------------------|
        1. |   101        Q1     11     21     31 |
        2. |   101       Q2A     12     22     32 |
        3. |   101       Q2B     13     23     33 |
        4. |   101       Q3A     14     24     34 |
        5. |   101       Q3B     15     25     35 |
        6. |   101        Q4     16     26     36 |
        7. |   101        Q5     17     27     37 |
        8. |   101        Q6     18     28     38 |
        9. |   101        Q7     19     29     39 |
           +--------------------------------------+
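
      For readers wondering how the @ in those stubs works: it marks the position inside each variable name where the j value sits, so S2_@K matches S2_Q1K, S2_Q2AK, and so on, and the matched piece (Q1, Q2A, ...) is stored in kitchen. Below is a minimal sketch on a smaller invented dataset; the names id, item, aQ1x and the rest are made up for illustration and are not from the original data.

      ```stata
      // invented data: two items (Q1, Q2) measured on two categories (x, y)
      clear
      input id aQ1x aQ2x aQ1y aQ2y
      1 11 12 21 22
      end
      // @ marks where the j value (Q1 or Q2) sits inside each variable name
      reshape long a@x a@y, i(id) j(item) string
      // one row per item, with the values collected in ax and ay
      list, clean noobs
      ```

      After the reshape the dataset should hold two observations for id 1, with item equal to Q1 and Q2 and the values collected in the variables ax and ay.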



      • #4
        Thank you so much for the help.

        I want to reshape the data into the format below:

        Newid (for each hh)   Plot      Q1   Q2A   Q2B   Q3A   Q4   Q5   Q6
        101                   Kitchen
        102                   Pres
        103                   Des

        I am very new to Stata, so I was wondering: what do set obs 1 and local n 10 mean? What are they doing to the data? I will redo my work with your code. Thank you.



        • #5
          Thank you, now that I know what you want I can create a more relevant example.
          Code:
          // input pretend data
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(NEWID S2_Q1K S2_Q2AK S2_Q2BK S2_Q3AK S2_Q3BK S2_Q4K S2_Q5K S2_Q6K S2_Q7K S2_Q1P S2_Q2AP S2_Q2BP S2_Q3AP S2_Q3BP S2_Q4P S2_Q5P S2_Q6P S2_Q7P S2_Q1D S2_Q2AD S2_Q2BD S2_Q3AD S2_Q3BD S2_Q4D S2_Q5D S2_Q6D S2_Q7D)
          101 11 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 28 29 31 32 33 34 35 36 37 38 39
          end
          
          // these are the variables we start with
          ds
          // delete the prefix
          rename (S2_*) (*)
          // reshape using identifiers K P D as Plot
          reshape long Q1 Q2A Q2B Q3A Q3B Q4 Q5 Q6 Q7, i(NEWID) j(Plot) string
          // these are the variables we end with
          ds
          // Change Plot from K P D to Kitchen Pres Des
          replace Plot = "Kitchen" if Plot=="K"
          replace Plot = "Pres"    if Plot=="P"
          replace Plot = "Des"     if Plot=="D"
          // here's our one observation reshaped into 3
          list, clean noobs
          Code:
          . // these are the variables we start with
          . ds
          NEWID    S2_Q2BK  S2_Q4K   S2_Q7K   S2_Q2BP  S2_Q4P   S2_Q7P   S2_Q2BD  S2_Q4D   S2_Q7D
          S2_Q1K   S2_Q3AK  S2_Q5K   S2_Q1P   S2_Q3AP  S2_Q5P   S2_Q1D   S2_Q3AD  S2_Q5D
          S2_Q2AK  S2_Q3BK  S2_Q6K   S2_Q2AP  S2_Q3BP  S2_Q6P   S2_Q2AD  S2_Q3BD  S2_Q6D
          
          . // delete the prefix
          . rename (S2_*) (*)
          
          . // reshape using identifiers K P D as Plot
          . reshape long Q1 Q2A Q2B Q3A Q3B Q4 Q5 Q6 Q7, i(NEWID) j(Plot) string
          (note: j = D K P)
          
          Data                               wide   ->   long
          -----------------------------------------------------------------------------
          Number of obs.                        1   ->       3
          Number of variables                  28   ->      11
          j variable (3 values)                     ->   Plot
          xij variables:
                                      Q1D Q1K Q1P   ->   Q1
                                   Q2AD Q2AK Q2AP   ->   Q2A
                                   Q2BD Q2BK Q2BP   ->   Q2B
                                   Q3AD Q3AK Q3AP   ->   Q3A
                                   Q3BD Q3BK Q3BP   ->   Q3B
                                      Q4D Q4K Q4P   ->   Q4
                                      Q5D Q5K Q5P   ->   Q5
                                      Q6D Q6K Q6P   ->   Q6
                                      Q7D Q7K Q7P   ->   Q7
          -----------------------------------------------------------------------------
          
          . // these are the variables we end with
          . ds
          NEWID  Plot   Q1     Q2A    Q2B    Q3A    Q3B    Q4     Q5     Q6     Q7
          
          . // Change Plot from K P D to Kitchen Pres Des
          . replace Plot = "Kitchen" if Plot=="K"
          variable Plot was str1 now str7
          (1 real change made)
          
          . replace Plot = "Pres"    if Plot=="P"
          (1 real change made)
          
          . replace Plot = "Des"     if Plot=="D"
          (1 real change made)
          
          . // here's our one observation reshaped into 3
          . list, clean noobs
          
              NEWID      Plot   Q1   Q2A   Q2B   Q3A   Q3B   Q4   Q5   Q6   Q7  
                101       Des   31    32    33    34    35   36   37   38   39  
                101   Kitchen   11    12    13    14    15   16   17   18   19  
                101      Pres   21    22    23    24    25   26   27   28   29

          In #4 you asked what set obs 1 and local n 10 mean and what they do to the data.
          I don't think I can give a good answer to these questions because the answers would probably depend on other things you don't yet know. You will notice that in this example, I used dataex to present the data that I invented for testing and demonstration.

          I'm sympathetic to you as a new user of Stata - it's a lot to absorb. Nevertheless, I'd like to encourage you to take a step back from your immediate tasks.

          When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. There are a lot of examples to copy and paste into Stata's do-file editor to run yourself, and better yet, to experiment with changing the options to see how the results change.

          All of these manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu. The objective in doing the reading was not so much to master Stata as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

          Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.
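
          For the two commands asked about in #4, a small sketch may still be useful. It is only an illustration on invented data (the variable x is hypothetical), and it should be run as a whole from the do-file editor, because a local macro exists only while the do-file or program that defined it is running.

          ```stata
          clear all
          // create a dataset with exactly one observation and no variables yet
          set obs 1
          generate NEWID = 101
          // store the text 10 in a local macro named n
          local n 10
          // the macro reference is replaced by 10 before the command runs,
          // so Stata executes: generate x = 1 + 10
          generate x = 1 + `n'
          // evaluate 10 + 10 and store the result, 20, back in n
          local n = `n' + 10
          display `n'
          list
          ```

          In short, set obs 1 only makes room for one row so that generate has something to fill in, and local n holds a number that the loop in #3 bumps by 10 on each pass so that the K, P and D variables get distinguishable invented values.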



          • #6
            I apologize for posting in my own thread so many times, but I just want to show my update and get advice.

            I used the code
            Code:
            foreach l in K P D {
                rename S2_Q1`l' S2_`l'_1    
                rename S2_Q2A`l' S2_`l'_2
                rename S2_Q2B`l' S2_`l'_3
                rename S2_Q3A`l' S2_`l'_4
                rename S2_Q3B`l' S2_`l'_5
                rename S2_Q4`l' S2_`l'_6
                rename S2_Q5`l' S2_`l'_7
                rename S2_Q6`l' S2_`l'_8
                rename S2_Q7`l' S2_`l'_9
            }
            
            reshape long S2_K_  S2_P_ S2_D_, i(NEWID) j(plot)

            and I get the same result as William above. The reason I could not use William's code exactly is that I could not explain to myself what was happening, and in case someone questions me I want to be able to explain it. But what I need is the data in the table below. When I use reshape wide, I get the message "ambiguous abbreviation" or "invalid syntax". I would be so grateful for any help resolving this; I have been stuck for hours. Thank you.

            Newid (for each hh)   Plot      Q1   Q2A   Q2B   Q3A   Q4   Q5   Q6
            101                   Kitchen
            102                   Pres
            103                   Des



            • #7
              I think that while you were posting #6 I posted #5 that addresses your question in #4 repeated in #6.



              • #8
                Hi William, thank you so much! I have learned so much from you just today. This option, rename (S2_*) (*), is so convenient. Thank you for the advice on getting better with Stata as well. I really appreciate all the guidance :-)



                • #9
                  I am now trying to reshape the following variables


                  Code:
                  keep NEWID S1C_Q0_01-S1C_Q0_18 S1C_Q5_01-S1C_Q5_18 S3_Q1K S3_Q7K- S3_Q10K S3_Q18K-S3_Q19K S3_Q1P S3_Q7P-S3_Q10P S3_Q18P-S3_Q20P S3_Q1D S3_Q7D- S3_Q10D S3_Q18D-S3_Q23 S3_Q28K-S3_Q31KB S3_Q28P-S3_Q31PB S3_Q28D-S3_Q31DB S3_Q37- S3_Q37B
                  Code:
                  *Stata recognized 01-09 as strings thus for easy computation rename variables 1-9
                  forval i = 1/9 {
                      rename S1C_Q0_0`i' S1C_Q0_`i'
                      rename S1C_Q5_0`i' S1C_Q5_`i'
                  }
                  
                  *rename to drop the prefix
                  rename (S3_*) (*)
                  foreach `l' in P D {
                      rename Q20`l' Q20`l'_16        
                  }
                  
                  rename Q37 Q37_17
                  rename Q37B Q37B_18
                  This part of the code all works, but when I run the loop below to rename, I get the message "invalid syntax". I had first renamed using rename Q1`l' Q1`l'_1, but that also gives "invalid syntax". I am not sure what I am doing wrong. I am renaming so that I can reshape.


                  Code:
                  *rename to avoid string
                  
                  foreach `l' in K P D {
                      rename Q1`l' Q1`l'1
                      rename Q7`l' Q7`l'2
                      rename Q8`l' Q8`l'3
                      rename Q9`l' Q9`l'4
                      rename Q10`l' Q10`l'5
                      rename Q18`l' Q18`l'6
                      rename Q19`l' Q19`l'7
                      rename Q28`l' Q28`l'8
                      rename Q28`l'B Q28`l'B9
                      rename Q29`l' Q29`l'10
                      rename Q29`l'B Q29`l'B11
                      rename Q30`l' Q30`l'12
                      rename Q30`l'B Q30`l'B13
                      rename Q31`l' Q31`l'14
                      rename Q31`l'B Q31`l'B15
                  }
                  Ultimately I plan to use
                  Code:
                  reshape long S1C_Q0_ S1C_Q5_ Q1 Q7 Q8 Q9 Q10 Q18 Q19 Q20 Q28 Q28B Q29 Q29B Q30 Q30B Q31 Q31B Q37 Q37B , i(NEWID) j(labor)
                  to generate a table like the one below. Would the reshape code work?
                  Newid S1C_Q0 S1c_Q5 Q1 Q7 Q8 Q.....
                  1 1 1
                  2 1 0
                  3 1 1
                  3 2 0



                  • #10
                    Replace
                    Code:
                    foreach `l' in K P D {
                    with
                    Code:
                    foreach l in K P D {
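
                    The reason, as a hedged sketch: foreach expects the bare name of the loop macro, while `l' asks Stata to substitute the macro's current contents. If no local l has been defined yet, the contents are empty, the line is read as foreach in K P D {, and Stata should report invalid syntax. The example below uses three invented variables (Q1K, Q1P, Q1D on a one-observation dataset) to show the working form.

                    ```stata
                    // invented one-observation dataset, for illustration only
                    clear
                    set obs 1
                    generate Q1K = 1
                    generate Q1P = 2
                    generate Q1D = 3
                    // name the loop macro without quotes on the foreach line,
                    // then refer to its contents with macro quotes inside the body
                    foreach l in K P D {
                        rename Q1`l' Q1`l'1
                    }
                    // the variables are now Q1K1 Q1P1 Q1D1
                    ds
                    ```

                    The same rule applies to forvalues: the macro is named bare on the loop line and quoted only where its value is needed.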
