Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • xpose/sxpose keep a variable observations to assign as labels to new variables

    Dear Statalisters,
    I am trying to solve the following problem: assign the observations of a variable as labels to the new variables generated after -sxpose2-.
    Thanks to your contribution, I managed to import multiple Excel files from a folder and clean the data.
    I've got the following structure of the data:

    id name v1 v2 v3 v4
    "id1" "label_id1" 234 367 111 898
    "id2" "label_id2" 34 67 341 222
    "id3" "label_id3" 24 37 121 254
    "id4" "label_id4" 23 36 116 345

    When I run the following code:

    sxpose2, clear firstnames force varname
    drop in 1/2

    I get the result:
    _varname id1 id2 id3 id4
    v1 234 34 24 23
    v2 367 67 37 36
    v3 111 341 121 116
    v4 898 222 116 345

    My problem, that I cannot find a solution is: assign to variables id1, id2, id3 and id4 the labels from variable "name" from the initial dataset structure.
    id1 - label_id1, label_id2 etc.

    Thank you very much for your help!

  • #2
    Your existing structure with identifiers in observations and variables as, hmm, variables looks ideal.

    If you want to transpose it, I wouldn't use those programs even though I wrote one of them -- for, I guess, a very specific purpose that isn't yours at present.

    Code:
    clear 
    input str3 id str9 name v1 v2 v3 v4
    "id1" "label_id1" 234 367 111 898
    "id2" "label_id2" 34 67 341 222
    "id3" "label_id3" 24 37 121 254
    "id4" "label_id4" 23 36 116 345
    end
    
    local N = _N
    forval i = 1/`N' { 
        local name`i' = name[`i']
    }
    
    drop name 
    
    reshape long v, i(id) j(which)
    
    reshape wide v, i(which) j(id) string
    
    rename v* *
    
    forval j = 1/`N' { 
        label var id`j' "`name`j''"
    }
    
    d
    
    list

    Code:
    . 
    . d
    
    Contains data
      obs:             4                          
     vars:             5                          
    ---------------------------------------------------------------------------------------------
                  storage   display    value
    variable name   type    format     label      variable label
    ---------------------------------------------------------------------------------------------
    which           byte    %10.0g                
    id1             float   %9.0g                 label_id1
    id2             float   %9.0g                 label_id2
    id3             float   %9.0g                 label_id3
    id4             float   %9.0g                 label_id4
    ---------------------------------------------------------------------------------------------
    Sorted by: which
         Note: Dataset has changed since last saved.
    
    . 
    . list 
    
         +-------------------------------+
         | which   id1   id2   id3   id4 |
         |-------------------------------|
      1. |     1   234    34    24    23 |
      2. |     2   367    67    37    36 |
      3. |     3   111   341   121   116 |
      4. |     4   898   222   254   345 |
         +-------------------------------+

    Comment


    • #3
      Thank you, Nick!
      Well, although I managed to get a dataset that was close to an ideal one, it was not. The v1, v1 v3, v4 etc. have various names, as: si_dct, rcur_deb etc., but let's say that I could rename them, as there are not so many.
      On the other hand, the id1, id2 etc. are quite many and they are also named like: id601, id723_00, id223_01, id403 etc. So I cannot use the -forval- loops. That is why I preferred your -sxpose2- or even the -levelsof- and -xpose- (https://www.statalist.org/forums/forum/general-stata-discussion/general/1444912-question-about-xpose-and-sxpose). I don't need to bother about the approximated numeric observations.
      I used another solution in my previous work, but it was quite time consuming, as I was doing some partial manual editing in Excel. That is why I would prefer to find a better/quicker solution to this problem.

      Thank you again!

      Comment


      • #4
        Post a more realistic data example and I will try to generalise the code.

        Comment


        • #5
          Originally posted by Nick Cox View Post
          Post a more realistic data example and I will try to generalise the code.
          Thank you, Nick, for your quick reply!
          After some thinking, I found the following solution, based on your -sxpose2-
          After -sxpose2-, the labels I need are on the first row (I was previously dropping it).
          I hope I'm not too wrong


          Code:
           clear  input str3 id str9 name si_dct rcur_deb tot_sum sf_zz
          "c603" "label_123" 234 367 111 898
          "c775_02" "label_3342" 34 67 341 222
          "c334_2" "label_565" 24 37 121 254
          "c091" "label_4456" 23 36 116 345
          end  
          sxpose2, clear firstnames force varname
          foreach var of varlist c*{  
              local labvar=`var'[1]  
              label var `var' "`labvar'"
          }
          drop in 1
          destring c*, replace
          I run this code inside a loop that imports multiple Excel files and merges them into a dataset.

          Thank you again!
          Last edited by Iulian Ihnatov; 21 Mar 2021, 15:14.

          Comment


          • #6
            sxpose2 isn't mine.

            More general code. This hinges on the fact that in the example the identifiers could be legal variable names.


            Code:
            clear  
            input str7 id str12 name si_dct rcur_deb tot_sum sf_zz 
            "c603" "label_123" 234 367 111 898 
            "c775_02" "label_3342" 34 67 341 222 
            "c334_2" "label_565" 24 37 121 254 
            "c091" "label_4456" 23 36 116 345 
            end  
            
            local N = _N
            forval i = 1/`N' { 
                local which`i' = id[`i'] + " " + name[`i']
            }
            
            drop name 
            
            ds id, not 
            
            rename (`r(varlist)') (v=)
            
            reshape long v, i(id) j(which) string 
            
            reshape wide v, i(which) j(id) string
            
            rename v* *
            
            forval j = 1/`N' {
                tokenize "`which`j''" 
                label var `1' "`2'"
            }
             
            
            d
            
            list

            Comment


            • #7
              Originally posted by Nick Cox View Post
              sxpose2 isn't mine.

              More general code. This hinges on the fact that in the example the identifiers could be legal variable names.


              Code:
              clear
              input str7 id str12 name si_dct rcur_deb tot_sum sf_zz
              "c603" "label_123" 234 367 111 898
              "c775_02" "label_3342" 34 67 341 222
              "c334_2" "label_565" 24 37 121 254
              "c091" "label_4456" 23 36 116 345
              end
              
              local N = _N
              forval i = 1/`N' {
              local which`i' = id[`i'] + " " + name[`i']
              }
              
              drop name
              
              ds id, not
              
              rename (`r(varlist)') (v=)
              
              reshape long v, i(id) j(which) string
              
              reshape wide v, i(which) j(id) string
              
              rename v* *
              
              forval j = 1/`N' {
              tokenize "`which`j''"
              label var `1' "`2'"
              }
              
              
              d
              
              list
              Thank you again and I apologize for the confusion!

              Credits:
              sxpose, Nicholas J. Cox, Durham University
              sxpose2, Stephan Huber, Hochschule Fresenius

              Comment

              Working...
              X