xpose/sxpose keep a variable observations to assign as labels to new variables

Iulian Ihnatov

Join Date: Mar 2016

Posts: 10
#1

xpose/sxpose keep a variable observations to assign as labels to new variables

21 Mar 2021, 11:05

Dear Statalisters,
I am trying to solve the following problem: assign the observations of a variable as labels to the new variables generated after -sxpose2-.
Thanks to your contribution, I managed to import multiple Excel files from a folder and clean the data.
I've got the following structure of the data:

id name v1 v2 v3 v4
"id1" "label_id1" 234 367 111 898
"id2" "label_id2" 34 67 341 222
"id3" "label_id3" 24 37 121 254
"id4" "label_id4" 23 36 116 345

When I run the following code:

sxpose2, clear firstnames force varname
drop in 1/2

I get the result:
_varname id1 id2 id3 id4
v1 234 34 24 23
v2 367 67 37 36
v3 111 341 121 116
v4 898 222 116 345

My problem, that I cannot find a solution is: assign to variables id1, id2, id3 and id4 the labels from variable "name" from the initial dataset structure.
id1 - label_id1, label_id2 etc.

Thank you very much for your help!
Tags: None

Nick Cox

Join Date: Mar 2014
Posts: 35696

21 Mar 2021, 12:01

Your existing structure with identifiers in observations and variables as, hmm, variables looks ideal.

If you want to transpose it, I wouldn't use those programs even though I wrote one of them -- for, I guess, a very specific purpose that isn't yours at present.

Code:

clear 
input str3 id str9 name v1 v2 v3 v4
"id1" "label_id1" 234 367 111 898
"id2" "label_id2" 34 67 341 222
"id3" "label_id3" 24 37 121 254
"id4" "label_id4" 23 36 116 345
end

local N = _N
forval i = 1/`N' { 
    local name`i' = name[`i']
}

drop name 

reshape long v, i(id) j(which)

reshape wide v, i(which) j(id) string

rename v* *

forval j = 1/`N' { 
    label var id`j' "`name`j''"
}

d

list

Code:

. 
. d

Contains data
  obs:             4                          
 vars:             5                          
---------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
---------------------------------------------------------------------------------------------
which           byte    %10.0g                
id1             float   %9.0g                 label_id1
id2             float   %9.0g                 label_id2
id3             float   %9.0g                 label_id3
id4             float   %9.0g                 label_id4
---------------------------------------------------------------------------------------------
Sorted by: which
     Note: Dataset has changed since last saved.

. 
. list 

     +-------------------------------+
     | which   id1   id2   id3   id4 |
     |-------------------------------|
  1. |     1   234    34    24    23 |
  2. |     2   367    67    37    36 |
  3. |     3   111   341   121   116 |
  4. |     4   898   222   254   345 |
     +-------------------------------+

Comment

Iulian Ihnatov

Join Date: Mar 2016

Posts: 10
#3

21 Mar 2021, 14:05

Thank you, Nick!
Well, although I managed to get a dataset that was close to an ideal one, it was not. The v1, v1 v3, v4 etc. have various names, as: si_dct, rcur_deb etc., but let's say that I could rename them, as there are not so many.
On the other hand, the id1, id2 etc. are quite many and they are also named like: id601, id723_00, id223_01, id403 etc. So I cannot use the -forval- loops. That is why I preferred your -sxpose2- or even the -levelsof- and -xpose- (https://www.statalist.org/forums/forum/general-stata-discussion/general/1444912-question-about-xpose-and-sxpose). I don't need to bother about the approximated numeric observations.
I used another solution in my previous work, but it was quite time consuming, as I was doing some partial manual editing in Excel. That is why I would prefer to find a better/quicker solution to this problem.

Thank you again!
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35696
#4

21 Mar 2021, 14:25

Post a more realistic data example and I will try to generalise the code.
Comment
Iulian Ihnatov

Join Date: Mar 2016

Posts: 10
#5

21 Mar 2021, 15:12

Originally posted by Nick Cox View Post

Post a more realistic data example and I will try to generalise the code.

Thank you, Nick, for your quick reply!
After some thinking, I found the following solution, based on your -sxpose2-
After -sxpose2-, the labels I need are on the first row (I was previously dropping it).
I hope I'm not too wrong

Code:

clear input str3 id str9 name si_dct rcur_deb tot_sum sf_zz "c603" "label_123" 234 367 111 898 "c775_02" "label_3342" 34 67 341 222 "c334_2" "label_565" 24 37 121 254 "c091" "label_4456" 23 36 116 345 end sxpose2, clear firstnames force varname foreach var of varlist c*{ local labvar=`var'[1] label var `var' "`labvar'" } drop in 1 destring c*, replace

I run this code inside a loop that imports multiple Excel files and merges them into a dataset.

Thank you again!

Last edited by Iulian Ihnatov; 21 Mar 2021, 15:14.
Comment

Nick Cox

Join Date: Mar 2014
Posts: 35696

21 Mar 2021, 15:25

sxpose2 isn't mine.

More general code. This hinges on the fact that in the example the identifiers could be legal variable names.

Code:

clear  
input str7 id str12 name si_dct rcur_deb tot_sum sf_zz 
"c603" "label_123" 234 367 111 898 
"c775_02" "label_3342" 34 67 341 222 
"c334_2" "label_565" 24 37 121 254 
"c091" "label_4456" 23 36 116 345 
end  

local N = _N
forval i = 1/`N' { 
    local which`i' = id[`i'] + " " + name[`i']
}

drop name 

ds id, not 

rename (`r(varlist)') (v=)

reshape long v, i(id) j(which) string 

reshape wide v, i(which) j(id) string

rename v* *

forval j = 1/`N' {
    tokenize "`which`j''" 
    label var `1' "`2'"
}
 

d

list

Comment

Iulian Ihnatov

Join Date: Mar 2016
Posts: 10

21 Mar 2021, 15:52

Originally posted by Nick Cox View Post

sxpose2 isn't mine.

More general code. This hinges on the fact that in the example the identifiers could be legal variable names.

Code:

clear
input str7 id str12 name si_dct rcur_deb tot_sum sf_zz
"c603" "label_123" 234 367 111 898
"c775_02" "label_3342" 34 67 341 222
"c334_2" "label_565" 24 37 121 254
"c091" "label_4456" 23 36 116 345
end

local N = _N
forval i = 1/`N' {
local which`i' = id[`i'] + " " + name[`i']
}

drop name

ds id, not

rename (`r(varlist)') (v=)

reshape long v, i(id) j(which) string

reshape wide v, i(which) j(id) string

rename v* *

forval j = 1/`N' {
tokenize "`which`j''"
label var `1' "`2'"
}


d

list

Thank you again and I apologize for the confusion!

Credits:
sxpose, Nicholas J. Cox, Durham University
sxpose2, Stephan Huber, Hochschule Fresenius

Announcement