How to tokenize a stored result macro

ericmelse

Join Date: May 2014

Posts: 434
#1

17 May 2025, 12:23

Dear Stata listers,

After running the community-contributed command uirt, like:

Code:

use alike, clear
qui uirt v*
di "`e(depvar)'"
* that results in:
v1 v2 v3 v4 v5 v6 v7 v8

I tried to tokenize the result macro e(depvar) that holds the names of dependent variables (items) separated by a space character, using:

Code:

tokenize e(depvar), parse()
dis `1'
v1 v2 v3 v4 v5 v6 v7 v8

which is not what I expect, i.e. the first (tokenized) word: v1
The help file states:
If parse() is not specified, parse(" ") is assumed, and string is split into words.
So I assume that my line of code is technically correct (specifying parse(" ") also produces the whole string of words).
Nevertheless, the first token holds the whole string and not the first word.

Can somebody explain what I am doing wrong here, or suggest a coding alternative that extracts the items from the stored result macro one by one (using a loop)?

http://publicationslist.org/eric.melse
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1389
#2

17 May 2025, 12:47

I think you just need

Code:

tokenize `e(depvar)', parse()

so that you can get

Code:

. dis "`1'"
v1

Last edited by Hemanshu Kumar; 17 May 2025, 12:52.
daniel klein

Join Date: Mar 2014

Posts: 3847
#3

17 May 2025, 12:52

Here's what happens: tokenize works on string literals. "e(depvar)" (quotes added for emphasis) is just one word; hence `1' evaluates to "e(depvar)". display evaluates its arguments; hence e(depvar) evaluates to v1 v2 ...
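The two steps described here can be reproduced without uirt. A minimal sketch, assuming the built-in auto dataset and regress as stand-ins (neither appears in the thread), used only so that e(depvar) is defined:

```stata
* Stand-in setup (assumption): regress on auto.dta instead of uirt on alike
sysuse auto, clear
quietly regress price mpg

* No macro quotes: tokenize receives the literal text e(depvar), which is one word
tokenize e(depvar)
display "`1'"     // prints the literal text: e(depvar)
display `1'       // becomes: display e(depvar), which display then evaluates
```

The quoted display shows what tokenize actually stored; the unquoted one hides it, because display evaluates the stored result.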

You probably want

Code:

tokenize `e(depvar)'

Edit: crossed with #2 which cuts straight to the solution.

Last edited by daniel klein; 17 May 2025, 12:54.
ericmelse

Join Date: May 2014

Posts: 434
#4

17 May 2025, 13:22

Dear Hemanshu & Daniel,

Thank you for your advice. I ran each suggestion, but the result is a rather mysterious digit 2 instead of v1:

Code:

. tokenize `e(depvar)'
. dis `1'
2
* and
. tokenize `e(depvar)', parse()
. dis `1'
2
* my check thereafter of the content of the macro:
. di "`e(depvar)'"
v1 v2 v3 v4 v5 v6 v7 v8

So, the result macro appears to be unchanged, naturally, but tokenize still is a disappointment for me.
Any other suggestions or explanation why this is happening?

http://publicationslist.org/eric.melse
ericmelse

Join Date: May 2014

Posts: 434
#5

17 May 2025, 14:03

While searching further, I got inspiration from a nine-year-old post on Stack Overflow, "Stata: How to delimit elements of a local macro", which helped me get code running with the desired result:

Code:

forvalues i=1/`e(N_items)' {
  local element `: word `i' of `e(depvar)''
  dis "`element'"
}
* which produces:
. forvalues i=1/`e(N_items)' {
  2. local element `: word `i' of `e(depvar)''
  3. dis "`element'"
  4. }
v1
v2
v3
v4
v5
v6
v7
v8

So, by using another uirt result macro e(N_items) the loop is controlled to stop at the last used variable/item (v8).
Now, I have the name (string) of each variable/item available for further use.
If anyone can think of a more elegant solution, well, I am interested to learn about it.

http://publicationslist.org/eric.melse
daniel klein

Join Date: Mar 2014

Posts: 3847
#6

17 May 2025, 14:23

Eric, display evaluates what you pass. When you pass v1, it will display the value of the first observation of v1, which is 2 in your data. tokenize works as expected. It's display that confuses you. Type

Code:

macro list

after tokenize to see the contents of local macros.
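The difference can be seen side by side. A minimal sketch, assuming the built-in auto dataset and the variable names price and mpg as stand-ins for the uirt items:

```stata
* Stand-in data (assumption): auto.dta instead of the poster's dataset
sysuse auto, clear
tokenize price mpg

display `1'       // evaluated as: display price, i.e. the value of price in observation 1
display "`1'"     // a string literal, so the name itself is shown: price
macro list        // lists all macros, including those set by tokenize
```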
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1389
#7

17 May 2025, 23:09

Eric, all you need to do is enclose the `1' in the display command in double quotes, as I did in #2, so that it is parsed as dis "v1" and thus displayed as a string. Without the quotes it is parsed as dis v1, which display interprets as explained in #6.

ericmelse

Join Date: May 2014
Posts: 434

#8

18 May 2025, 09:50

Dear Daniel & Hemanshu,
I really appreciate your effort, and I have followed your suggestions, now running a reproducible example:

Code:

webuse masc2, clear
qui uirt q*
di "`e(depvar)'"
tokenize `e(depvar)'
dis `1'
macro list

* Which results in:
. di "`e(depvar)'"
 q1 q2 q3 q4 q5 q6 q7 q8 q9
. dis `1'
1

. macro list
T_gm_fix_span:  1
S_level:        95
F1:             help advice;
F2:             describe;
F7:             save
F8:             use
S_ADO:          BASE;SITE;.;PERSONAL;PLUS;OLDPLACE
S_StataMP:      MP
S_StataSE:      SE
S_OS:           Windows
S_OSDTL:        64-bit
S_MACH:         PC (64-bit x86-64)
_9:             q9
_8:             q8
_7:             q7
_6:             q6
_5:             q5
_4:             q4
_3:             q3
_2:             q2
_1:             q1
S_FN:           https://www.stata-press.com/data/r19/masc2.dta
S_FNDATE:        1 Apr 2022 13:07

So, using dis `1' does not provide q1.
Next, I tried dis `_1', but that does not display anything.
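
On the `_1' attempt: macro list shows local macros with a leading underscore, so the entry _1 in the listing is the local macro named 1, while `_1' refers to a different local (named _1) that was never defined. A minimal sketch, with literal names standing in for the contents of e(depvar):

```stata
* Literal names used as a stand-in (assumption) for `e(depvar)'
tokenize q1 q2 q3
display "`1'"     // local macro 1, which macro list shows as _1
display "`_1'"    // no local named _1 was defined, so this expands to an empty string
```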

Next, I resorted to my fallback code:

Code:

forvalues i=1/`e(N_items)' {
  local element `: word `i' of `e(depvar)''
  dis "`element'"
}

* And this produces:
q1
q2
q3
q4
q5
q6
q7
q8
q9

So, I am happy that this does work but certainly I am a bit baffled about the particulars of this issue.

http://publicationslist.org/eric.melse

Hemanshu Kumar

Join Date: Mar 2015

Posts: 1389
#9

18 May 2025, 09:53

You just need

Code:

dis "`1'"

This is what I had suggested in #7 and #2. Sorry if it was not clear. You need to do this for the exact same reason that in your forvalues loop, you use dis "`element'" rather than dis `element'

Last edited by Hemanshu Kumar; 18 May 2025, 10:15.
Nick Cox

Join Date: Mar 2014

Posts: 35683
#10

18 May 2025, 10:08

Confession: On first reading this I hung back, guessing that uirt was doing something unusual and that it needed scrutiny of the code to find out what that was.

That was quite wrong, and indeed Hemanshu Kumar and daniel klein have pointed to the main issue, which is rather: what is display doing here?

I can go a tiny bit beyond their explanations.

After tokenize the local macro 1 contains the variable name q1. So the syntax

Code:

di `1'

evaluates first as

Code:

di q1

and that is interpreted as

Code:

di q1[1]

i.e. the value of q1 in the first observation

The developers of Stata decided that asking to display a variable is not an error, but nevertheless you just get shown the value of the variable in the first observation.

As you have provided a reproducible example (thanks!) the data can be checked:

Code:

. webuse masc2, clear
(Data from De Boeck & Wilson (2004))

. di q1
1

. di q1[1]
1

EDIT = #7

ericmelse

Join Date: May 2014
Posts: 434

#11

19 May 2025, 03:26

Dear Daniel & Hemanshu & Nick,

Thanks to you all for educating me about the intricacies of using tokenize to collect variable names from a stored result macro (in my case `e(depvar)' after running the command uirt).
I suppose my confusion originated from my long-term experience of using tokenize to set a series of numbers or text strings for follow-up use.
To wrap up this post, here is my code example covering everything discussed above:

Code:

* Set up
ssc install uirt, replace // Stata module to fit unidimensional Item Response Theory models

* Example
webuse masc2, clear
qui uirt q*
di "`e(depvar)'"

* Set tokens manually (to compare with the coding below)
tokenize "q1 q2 q3 q4 q5 q6 q7 q8 q9"
forvalues i = 1/9 {
    dis "`1'"
    macro shift
}

* Set tokens by using the stored result macro
tokenize `e(depvar)'
* Get (display) each variable name
* Note that the forvalues range maximum is set manually (i.e. 9)
forvalues i = 1/9 {
    dis "`1'" // get variable name
    macro shift
}

* Same as above but now using the uirt model items scalar to set the forvalues range maximum
forvalues i=1/`e(N_items)' {
  local element `: word `i' of `e(depvar)''
  dis "`element'"
}

* Get (display) the first value of each variable by using the stored result macro
tokenize `e(depvar)'
forvalues i = 1/9 {
    dis `1'
    macro shift
}
* Same as above but now with an explicit subscript for the first observation: [1]
tokenize `e(depvar)'
forvalues i = 1/9 {
    dis `1'[1]
    macro shift
}
* Same as above but now with a subscript for the second observation: [2]
tokenize `e(depvar)'
forvalues i = 1/9 {
    dis `1'[2]
    macro shift
}

http://publicationslist.org/eric.melse

Nick Cox

Join Date: Mar 2014

Posts: 35683
#12

19 May 2025, 03:32

Seeing macro shift was evocative because before Stata 7 it featured heavily in looping through lists in a macro or a series of macros.

I think I remember a post from Alan Riley (now naturally the President of StataCorp) pointing out that it is fairly inefficient and you are better off avoiding it. You would be pushed to notice the inefficiency in a problem of this size.
ericmelse

Join Date: May 2014

Posts: 434
#13

19 May 2025, 09:33

Dear Nick,
Maybe my coding is of the more humble type, or my projects are indeed too small to notice any inefficiency from using macro shift to cycle through the elements stored after tokenize.
Note that, instead of the very simple dis "`1'" in the example code below, I usually carry out all sorts of manipulations on the element(s) provided by tokenize.
But I am certainly interested to learn what more efficient code would look like than this code with macro shift:

Code:

* Set tokens manually
tokenize "q1 q2 q3 q4 q5 q6 q7 q8 q9"
forvalues i = 1/9 {
    dis "`1'"
    macro shift
}

http://publicationslist.org/eric.melse
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1389
#14

19 May 2025, 10:00

Actually, I didn't quite understand the reason to do a macro shift here. Couldn't we simply do

Code:

* Set tokens manually
tokenize "q1 q2 q3 q4 q5 q6 q7 q8 q9"
forvalues i = 1/9 {
    dis "``i''"
}

or am I missing something?
Nick Cox

Join Date: Mar 2014

Posts: 35683
#15

19 May 2025, 14:50

Hemanshu Kumar's code is essentially what I might do here.

Another useful approach is

Code:

local foo q1 q2 q3 q4 q5 q6 q7 q8 q9
local wc : word count `foo'
forval w = 1/`wc' {
    di "`: word `w' of `foo''"
}
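
A further variant, not posted in the thread, is foreach, which walks through the words of a list without tokenize or a counter. A minimal sketch, with the hypothetical local items standing in for the contents of e(depvar):

```stata
* Stand-in list (assumption); after uirt one could instead use: local items `e(depvar)'
local items q1 q2 q3 q4 q5 q6 q7 q8 q9
foreach v of local items {
    display "`v'"     // shows each name as text, one per line
}
```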