wildcards in replace command

Catherine Perry

Join Date: Jul 2014

Posts: 19
#1

wildcards in replace command

25 Jul 2014, 05:03

Dear all, a question like this was asked before on Statalist by another user. The answer posted then was to use egen anypsych = rowtotal(psych1-psych42). My problem is similar, but that answer does not solve it. I want to create a variable fruit from a food code, where, for example, an apple = 120243 (the code comes from nutritional software). I also have a string variable describing what each food is, so I looked at possibly using regexm. As I have just reshaped my data, there are 171 different food code variables, and that is where I wanted to use a wildcard expression. If this post needs clarification, please say.
Joe Canner

Join Date: Mar 2014

Posts: 580
#2

25 Jul 2014, 07:08

Yes, your question needs more clarification. Please show a sample of your data and what you would like it to look like after processing.

Also, as per Statalist etiquette (see the FAQ), you should consider using your full name. You can do this by going to "Contact Us" and requesting that your username be changed.
Comment
Catherine Perry

Join Date: Jul 2014

Posts: 19
#3

29 Jul 2014, 09:08

Hi Joe, thank you for your reply, and apologies for my late response. I have a nutritional dataset that is in long format because individual foods are listed. For example, if you ate 20 items over the course of a day, each of those items is listed, and this is done for three days. I know how to reshape this data and have done so. However, when I try to use the egen command to do an rsum of, e.g., apples or even kilocalories, I would have to list out egen sum_kcal = rsum(kcal1 kcal2 ... kcal171), whereas I would prefer to use egen sum_kcal = rsum(kcal*). I have a lot of recoding like this to do, so a quicker command would be great.
Comment
M. Cleves

Join Date: Jun 2014

Posts: 21
#4

29 Jul 2014, 09:15

CPerry,

This is easy if the variables are in order in your dataset. For example, if kcal1 kcal2 ... kcal171 are in order, then you can simply give the first and last variable names with a dash between them:
. egen sum_kcal = rsum(kcal1-kcal171)

Mario
Comment
Catherine Perry

Join Date: Jul 2014

Posts: 19
#5

29 Jul 2014, 09:27

Hi Mario, yes, that is another option I tried. However, in the dataset the variables are ordered as kcal1 carbohydrate1 fat1 protein1 ... kcal2 carb2 fat2 protein2 ... kcal171 carb171 fat171 protein171. If there is a quick way to get all of kcal1 to kcal171 beside each other, I would appreciate a command for that too.
Comment
M. Cleves

Join Date: Jun 2014

Posts: 21
#6

29 Jul 2014, 09:29

You can try -aorder-; that may work.

Mario
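
Note: a minimal sketch of an alternative that also puts the kcal variables side by side, on toy data with hypothetical variable names interleaved as described in #5. It relies on -order- moving the listed variables to the front while keeping their current relative order, so that a first-last range then picks up only the kcal variables.

```stata
* Toy data interleaved as in post #5; variable names are hypothetical
clear
set obs 2
forvalues i = 1/3 {
    generate kcal`i' = 100 * `i'
    generate fat`i'  = 10 * `i'
}
* Move every kcal variable to the front, keeping their current relative order
order kcal*
describe, simple
* The range kcal1-kcal3 now contains only kcal variables
egen sum_kcal = rowtotal(kcal1-kcal3)
```

This only works as a range if the kcal variables already appear in numeric order relative to one another, as they do after a reshape like the one described.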
Comment
Catherine Perry

Join Date: Jul 2014

Posts: 19
#7

29 Jul 2014, 09:33

Thank you, Mario and all who replied. I wasn't sure exactly what I needed, but using aorder has grouped them all together, making it much easier to create my summary variables. Best wishes
Catherine
Comment
Nick Cox

Join Date: Mar 2014

Posts: 33584
#8

29 Jul 2014, 09:58

I don't understand the difficulty here. This kind of statement should work fine, regardless of the order or position of the variables:

Code:

egen sum_kcal= rowtotal(kcal*)

(For some reason, Catherine is using the older syntax rsum() from Stata 8 and earlier, but that should not be material here. Conversely if Catherine is using an old version, she should tell us that: see FAQ Advice.)
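
Note: a minimal sketch of the point above, on toy data with hypothetical variable names. The kcal variables are deliberately interleaved with others to show that the wildcard does not depend on their position in the dataset.

```stata
* Toy data: kcal and fat variables interleaved; names are hypothetical
clear
set obs 3
forvalues i = 1/3 {
    generate kcal`i' = 100 * `i'
    generate fat`i'  = 10 * `i'
}
* kcal* expands to kcal1 kcal2 kcal3 wherever they sit in the dataset
egen sum_kcal = rowtotal(kcal*)
* sum_kcal is 600 in every row (100 + 200 + 300)
list kcal* sum_kcal
```

One thing to watch: a new variable whose name also starts with kcal (say kcal_total) would be matched by kcal* in later commands, so a prefix such as sum_ is the safer choice.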
Comment
M. Cleves

Join Date: Jun 2014

Posts: 21
#9

29 Jul 2014, 10:22

Correct Nick. I did not even check to see if

Code:

egen sum_kcal= rowtotal(kcal*)

worked. Oops. Although I now remember that I have used wildcards before in this context.

Mario
Comment
Nick Cox

Join Date: Mar 2014

Posts: 33584
#10

29 Jul 2014, 10:30

Mario: Agreed. Indeed the puzzle is posed by Catherine's statements that it doesn't work; but presumably there was some other error.
Comment
Catherine Perry

Join Date: Jul 2014

Posts: 19
#11

30 Jul 2014, 11:27

Hi Nick and Mario, thank you for the replies. Yes, I had tried rowtotal before I posed the question; however, I wanted to use rsum, and that is when the wildcards did not work. I also want to get the mean kcal: egen mean_kcal = mean(energykcal*). This syntax does not work; however, with Mario's aorder command I can use egen mean_kcal = mean(energykcal1-energykcal171). Finally, I wanted to do this for only one food group at a time, e.g. biscuits. While in long format, I kept just the biscuits group in the dataset with all of its nutritional information, then I switched to wide, used aorder, and computed the mean. I am using Stata SE version 12. It is a very time-consuming process, as there are many food groups and I need to do this for all of them, but aorder has helped in one aspect of the analysis. There are probably far more efficient methods of doing the same, though.
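
Note: a hedged sketch of one possibly quicker route for the per-food-group step, assuming the long-format data has one row per food item. The variable names id, foodgroup and kcal are hypothetical, not taken from the actual dataset. It computes the summaries for all food groups at once with -bysort-, without keeping one group at a time or reshaping to wide.

```stata
* Hypothetical long-format data: one row per food item eaten
clear
input id str10 foodgroup kcal
1 "biscuits" 120
1 "biscuits" 80
1 "fruit"    50
2 "biscuits" 100
2 "fruit"    70
2 "fruit"    30
end
* Mean and total kcal per person within each food group, no reshape needed
bysort id foodgroup: egen mean_kcal = mean(kcal)
bysort id foodgroup: egen sum_kcal  = total(kcal)
list, sepby(id)
```

Whether this matches the analysis depends on what the mean should be taken over (items, days, or people), which the thread does not specify.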
Comment
Rich Goldstein

Join Date: Mar 2014

Posts: 4257
#12

30 Jul 2014, 12:19

Why did you want to use "rsum"? As Nick explained, this is old syntax (see #8 above).
Comment
Nick Cox

Join Date: Mar 2014

Posts: 33584
#13

30 Jul 2014, 13:29

We need to see exact code and precise explanations of what doesn't work.
Comment
M. Cleves

Join Date: Jun 2014

Posts: 21
#14

31 Jul 2014, 06:18

Rich: -rsum- with wildcards works in Stata 13.1:

Code:

set obs 3
foreach nn of numlist 1 2 3 4 5 6 7 8 9 10 {
    gen myvar`nn'=uniform()
}
egen myvar=rsum(myvar*)
cl
Comment
Catherine Perry

Join Date: Jul 2014

Posts: 19
#15

31 Jul 2014, 10:13

Hi Nick, yes, I think it was all down to using the old function rsum. I have moved to rowtotal and rowmean, and they work with the wildcard (*). Thank you all very much for your time and help on this matter.
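
Note: a minimal sketch contrasting the row-wise egen function rowmean() with mean(), on toy data with hypothetical variable names. egen's mean() takes an expression and averages it over observations, so a wildcard is not valid there, and a dash inside it is read as subtraction rather than as a variable range.

```stata
* Toy wide data; variable names are hypothetical
clear
set obs 2
generate energykcal1 = 100
generate energykcal2 = 200
generate energykcal3 = 300
* Row-wise mean across the matching variables: 200 in each row
egen mean_kcal = rowmean(energykcal*)
* mean() takes an expression, so this is the mean over observations
* of the difference energykcal1 - energykcal3 (here -200), not a row mean
egen diff_mean = mean(energykcal1-energykcal3)
list
```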
Comment
