how to generate columns from comma separated string variables

Janet Yang

Join Date: Jul 2015

Posts: 6
#1


28 Jul 2015, 16:29

Hello,

I have responses from a questionnaire in which several questions are "select all that apply." Each of these questions has been entered into Stata as a single string variable, with the selected options separated by commas.

Is there a way to generate new columns for each unique response in the cell, and have the values appear in their specific newly generated columns?

For example, we have a list of activities numbered 1-11. A single cell may contain any combination of these numbers, such as (1,2,5,7,9).

We tried the following command: "split activities, parse(,) gen(activity)". It produces columns named activity1 through activity11, but the numbers simply fill the columns in order of appearance (i.e., 1 goes in activity1 and 2 in activity2, but then 5 appears in activity3 rather than in activity5, and so on).
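The behaviour can be reproduced with a small made-up dataset. This is only a sketch: the two rows of data are invented for illustration, and the variable name activities is taken from the command quoted above.

```stata
* Made-up data for illustration; only the variable name comes from the command above
clear
input str25 activities
"1,2,5,7,9"
"5,3,9,2"
end

* split fills activity1, activity2, ... by position, not by the value of the code
split activities, parse(,) gen(activity)
list
```

In the first row the value 5 lands in activity3, because it is the third item in the list.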

Is there a way to do this? Thanks in advance for any advice.
Roberto Ferrer

Join Date: Apr 2014

Posts: 449
#2

28 Jul 2015, 17:02

I think you want something like this:

Code:

clear
set more off

input ///
str25 answer
"1,2,5,7,9"
"5,3,9,2"
end

list

split answer, parse(,) destring gen(activity)

gen id = _n
reshape long activity, i(id) j(act)
drop act answer
drop if missing(activity)

gen act = 1
reshape wide act, i(id) j(activity)

list

(Data in long form is usually more useful than it is in wide form.)
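If only the indicator columns are wanted, a loop over the possible codes is another option, and it skips both reshape steps. What follows is a minimal sketch, assuming the codes run from 1 to 11 as in the question and that the variable is called answer as in the example; padded and the act prefix are just names chosen here.

```stata
* Sketch: one 0/1 indicator per activity code, built without reshape.
* Assumes codes run from 1 to 11 and the string variable is named answer.
clear
input str25 answer
"1,2,5,7,9"
"5,3,9,2"
end

* Strip blanks and wrap the list in commas so every code is delimited on both sides
gen padded = "," + subinstr(answer, " ", "", .) + ","

forvalues k = 1/11 {
    * Searching for ",k," keeps code 1 from matching inside 10 or 11
    gen byte act`k' = strpos(padded, ",`k',") > 0
}

drop padded
list
```

Unlike the reshape version, this codes unselected activities as 0 rather than missing, and it creates all 11 columns even when a code never occurs in the data.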

You should:

1. Read the FAQ carefully.

2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.
Jorrit Gosens

Join Date: Jan 2015

Posts: 1019
#3

29 Jul 2015, 01:25

Alternatively, see this discussion here: http://www.statalist.org/forums/foru...rds-in-phrases
Aga Smith

Join Date: Jan 2021

Posts: 8
#4

11 Feb 2021, 09:39

I am getting an error after running "reshape long activity, i(id) j(act)":
(note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
I/O error writing .dta file
Usually such I/O errors are caused by the disk or file system being full.
r(693);

May I ask for advice on how to deal with it?

Thank you.
