  • how to generate columns from comma separated string variables

    Hello,

    I have responses from a questionnaire in which several questions are "select all that apply." The answers to each such question have been entered into Stata as a single string variable, with the selected options separated by commas.

    Is there a way to generate a new variable for each possible response, so that each value in the cell ends up in the variable that corresponds to it?

    For example, we have a list of activities numbered 1-11. A single cell may contain any combination of these numbers, such as (1,2,5,7,9).

    We tried the command "split activities, parse(,) gen(activity)". It produces variables activity1 through activity11, but the numbers are filled in by position rather than by value (i.e., 1 goes into activity1 and 2 into activity2, but then 5 goes into activity3 rather than activity5).

    Is there a way to do this? Thanks in advance for any advice.

  • #2
    I think you want something like this:

    Code:
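    clear
    set more off
    
    * Example data: one comma-separated string per respondent
    input ///
    str25 answer
    "1,2,5,7,9"
    "5,3,9,2"
    end
    
    list
    
    * Split on commas into numeric variables activity1, activity2, ...
    split answer, parse(,) destring gen(activity)
    gen id = _n
    
    * Go long: one row per respondent and selected activity
    reshape long activity, i(id) j(act)
    drop act answer
    drop if missing(activity)
    
    * Go wide again, this time indexed by the activity code itself
    gen act = 1
    reshape wide act, i(id) j(activity)
    
    list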
    (Data in long form is usually more useful than it is in wide form.)
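
    If the set of possible codes is known in advance, a variant without reshape may also work. This is only a sketch, assuming the codes are the integers 1 to 11 as in the question and the string variable is called answer as in the example above; the names padded and act1 to act11 are illustrative. Note that it yields 0/1 indicators, whereas the reshape approach yields 1/missing.

    Code:
    clear
    input str25 answer
    "1,2,5,7,9"
    "5,3,9,2"
    end
    
    * Remove blanks and wrap the string in commas so that every code is
    * delimited on both sides; this stops "1" from matching inside "10" or "11"
    gen padded = "," + subinstr(answer, " ", "", .) + ","
    
    * One indicator per possible code: 1 if selected, 0 otherwise
    forvalues k = 1/11 {
        gen byte act`k' = strpos(padded, ",`k',") > 0
    }
    
    drop padded
    list

    An empty answer gives 0 in every indicator here, so respondents who skipped the question would need to be set to missing separately if that distinction matters.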

    You should:

    1. Read the FAQ carefully.

    2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

    3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

    4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.



    • #3
      Alternatively, see this discussion here: http://www.statalist.org/forums/foru...rds-in-phrases



      • #4
        I get an error after running "reshape long activity, i(id) j(act)":
        (note: j = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
        I/O error writing .dta file
        Usually such I/O errors are caused by the disk or file system being full.
        r(693);

        May I ask for advice on how to deal with it?

        Thank you.
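
        One possible reading of r(693) at this step: reshape appears to write intermediate data to temporary files, so the error may mean that the drive holding Stata's temporary directory is full or not writable. A minimal sketch of what could be checked, assuming the dataset at this point holds id and the variables activity1 to activity15 created by split (the 15 is taken from the note above). Everything else is dropped, so this should be run on a copy of the data.

        Code:
        * Show the directory Stata uses for temporary files
        display "`c(tmpdir)'"
        
        * Keep only what the reshape needs and shrink storage types
        keep id activity1-activity15
        compress
        
        reshape long activity, i(id) j(act)

        If the temporary directory turns out to be the cause, the usual remedies are freeing space on that drive or starting Stata with a different temporary directory.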





        In reply to the code originally posted by Roberto Ferrer in #2.
