  • Randomized items

    Dear Stata experts,

    I have recently finished collecting data from an online platform with an embedded Qualtrics survey. My issue concerns one particular matrix-type question with 10 sub-items as rows, which were presented in randomized order and each assigned a Y/N value by the respondents (column).

    Because of the randomization, the same items were presented to each respondent in a different order. The Y/N values in any one column (one of the ten columns resulting from the split) therefore pertain to different items for different respondents. The order in which the items were displayed to each respondent is provided in a summary column (all items separated by vertical bars).
    I illustrate the structure of the data below:

    Respondent no.  Q1_1  Q1_2  Q1_3  ....  Q3_DO (display order)
    1               Yes   Yes   No          I find it relaxing | I am too lazy to do it | other
    2               Yes   No    Yes         I am too lazy to do it | Other | I find it relaxing
    ....

    My goal is to reorganize the respondents' answers so that each Y/N value ends up in the column for its corresponding item, consistently across all observations. The sub-items currently listed in the DO column would then serve as the labels of separate columns.
    I do not know how to approach this. I have tried several things (first splitting the DO variable into multiple variables, concatenating each item with its related value by order of display, possibly reshaping the data based on these inputs) but unfortunately failed.
    I would be very grateful for your help.
    With kind regards
    Agnieszka
    Post-Doc researcher
    CBS, Denmark

  • #2
    [code]
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float respondent str3(q1_1 q1_2 q1_3) str54 display_order
    1 "yes" "yes" "no" "i find it relaxing | i am too lazy to do it | other"
    2 "yes" "no" "yes" "i am too lazy to do it | other | i find it relaxing"
    end

    split display_order, gen(sequence) parse("|")
    foreach v of varlist sequence* {
    replace `v' = trim(itrim(`v'))
    }
    drop display_order
    reshape long sequence q1_, i(respondent) j(_j)
    by respondent (sequence), sort: replace _j = _n
    reshape wide
    [code]
    Note: this code is brittle, in the sense that minor typos, misspellings, inconsistent capitalization or spacing, and extra or missing blanks in the "display order" variable will break it. I don't think it is feasible to write code that overcomes those limitations of the data, so you need to clean that variable thoroughly to ensure it is 100% internally consistent and correct before using this.
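
    As a rough illustration of that cleaning step, the sketch below normalizes case and blanks in the long layout and then checks that every item occurs exactly once per respondent. It assumes the example variable names used above and uses the mixed capitalization ("Other" vs "other") from the original question; it will not catch misspellings, which still have to be fixed by hand.
    Code:
    * Sketch only: variable names follow the example data above
    clear
    input float respondent str3(q1_1 q1_2 q1_3) str54 display_order
    1 "yes" "yes" "no" "I find it relaxing | I am too lazy to do it | other"
    2 "yes" "no" "yes" "I am too lazy to do it | Other | I find it relaxing"
    end

    split display_order, gen(sequence) parse("|")
    drop display_order
    reshape long sequence q1_, i(respondent) j(_j)

    * Normalize case and remove leading, trailing, and repeated blanks
    replace sequence = lower(trim(itrim(sequence)))

    * Inspect the distinct items; each frequency should equal the number of respondents
    tabulate sequence

    * Stop with an error if any respondent has the same item more than once
    bysort respondent sequence: assert _N == 1

    If the tabulation shows any item with a frequency different from the number of respondents, that item is spelled inconsistently somewhere and needs correcting before the remaining steps are run.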

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
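
    For the example in this thread, the call could be as simple as the sketch below. The variable names and the two-observation range are assumptions based on the example data shown here, so substitute your own.
    Code:
    * Sketch only: list the variables and observations you want to share
    dataex respondent q1_1 q1_2 q1_3 display_order in 1/2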



    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    • #3
      Clyde's copy-and-paste went awry. I believe the following is what he intended his code to look like, and I've included a listing of the final data, because I ran this to see how it worked.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float respondent str3(q1_1 q1_2 q1_3) str54 display_order
      1 "yes" "yes" "no" "i find it relaxing | i am too lazy to do it | other" 
      2 "yes" "no" "yes" "i am too lazy to do it | other | i find it relaxing"
      end
      
      split display_order, gen(sequence) parse("|")
      foreach v of varlist sequence* {
          replace `v' = trim(itrim(`v'))
      }
      drop display_order
      reshape long sequence q1_, i(respondent) j(_j)
      by respondent (sequence), sort: replace _j = _n
      reshape wide
      Code:
      . list, clean noobs
      
          respon~t   q1_1                sequence1   q1_2            sequence2   q1_3   sequen~3  
                 1    yes   i am too lazy to do it    yes   i find it relaxing     no      other  
                 2    yes   i am too lazy to do it    yes   i find it relaxing     no      other
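
      Since the original question asked for the items to serve as column labels, one possible follow-up is sketched below. It assumes it is run directly after the code above, so that q1_1 to q1_3 and sequence1 to sequence3 exist and each sequence variable holds the same item text for every respondent. The upper bound of 3 matches this example and would be 10 for the full question.
      Code:
      * Sketch only: turn each item text into the label of its answer variable
      forvalues j = 1/3 {
          * the item text must be identical across respondents
          assert sequence`j' == sequence`j'[1]
          label variable q1_`j' "`=sequence`j'[1]'"
      }
      drop sequence*
      describe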



      • #4
        Thanks, William Lisowski. I see it was not the copy/paste that went awry; rather, I messed up the closing code delimiter by omitting the / character.

        Anyway, thanks for putting it right.



        • #5
          Dear Clyde and William,

          Thank you so much for the solution! I just tried it and it worked perfectly. Also, apologies for not using -dataex-; I will use it in the future, and I am exploring its features now.
          Have a nice day and again thank you
          Best
          Aga
