  • Reshape Wide to Long Data with Linked Variables

    Hello,

    I hope everyone had a nice Christmas and a good start to the New Year.

    I apologize in advance: I am still learning Stata and am having some issues with the reshape command that I hope the community can help me resolve.

    My data set was exported from Qualtrics, an online survey platform. The export is fairly clean and needs only some minor recoding. The issue I am having, however, is that I have 4 groups of questions, and each group has a fictitious name associated with it. Each group has 8 questions. See a sample of my data set below:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(g1 g2 g3 g4 s1 s2 s3 s4 age gender race gname sname)
    1 1 1 2 1 1 2 5 37 0 0 1 0
    2 2 3 4 1 1 5 5 25 0 0 0 1
    2 2 4 3 1 2 4 4 37 1 0 1 0
    2 4 4 2 3 4 4 4 25 0 0 1 0
    2 2 3 4 1 2 2 4 36 1 1 1 0
    2 3 2 2 4 2 2 4 39 0 0 1 0
    4 4 4 4 2 2 4 4 59 1 0 1 0
    2 2 3 2 1 1 1 2 24 0 0 1 1
    2 4 1 5 2 2 1 5 23 2 0 1 0
    5 4 5 4 5 4 4 5 28 0 0 1 1
    1 3 3 1 1 2 4 5 67 1 0 0 1
    4 3 5 4 5 4 5 3 29 0 0 0 1
    1 3 1 2 2 2 3 4 59 1 0 1 1
    4 4 4 2 1 1 1 2 60 1 1 1 0
    2 4 2 2 2 2 4 4 39 0 0 0 1
    3 4 2 4 1 1 1 5 37 0 2 1 1
    2 2 2 2 4 4 4 2 22 1 0 0 0
    1 1 1 1 1 1 1 4 34 2 0 1 0
    3 3 3 1 1 1 1 2 35 0 0 1 0
    2 2 2 2 1 1 1 1 34 0 0 0 0
    4 4 4 2 4 4 5 3 31 1 0 1 0
    2 3 4 4 2 3 4 3 38 0 0 0 0
    5 5 2 3 4 1 2 3 32 1 0 1 0
    3 3 3 3 2 3 3 3 29 0 1 0 0
    1 5 1 1 1 3 3 5 47 0 1 1 1
    3 3 4 1 3 3 4 2 35 0 0 1 1
    2 2 2 2 2 2 2 4 34 0 0 1 0
    1 2 1 2 1 1 3 4 35 1 2 0 0
    1 1 1 2 1 2 2 2 23 0 0 1 0
    end
    How can I reshape this to long format when the last two variables, 'gname' and 'sname', need to be associated only with their respective blocks of questions, i.e. 'g1'-'g4' and 's1'-'s4'? Basically, I want to reshape the data so that each row is not one participant but one individual decision, with a column called "name", I think.

    Thanks for the help in advance!

  • #2
    I am not completely clear what you want or even how far it is a good idea.

    This may help, however.

    Code:
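    * respondent identifier, needed for i() in reshape
    gen id = _n
    * prefix these variables with an underscore: gname becomes _gname, and so on
    rename (gname sname gender age race) _=
    * one row per respondent and question number; g and s remain separate columns
    reshape long g s, i(id) j(question)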
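
    If the aim is one row per individual decision with a single name column, one possible route is to reshape twice: first over blocks, then over questions. The following is only a sketch under assumptions. It uses the first two rows of the -dataex- example, and the names resp, name, block and question are illustrative choices, not variables from the original data. It starts from the raw wide data rather than from the result of the reshape above.

```stata
* First two rows of the -dataex- example, for illustration only
clear
input byte(g1 g2 g3 g4 s1 s2 s3 s4 age gender race gname sname)
1 1 1 2 1 1 2 5 37 0 0 1 0
2 2 3 4 1 1 5 5 25 0 0 0 1
end

* Respondent identifier
gen long id = _n

* Give block-specific variables a common stub plus a block suffix
* (resp and name are hypothetical names chosen for this sketch)
rename (gname sname) (name_g name_s)
forvalues q = 1/4 {
    rename (g`q' s`q') (resp`q'_g resp`q'_s)
}

* Step 1: long over blocks, so that each name travels with its own block
reshape long resp1 resp2 resp3 resp4 name, i(id) j(block) string

* Step 2: long over questions, giving one row per decision
reshape long resp, i(id block) j(question)

* Tidy the block label ("_g" becomes "g", "_s" becomes "s")
replace block = subinstr(block, "_", "", 1)

list id block question resp name age gender race, sepby(id block)
```

    After the second reshape, each respondent contributes one row per block and question; name holds gname for the g rows and sname for the s rows, while age, gender and race are simply repeated within respondent.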



  • #3
    Basically, my data structure has a set of controls, [age, race, gender, abi, exp, fexp], where [gender] and [race] are factor variables. I have a set of survey questions whose responses are given on a Likert scale (1-5). The survey questions are repeated across 4 blocks. Call the blocks G, H, S, C. So there are 32 questions in total. Each block was randomly assigned a 'name', stored in the variables [gname, sname, cname, hname], each of which is associated only with its corresponding block, e.g. 'hname' relates to block H, and so on. How would I structure this data set? Right now it is in wide format, and I am trying to decide on the most appropriate way to index it in long format. The questions are obviously the dependent variables, and the independent variables are the names and the set of controls.
