  • A faster way to do multiple operations?

    Hello everyone,

    I have 10 variables: 5 numeric (with answer choices 1 to 10) and 5 Likert-scale-like multiple-response questions. The numeric questions are in long format, so I have just 5 variables, whereas the multiple-response questions are in wide format, so for each answer choice I have a variable coded as either ticked or not ticked.

    I'd like to do the following and would really appreciate a faster way to do these:

    1) recode the multiple-response answers into scores based on a given set of guidelines (e.g. if the respondent ticks option A of multiple-response question 1, he gets a zero, and so on)

    2) All multiple responses have "other" option, and the coder is manually supposed to score those based on some guidelines. So should I ignore the "other" responses while scoring in part 1?

    3) add up the scores on all 10 questions to get a total self-efficacy score. I can use egen with rowtotal here but since respondents can choose more than one option, I am to pick the option with the highest score?

    4) If the respondent has NOT answered more than 3 of the 10 questions, I am to assign a missing value for the self-efficacy score.

    5) If the respondent misses 3 or fewer of the 10 questions, then I use the average of the questions he answered. So how do I incorporate the information in parts 4 and 5 while doing part 3?

    Additional Information: Some of the variables have missing values as well.
    Last edited by danishussalam; 31 Jan 2017, 01:09.

  • #2
    Just to note that 3) now seems covered by your later thread

    http://www.statalist.org/forums/foru...e-construction

    I will peel off 4) and 5). egen is your friend. The function rowmiss() counts missings, so you want, I think, to replace the result of rowmean() with missing if there are 4 or more missing.

    1) sounds like recode

    2) is up to you. Sometimes other categories are informative, sometimes not: I don't think we can tell you which applies to your data.
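
A minimal sketch of the rowmiss()/rowmean() idea for 4) and 5), not a definitive implementation. The variables q1-q10 and the toy data are hypothetical stand-ins for the ten scored items; the threshold of more than 3 missing comes from the question.

```stata
* Hypothetical toy data: q1-q10 stand in for the ten item scores.
clear
set obs 3
forvalues j = 1/10 {
    generate q`j' = `j'
}
* Respondent 2 skips 3 items; respondent 3 skips 4.
foreach j in 1 2 3 {
    replace q`j' = . in 2
}
foreach j in 1 2 3 4 {
    replace q`j' = . in 3
}

egen nmiss = rowmiss(q1-q10)      // number of missing items per row
egen score = rowmean(q1-q10)      // mean of the non-missing items
replace score = . if nmiss > 3    // 4 or more missing: no score

list nmiss score
```

Because rowmean() averages only the non-missing items, a respondent with 3 or fewer skipped questions automatically gets the average of the questions answered.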


    • #3
      Originally posted by Nick Cox
      Just to note that 3) now seems covered by your later thread

      http://www.statalist.org/forums/foru...e-construction

      I will peel off 4) and 5). egen is your friend. The function rowmiss() counts missings, so you want, I think, to replace the result of rowmean() with missing if there are 4 or more missing.

      1) sounds like recode

      2) is up to you. Sometimes other categories are informative, sometimes not: I don't think we can tell you which applies to your data.
      Thanks Nick,

      There has been some progress since I posted those questions, but I am still confused on a few points and would appreciate your thoughts.

      I have a total of 5 multiple-response questions in wide format. I created 5 dummy variables taking the value 1 if the person answered the question (ticked) and 0 if he did not (not ticked). Note that "not answered" is not recorded as a missing value: I don't want to replace it with a missing value, since my data already has some missing values and I wish to keep the two separate. Is there an efficient way to do this?

      I still don't see how to replace the final score with, say, a missing value if the person has skipped more than three questions. How do I find out whether a person has skipped more than three questions to begin with?


      • #4
        Danishussalam:
        you can simply code missing answers and other missingness differently (e.g. 999 and 9999); -label-ling comes in handy for distinguishing between the two categories of missing codes.
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Originally posted by Carlo Lazzaro
          Danishussalam:
          you can simply code missing answers and other missingness differently (e.g. 999 and 9999); -label-ling comes in handy for distinguishing between the two categories of missing codes.
          Right, but then egen with rowmax would return 999 instead of the actual score, right? That's why I initially chose to leave it as 0 (not ticked), together with a score of 0 for a response.
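
One possible way around this, sketched under assumptions rather than taken from the thread: convert the numeric codes to Stata's extended missing values (.a, .b) with mvdecode, which egen's row functions ignore. The variables s1-s3, the label name misslbl, and the toy data are hypothetical; the codes 999 and 9999 are those suggested in #4.

```stata
* Hypothetical toy data: s1-s3 are scores for three options of one question;
* 999 = not answered, 9999 = other missingness (codes suggested in #4).
clear
input s1 s2 s3
0    2    999
999  999  999
1    9999 0
end

* Turn the numeric codes into extended missing values, which egen ignores.
mvdecode s1-s3, mv(999 = .a \ 9999 = .b)
label define misslbl .a "not answered" .b "other missing"
label values s1-s3 misslbl

egen best = rowmax(s1-s3)   // highest score among non-missing options
list
```

The two kinds of missingness stay distinguishable (.a versus .b), while a genuine score of 0 is no longer confused with "not answered".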


          • #6
            Please follow http://www.statalist.org/forums/help#stata and post a data example using dataex (SSC). If your data are confidential, invent a realistic fake dataset. Then you can explain what you want and we can suggest code for what you want.


            • #7
              Originally posted by Nick Cox
              Please follow http://www.statalist.org/forums/help#stata and post a data example using dataex (SSC). If your data are confidential, invent a realistic fake dataset. Then you can explain what you want and we can suggest code for what you want.
              Hey Nick,

              You're right, I should have done that earlier. My apologies. I have attached an excerpt of the original dataset.

              The dataset has 5 numeric variables (sec11_1, sec11_2, sec11_3, sec11_4 & sec11_5) in long format and 5 multiple-response questions in wide format. They are easy to spot in the data file, as every numeric question has a follow-up multiple-response question.

              I am to calculate a score from these based on the following steps:

              1) recode individual responses into points for all questions
              2) add the points for all 10 questions to get a score. In case of multiple responses, I need to choose the response with the most points.

              This is what I've done so far.

              1) created dummy variables for all questions based on whether the respondent answered the question or not.
              2) used egen with rowmax to create a variable holding the highest-scoring response for each multiple-response question.
              3) manually assigned values to the "other" responses based on the individual responses.

              The only thing remaining is to add up the scores in each row to get one composite score. I understand I can do this with egen, rowtotal, but how do I incorporate the following additional information into the command?

              1) If the respondent skipped more than 3 questions, I replace the final composite score with a missing value.
              2) If the respondent skipped 3 or fewer questions, then I take the average of the questions he answered.

              Additional Information:

              - Some of the variables have missing values and I wish to keep them.
              - In Stata, multiple responses are encoded as ticked (1) or not ticked (0).
              - Some of the multiple-response options are worth 0 points when chosen, so scoring them mixes those observations with not ticked (0). How can I tackle that?
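
A minimal sketch of one way to keep a ticked 0-point option apart from an unticked one: give unticked options a missing score rather than 0, then take the row maximum. The variables optA-optC, the ptsA-ptsC scores, and the point values are all hypothetical and do not come from the actual scoring guidelines.

```stata
* Hypothetical toy data: three tick-box options (1 = ticked, 0 = not ticked).
clear
input optA optB optC
1 0 0
0 1 1
0 0 0
end

* Assumed scoring rule for illustration: A = 0 points, B = 1, C = 2.
* Unticked options become missing, so a ticked 0-point option stays distinct.
generate ptsA = cond(optA == 1, 0, .)
generate ptsB = cond(optB == 1, 1, .)
generate ptsC = cond(optC == 1, 2, .)

egen best = rowmax(ptsA ptsB ptsC)   // highest-scoring ticked option
list
```

A respondent who ticked only the 0-point option gets best equal to 0, whereas one who ticked nothing gets a missing value, which rowmiss() can then count as a skipped question.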
              Last edited by danishussalam; 31 Jan 2017, 05:38.


              • #8
                Sorry, but that is not what I asked for and the very section I referred you to explains why it is a bad idea here.

                Here it is again: http://www.statalist.org/forums/help#stata

                I'll add that data, column headers, or metadata in a spreadsheet don't make clear what your variable names, variable types, display formats, or value labels are in Stata.


                • #9
                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input int student_id float(treatment period) byte(sec11_1 sec11_1antpoints sec11_1apoints sec11_1antmoney sec11_1amoney) float sec11_1aother str83 sec11_1aotherdesc
                   4 0 0  3 1 0 0 0 0 ""                                        
                   4 0 1  5 0 0 1 0 0 ""                                        
                   5 0 1  8 0 1 0 1 0 ""                                        
                   5 0 0  9 0 0 0 1 0 ""                                        
                   6 0 0  5 0 0 1 0 0 ""                                        
                   6 0 1  5 0 0 1 0 0 ""                                        
                   7 0 0  5 0 0 1 0 0 ""                                        
                   7 0 1  7 0 0 0 1 0 ""                                        
                   8 0 0  2 0 0 1 0 0 ""                                        
                   8 0 1  6 0 0 1 0 0 ""                                        
                  12 0 0 10 0 0 1 0 0 ""                                        
                  12 0 1  1 0 0 1 0 0 ""                                        
                  13 0 1  2 0 0 1 0 0 ""                                        
                  13 0 0  5 0 0 1 0 0 ""                                        
                  15 0 1  8 0 0 0 1 0 ""                                        
                  15 0 0  6 0 0 1 0 0 ""                                        
                  20 0 0  5 0 0 0 1 0 ""                                        
                  20 0 1  3 1 0 0 0 0 ""                                        
                  21 0 1  5 0 0 1 0 0 ""                                        
                  21 0 0  5 0 0 0 0 0 ""                                        
                  24 0 0  6 0 0 0 0 0 ""                                        
                  24 0 1  7 0 1 0 0 1 "BECAUSE I WILL GET HIGH POINTS"          
                  26 0 1  2 0 0 1 0 1 "AM NOLONGER AT SCHOOL"                  
                  26 0 0  6 0 0 1 0 0 ""                                        
                  27 0 1  2 0 0 1 0 0 ""                                        
                  27 0 0  5 0 0 1 0 0 ""                                        
                  28 0 1  7 0 1 0 1 0 ""                                        
                  28 0 0  6 0 0 1 0 0 ""                                        
                  29 0 0  6 0 0 1 0 0 ""                                        
                  29 0 1  6 0 1 1 0 0 ""                                        
                  30 0 1  3 1 0 0 0 0 ""                                        
                  30 0 0  5 1 0 0 0 0 ""                                        
                  31 0 0  5 1 0 0 0 0 ""                                        
                  31 0 1 10 0 1 0 0 1 "I PASSED O'LEVEL ,A'LEVEL NEXT"          
                  32 0 1  3 0 0 1 0 0 ""                                        
                  32 0 0  6 0 0 1 0 0 ""                                        
                  33 0 0  5 0 0 1 0 0 ""                                        
                  33 0 1  2 0 0 1 0 0 ""                                        
                  34 0 1  5 0 0 0 1 0 ""                                        
                  34 0 0  . 0 0 0 0 1 "BECAUSE I MIGHT BE IN THE BUSINESS FIELD"
                  35 0 0  . 0 0 0 0 0 ""                                        
                  35 0 1 10 1 0 1 0 0 ""                                        
                  36 0 0  7 0 0 1 0 0 ""                                        
                  36 0 1  5 0 0 1 0 1 "WILL GO THROUGH MY EDUCATION"            
                  37 0 0  2 0 0 1 0 0 ""                                        
                  37 0 1  . . . . . 0 ""                                        
                  38 0 0 10 0 0 0 1 0 ""                                        
                  38 0 1  8 0 0 0 1 0 ""                                        
                  39 0 1  9 0 0 0 1 0 ""                                        
                  39 0 0  6 0 0 1 0 0 ""                                        
                  end
                  label values treatment treatment
                  label def treatment 0 "CONTROL", modify
                  label values period period
                  label def period 0 "Baseline", modify
                  label def period 1 "Midline", modify
                  label values sec11_1antpoints sec11_1antpoints
                  label def sec11_1antpoints 0 "NOT TICKED", modify
                  label def sec11_1antpoints 1 "TICKED", modify
                  label values sec11_1apoints sec11_1apoints
                  label def sec11_1apoints 0 "NOT TICKED", modify
                  label def sec11_1apoints 1 "TICKED", modify
                  label values sec11_1antmoney sec11_1antmoney
                  label def sec11_1antmoney 0 "NOT TICKED", modify
                  label def sec11_1antmoney 1 "TICKED", modify
                  label values sec11_1amoney sec11_1amoney
                  label def sec11_1amoney 0 "NOT TICKED", modify
                  label def sec11_1amoney 1 "TICKED", modify
                  label values sec11_1aother sec11_1aother
                  label def sec11_1aother 0 "NOT TICKED", modify
                  label def sec11_1aother 1 "TICKED", modify

                  I think that will suffice?
