  • problem generating a summed variable

    hi all,

    I am pretty new to Stata and having difficulty figuring out the syntax for scoring a questionnaire so I thought I would ask here.

    I have an instrument with 6 questions. For each question, I have converted the answers from string to a numerical value (example below)

    generate brs_6=5 if itendtotakealongtimetogetoverset=="Strongly Disagree"
    replace brs_6=4 if itendtotakealongtimetogetoverset=="Disagree"
    replace brs_6=3 if itendtotakealongtimetogetoverset=="Neutral"
    replace brs_6=2 if itendtotakealongtimetogetoverset=="Agree"
    replace brs_6=1 if itendtotakealongtimetogetoverset=="Strongly Agree"

    I would like to generate a new variable brs_total that sums the scores from question 1 through 6 (brs_1 to brs_6) for each observation.

    I tried to use something like

    foreach recordid {
    generate brs_score = total (brs_1,brs_2,brs_3,brs_4,brs_5,brs_6)}

    but this was definitely wrong.

    It seems a deceptively easy task but I am unable to figure it out.

    I would appreciate any and all guidance on this.

    Thank you in advance!

  • #2
    You don't need a loop for this. Just the following may suffice:
    Code:
    generate brs_score = brs_1 + brs_2 + brs_3 + brs_4 + brs_5 + brs_6
    Note that for any particular observation (row), if any of the six component scores is missing, brs_score will also be missing. If you would like missing values to be treated as zeros, you could instead do
    Code:
    egen brs_score = rowtotal(brs_1 brs_2 brs_3 brs_4 brs_5 brs_6)
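    To make the difference concrete, here is a minimal sketch on made-up data (the three rows and the names brs_sum, brs_rowtotal, brs_rowtotal_m and brs_answered are illustrative assumptions, not part of the original data set). It also shows the missing option of rowtotal(), which leaves the total missing when every item is missing, and rownonmiss(), which counts the items actually answered.

    ```stata
    * Toy data (hypothetical): row 2 skips item 3, row 3 skips every item
    clear
    input brs_1 brs_2 brs_3 brs_4 brs_5 brs_6
    3 4 2 5 1 3
    3 4 . 5 1 3
    . . . . . .
    end

    * Plain addition: one missing component makes the whole total missing
    generate brs_sum = brs_1 + brs_2 + brs_3 + brs_4 + brs_5 + brs_6

    * rowtotal(): missing components count as zero, so an all-missing row scores 0
    * (brs_1-brs_6 relies on the items being adjacent in variable order)
    egen brs_rowtotal = rowtotal(brs_1-brs_6)

    * rowtotal() with the missing option: an all-missing row stays missing
    egen brs_rowtotal_m = rowtotal(brs_1-brs_6), missing

    * rownonmiss(): number of items answered in each row
    egen brs_answered = rownonmiss(brs_1-brs_6)

    list, noobs
    ```

    Which behaviour is appropriate depends on the instrument's scoring rules; a count such as brs_answered can help decide whether a partially completed questionnaire should be scored at all.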
    Last edited by Hemanshu Kumar; 06 Feb 2025, 18:57.

    • #3
      Use the egen function rowtotal() instead of total(), and separate the variable names with spaces, not commas.
      Code:
      egen brs_score = rowtotal(brs_1 brs_2 brs_3 brs_4 brs_5 brs_6)

      • #4
        Furthermore, you may want to use encode to create your numeric variables, like so:
        Code:
        label define myscale 1 "Strongly Agree" 2 "Agree" 3 "Neutral" 4 "Disagree" 5 "Strongly Disagree"
        
        encode itendtotakealongtimetogetoverset, generate(brs_6) label(myscale)
        Aside from being more concise code, this has the advantage that the resulting variable has value labels attached to the numbers. This can be very useful when graphing (e.g. making bar plots) or tabulating the variable.
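        Defining the label first matters because encode, left to itself, numbers the categories in alphabetical order. A small sketch on made-up data (the variable names answer, alpha_code and scale_code are illustrative assumptions) shows the difference. The noextend option is added here as a safeguard: it makes encode refuse to run if a string value is not in the label, instead of quietly adding a new code for it.

        ```stata
        * Toy data (hypothetical): one response per category, in arbitrary order
        clear
        input str20 answer
        "Strongly Agree"
        "Neutral"
        "Disagree"
        "Strongly Disagree"
        "Agree"
        end

        * No predefined label: codes follow alphabetical order of the strings
        encode answer, generate(alpha_code)

        * Predefined label: codes follow the scale order given in the label
        label define myscale 1 "Strongly Agree" 2 "Agree" 3 "Neutral" 4 "Disagree" 5 "Strongly Disagree"
        encode answer, generate(scale_code) label(myscale) noextend

        * Compare the underlying numbers, then see the labels in a table
        list answer alpha_code scale_code, nolabel noobs
        tabulate scale_code
        ```

        Note that this label gives 5 to "Strongly Disagree", which matches the reverse-scored item in the original post; an item scored in the other direction would need a label with the numbers reversed.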

        • #5
          Thank you all very much! Your recommendations worked!

          • #6
            As a slight polish to Hemanshu Kumar's suggestion in #4, since the response options are those of the classic 5-point Likert scale, I would choose to name the value label likert5 instead of myscale. That way, when you come back to the data set after a period of absence, you can immediately identify all of the 5-point Likert scale items at a glance in the output of -describe- or -ds, has(vallabel likert5)-. I always like to give things descriptive names when possible.
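            As a rough illustration of that lookup, the sketch below builds a tiny made-up data set (the variables, their values and the local macro likert_items are all hypothetical), attaches likert5 to two items, and then uses the variable list that -ds- leaves behind in r(varlist) to score them.

            ```stata
            * Toy data (hypothetical): two Likert items and one unrelated variable
            clear
            set obs 3
            generate byte brs_1 = _n
            generate byte brs_2 = 4 - _n
            generate age = 30 + _n

            * Attach the descriptively named value label to the Likert items only
            label define likert5 1 "Strongly Agree" 2 "Agree" 3 "Neutral" 4 "Disagree" 5 "Strongly Disagree"
            label values brs_1 brs_2 likert5

            * List every variable that carries the likert5 value label
            ds, has(vallabel likert5)

            * Save the list before another command overwrites r()
            local likert_items `r(varlist)'

            * Reuse the list, for example to compute the total score
            egen brs_score = rowtotal(`likert_items')
            list, noobs
            ```

            Because the local macro only lives for the duration of a do-file run, the lines should be executed together rather than one at a time.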
