  • operating sums of columns

    I guess this is about using Stata as a very basic calculator, but I really need it to work that way.
    I have four variables, ses1-ses4, each recording the number of participants who took part in 1, 2, 3, or 4 sessions (e.g. 14 in ses4 means that 14 participants took part in all four sessions).
    What I want is the percentage of participants who took part in more than one session. I want to use that overall percentage as a baseline for my analysis of other things.
    I would like to simply sum all the numbers in the columns ses2, ses3 and ses4, and divide that by the sum over ses1-ses4.
    Of course, total(ses1 + ses2) will not work. Adding up total(ses1) + total(ses2) won't work either.
    I tried to create a new variable, like
    gen mult = ses2 + ses3 + ses4
    gen tot = ses1 + ses2 + ses3 + ses4
    then,
    total(mult)
    total(tot)
    But how can I get this percentage? The display function won't work as a calculator:
    display "Percentage = " total(mult) / total(tot)

    I used a super weird way below to get what I want, but it is so ugly that I really wonder whether there is a much more straightforward way to do it. There should be; this is just elementary-level math.
    sum ses2
    local numerator = r(sum)
    sum ses3
    local numerator = `numerator' + r(sum)
    sum ses4
    local numerator = `numerator' + r(sum)

    sum ses1
    local denominator = r(sum)
    sum ses2
    local denominator = `denominator' + r(sum)
    sum ses3
    local denominator = `denominator' + r(sum)
    sum ses4
    local denominator = `denominator' + r(sum)

    display "Percentage = " %3.2f =100*`numerator'/`denominator'

  • #2
    You have all the main ideas. Strictly, display is a command, not a function.

    I would put them together differently. Here is also another method using Mata. The toy dataset used is

    Code:
         +---------------------------------------+
         | ses1   ses2   ses3   ses4   num   den |
         |---------------------------------------|
      1. |    1      2      3      4     6    10 |
      2. |    1      2      3      4     6    10 |
      3. |    1      2      3      4     6    10 |
      4. |    1      2      3      4     6    10 |
      5. |    1      2      3      4     6    10 |
         +---------------------------------------+

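    Here is code to produce that dataset and the result, which is 60%, in two ways. Note that num in this toy example adds ses1 to ses3; for the question as posed, the numerator would be ses2 + ses3 + ses4.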


    Code:
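    clear
    set obs 5
    * toy data: ses1 = 1, ..., ses4 = 4 in every observation
    forval j = 1/4 {
        gen ses`j' = `j'
    }

    gen num = ses1 + ses2 + ses3
    gen den = num + ses4

    list

    * Method 1: summarize leaves each column sum in r(sum)
    su num, meanonly
    scalar sumnum = r(sum)
    su den, meanonly
    di 100 * sumnum / `r(sum)'

    * Method 2: Mata, summing over a view of the four variables
    mata: st_view(data = ., ., "ses1 ses2 ses3 ses4")
    mata: 100 * sum(data[., 1..3]) / sum(data)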
    Note that the generate commands are fragile to missing values. Something like this would ignore missing values:

    Code:
    gen num = cond(missing(ses1), 0, ses1) + cond(missing(ses2), 0, ses2) + cond(missing(ses3), 0, ses3)
    and egen's rowtotal() function ignores missing values by default too (it treats them as 0). The Mata method ignores missing values as well.
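
    As a minimal sketch of the rowtotal() route, applied to the quantity in the original question (ses2 to ses4 over ses1 to ses4): the variable names multi and overall, the scalar summulti, and the missing value planted in observation 2 are assumptions made up for illustration, not part of the data discussed in this thread.

    ```stata
    * Toy data as above, plus one missing value (hypothetical, for illustration)
    clear
    set obs 5
    forval j = 1/4 {
        gen ses`j' = `j'
    }
    replace ses3 = . in 2

    * rowtotal() treats missing values as 0, so no row total is lost
    egen multi   = rowtotal(ses2 ses3 ses4)
    egen overall = rowtotal(ses1 ses2 ses3 ses4)

    * Column sums via summarize, then the percentage
    summarize multi, meanonly
    scalar summulti = r(sum)
    summarize overall, meanonly
    display "Percentage = " %5.2f 100 * summulti / r(sum)
    ```

    With the plain generate commands, observation 2 would have missing row sums and would drop out of both column sums entirely; with rowtotal() only the missing ses3 value is skipped.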



    • #3
      Thanks! I guess this is how Stata works; I can't expect it to be a simple calculator. But this looks much better. I guess I should look up Mata. I really appreciate your help!
