Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • multiplying and summing quantity and price for 112 items

    I have a dataset which includes consumption quantity and price per unit for 112 food items. Therefore, each observation has 112 variables for quantity and 112 variables for price. I want to calculate total expenditures on food by multiplying each quantity consumed with its corresponding price. I could write 112 lines of syntax, but I was wondering whether there is a faster way of doing this? I tried using the foreach command but it seems to be limited to multiplying each variable with a constant instead of each variable with a different variable. The data is structured as follows:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(i1_1_5 i1_2_5 i1_3_5 i1_4_5 i1_5_5 i1_1_4) int(i1_2_4 i1_3_4 i1_4_4 i1_5_4)
      5 . . . .  8000 . . .    .
    3.5 . . . 2  7600 . . . 2500
      7 . . . .  9000 . . .    .
      4 . . . .  7600 . . .    .
      7 . . . .  9000 . . .    .
      5 . . . . 10000 . . .    .
      7 . . . .  8000 . . .    .
      7 . . . 7  9000 . . . 2000
      7 . . . 2     . . . . 2000
     10 . . . 4  9500 . . . 2600
    end
    I want to multiply i1_1_5 (quantity) with i1_1_4 (price) for each food item, so i1_2_5 multiplied by i1_2_4 and so on for each of the 112 items.

    Does anybody have any advice?

    Thanks



  • #2
    I have plenty of advice, starting with to use variable names that mean something! . As it is, anyone reading this (us, you in a week's time or a year's time, anybody else) has to learn that name suffix _4 means price and _5 means quantity. Use _p and _q or anything more instantly understandable and memorable. See help on rename groups

    If I understand you correctly, you want a rowwise sum over 112 products like this

    Code:
    gen foodexp = 0
    
    quietly forval j = 1/112 {
         replace foodexp = foodexp +  (i1_`j'_4 * i1_`j'_5) if  (i1_`j'_4 * i1_`j'_5) < .
    }
    It's important not to let any missings mess up the calculation. Your missings I take it are really zeros. (You could also reach for mvencode to fix them.)

    The trickiest command can be rewritten in various ways e.g.

    Code:
    replace foodexp = foodexp +  max(i1_`j'_4 * i1_`j'_5, 0)
    
    replace foodexp = foodexp +  min(i1_`j'_4 * i1_`j'_5, 0)
    It may seem odd that min() and max() give the same answer! But the principle is that both functions ignore missings, so given missing arguments, the largest or smallest other argument, here zero in both cases, is always returned.

    Much more rowwise technique at http://www.stata-journal.com/sjpdf.h...iclenum=pr0046
    Last edited by Nick Cox; 15 Apr 2016, 04:59.

    Comment


    • #3
      Great, thanks! Sorry for the variable names..

      Comment


      • #4
        Good, but better variable names are mostly for your benefit. This can bite us all. I have to revise a 2007 program the other day and I would have benefitted from commenting in 2007 on my naming choice, logical though it was.

        Comment

        Working...
        X