  • Scalar in a foreach loop

    I need to obtain an average of averages from a series of variables. This is similar to the mean of a sampling distribution. So I use a foreach loop to summarize each variable in varlist. I then use the scalar command to store the mean of each variable. Then I want to summarize the stored means to get an overall mean. Here is what I am trying:
    Code:
    foreach x of varlist var1 var2 var3 var4 var5 {
    display "***********************************"
    display "`x'"
    display "***********************************"
    sum `x'
    scalar mean = r(mean)
    list mean
    di (mean1 mean2 mean3 mean4 mean5)/5
    }
    I do not think this works, because each time the loop runs the scalar is replaced. Is there a way to automatically rename the scalars? Is there a better way than what I am currently doing?

  • #2
    There are different ways to do this, depending upon your ultimate goal. Here's one.
    Code:
    sysuse auto
    local varlist mpg headroom trunk weight
    mata:
    A = .
    st_view(A, ., st_local("varlist"))
    meanofmeans = rowsum(mean(A))/cols(A)
    meanofmeans
    end
    exit



    • #3
      Yes, you can do that by using tempnames for the scalars. In fact, you should almost always give scalars tempnames rather than permanent names because the scalar namespace and the variable namespace are one and the same. If you inadvertently name a scalar the same as an existing variable you are setting yourself up for a name clash and all sorts of really difficult to find and fix bugs in your code. So here's how that would work:

      Code:
      clear*
      sysuse auto
      
      foreach x of varlist price mpg headroom displacement trunk {
          display `"`x'"'
          tempname `x'
          summ `x', meanonly
          scalar ``x'' = r(mean)
      }
      
      scalar sum = 0
      scalar count = 0
      foreach x of varlist price mpg headroom displacement trunk {
          scalar sum = sum + ``x''
          scalar count = count + 1
      }
      display sum/count
      So each variable generates a scalar with a new name, and then you can later add them up and divide by how many there are to get the mean.

      That said, I wouldn't do it this way. Scalars are just that, scalars. But a series of scalars that you want to operate on as a unit is a vector, so you would be better off using a vector-based structure to hold your interim results. The vector-based structure that is ubiquitous in Stata is the variable. So I would do this task by saving the results as a variable in a -postfile-, and then reading in the -postfile- and getting the mean the easy way, using -summarize-. That looks like this:

      Code:
      clear*
      sysuse auto
      
      tempfile holding
      capture postutil clear
      postfile handle str32 varname float varmean using `holding'
      
      foreach x of varlist price mpg headroom displacement trunk {
          display `"`x'"'
          summ `x', meanonly
          post handle (`"`x'"') (`r(mean)')
      }
      postclose handle
      
      use `holding'
      summ varmean, meanonly
      display `r(mean)'
      Added: Crossed with #2 where the estimable Joseph Coveney shows a matrix-based approach using Mata. If you are comfortable with Mata, I think his solution is better.
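
      To illustrate the name clash described at the top of this post, here is a minimal sketch using the auto dataset. The scalar name price is chosen deliberately to collide with the variable price, and the tempname p is a hypothetical name used only for illustration.

      ```stata
      sysuse auto, clear

      * A permanent scalar that shares its name with the variable price
      scalar price = 5

      * In an expression the variable takes precedence, so this shows
      * the value of the variable price in observation 1, not 5
      display price

      * scalar() forces the scalar interpretation, so this shows 5
      display scalar(price)

      * A temporary name avoids the ambiguity altogether
      tempname p
      scalar `p' = 5
      display `p'
      ```

      Because tempnames only exist for the duration of the do-file or program that creates them, this sketch is meant to be run as a whole rather than line by line.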



      • #4
        Thank you for these quite useful solutions. I have tried Coveney's Mata approach, but I would like to know if there is a way to condition the matrix operation with an "if" option. I want to compute the meanofmeans only on a subsample identified by a dummy variable. I would think it is easier to use the "if" command with Clyde's answers too.



        • #5
          Originally posted by George Kariuki
          , , , I would wish to know if there is a way to condition the matrix operation with an "if" option. I would want to only do the meanofmeans on a sub sample identified using a dummy variable.
          Yes, there is, and it is easy. There is an optional fourth argument to st_view() where you can provide the name of such a variable. For further information, type
          Code:
          help st_view()
          at Stata's command line.



          • #6
            Let's note first for the innocent that the line

            Code:
            di (mean1 mean2 mean3 mean4 mean5)/5
            is sorely lacking in addition signs to do what is wanted, although George's code is schematic, as none of the so-called means were created. Even though
            Code:
            . di %12.0f (12 34 56 78 90)/5
               246913578
            is legal, it's not what you want. More crucially, there is a tacit assumption that missings can just be ignored, so the mean of 5 means doesn't weight for different numbers of non-missing values. George is evidently interested in using the if qualifier, which is neither a command (the if command is different and, I guess, not what is wanted) nor an option. Consider this
            Code:
            egen rowmean = rowmean(var1 var2 var3 var4 var5)
            su rowmean
            where any if qualifier you like can be added on the first line. Again, what happens with missing values needs care and attention. Joseph's and Clyde's approaches are both good and can easily be extended to subsets too.
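
            The point about missing values can be made concrete with a minimal sketch on a made-up three-observation dataset. The data, the variable names var1 and var2, and the tempnames m1 and m2 are assumptions for illustration only. With missing values present, the mean of the variable means and the mean of the row means no longer agree.

            ```stata
            clear
            input var1 var2
            1 10
            2 .
            3 .
            end

            * Mean of the variable means: (2 + 10)/2 = 6
            tempname m1 m2
            summarize var1, meanonly
            scalar `m1' = r(mean)
            summarize var2, meanonly
            scalar `m2' = r(mean)
            display (`m1' + `m2')/2

            * Mean of the row means: rows give 5.5, 2 and 3, so the mean is 3.5
            egen rowmean = rowmean(var1 var2)
            summarize rowmean, meanonly
            display r(mean)
            ```

            Which of the two numbers is wanted depends on how the missing values should be weighted, so it is worth deciding that before choosing a method.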



            • #7
              Here's how to extend Joseph's code to a subset defined by an indicator (with an independent check):


              Code:
              sysuse auto
              local varlist mpg headroom trunk weight
              local touse "foreign"
              
              mata:
              A = .
              st_view(A, ., st_local("varlist"), st_local("touse"))
              meanofmeans = rowsum(mean(A))/cols(A)
              meanofmeans
              end
              
              * check follows
              
              scalar sum = 0
              
              foreach v in mpg headroom trunk weight {
                  su `v' if foreign, meanonly
                  scalar sum = sum + r(mean)
              }
              
              di sum/4
              Last edited by Nick Cox; 09 Oct 2017, 04:37.
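
              Clyde's -postfile- approach from #3 can be restricted to the same subset by adding an if qualifier to -summarize-. The following is a sketch rather than anyone's posted code. It assumes the auto dataset and the indicator foreign, stores the means as double instead of float to limit rounding, and uses a tempname for the postfile handle.

              ```stata
              sysuse auto, clear

              tempfile holding
              tempname handle
              postfile `handle' str32 varname double varmean using `holding'

              * Post one mean per variable, computed on the foreign == 1 subset only
              foreach x of varlist mpg headroom trunk weight {
                  summarize `x' if foreign, meanonly
                  post `handle' ("`x'") (r(mean))
              }
              postclose `handle'

              * The posted means are now a variable, so -summarize- gives their mean
              use `holding', clear
              summarize varmean, meanonly
              display r(mean)
              ```

              The result should agree with the Mata code and the check in this post, up to rounding.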
