
  • Calculating averages using -egen- and -collapse-

    Dear Statalisters,
    I am using Stata for Windows 12.1. I have created new1 and new2 in the attached dataset using the following commands:

    Code:
    bysort ccode isic: egen new1 = sum(y)
    /* sum of exports by ccode for each isic category across years */
    Code:
    bysort ccode isic year: egen new2 = sum(y)
    /* sum of exports by ccode for each isic category for different years */

    I would like to create the average values of y for each isic category over the years, as well as the average of x1. However, the following code:
    Code:
    collapse (mean) new2 x1, by(ccode isic)
    calculates the average value of exports for each ccode and isic category over the years (which is exactly what I want), but it does the same for the other variable, x1. I would instead like the average of x1 over both years and isic categories, i.e. the value of x1 should not depend on isic. One workaround is to run -collapse- on new2 alone and then merge the result with another dataset containing the average of x1. Interestingly, the average of x1 for ccode == "ARG" should be 2.319444 = (2.0416667 + 2.4166667 + 2.5)/3, but the following code:
    Code:
    collapse (mean) x1, by(ccode)
    gives the average for x1 as 2.3385417. Any observations and help on the above questions will be greatly appreciated.

    Best regards,
    Suryadipta.
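
    The merge workaround described above could be sketched roughly as follows. This is only a sketch under assumptions: the tempfile name x1means is hypothetical, and the first-stage grouping for x1 is taken to be year, following the reply below; swap in isic if the three values in the question are isic-level means.

    Code:
    * x1: one row per ccode holding the unweighted mean of the year-level means
    preserve
    collapse (mean) x1, by(ccode year)    // stage 1: one mean per ccode-year
    collapse (mean) x1, by(ccode)         // stage 2: each year counts once
    tempfile x1means                      // hypothetical name
    save `x1means'
    restore

    * y: average of new2 within each ccode-isic cell, then attach x1
    collapse (mean) new2, by(ccode isic)
    merge m:1 ccode using `x1means', nogenerate

    Collapsing in two stages is what makes the x1 figure a mean of group means rather than a mean over all observations.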

  • #2
    Personally, I prefer the -egen- way:

    Code:
    bys year ccode: egen mean_x1=mean(x1) // mean of x1 within each ccode-year
    
    bys ccode (year): egen grand_mean_x1=mean(mean_x1) if year[_n-1]!=year[_n] // average the year means over the first observation of each year only, so each year counts once
    
    bys ccode: egen max_mean_x1=max(grand_mean_x1) // copy the result to every observation of the ccode

    Good luck
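
    A possible variant of the same idea, offered as a sketch rather than a tested replacement, uses egen's tag() function instead of explicit subscripts in the -if- qualifier, so the result does not depend on how the data happen to be sorted. The variable names year_mean, first and x1_bar are hypothetical.

    Code:
    bysort ccode year: egen year_mean = mean(x1)                  // mean of x1 within each ccode-year
    egen first = tag(ccode year)                                  // 1 for one observation per ccode-year, 0 otherwise
    bysort ccode: egen x1_bar = mean(cond(first, year_mean, .))   // each year enters the average once

    Because the untagged observations are set to missing inside mean() rather than excluded with -if-, x1_bar is filled in for every observation of the ccode in one step.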



    • #3
      Dear Oded,
      This is exactly what I wanted! I did try the first line of your code, but I was then getting the wrong average with
      Code:
      bysort ccode: egen newvar = mean(x1)
      It seems that the -if- condition you used with -egen- is the key! I believe my code was just adding up all the values of x1 and dividing by the number of observations, instead of averaging the year-level means, and so gave the incorrect average. Thank you so much for the help!

      Best regards,
      Suryadipta.
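
      The difference between the two averages can be seen on a tiny example. The data below are made up purely for illustration (they are not from the attached dataset), and the names pooled and mean_of_means are hypothetical.

      Code:
      clear
      input str3 ccode int year x1
      "AAA" 2000 1
      "AAA" 2000 3
      "AAA" 2001 4
      end

      * pooled mean: (1 + 3 + 4)/3, every observation counts once
      bysort ccode: egen pooled = mean(x1)

      * mean of year means: (2 + 4)/2, every year counts once
      collapse (mean) x1 pooled, by(ccode year)
      collapse (mean) mean_of_means = x1 pooled, by(ccode)
      list

      The two numbers agree only when every year contributes the same number of observations, which would explain a gap like the one between 2.3385417 and 2.319444 if the ARG groups are of unequal size.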

