Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Summation over particular values of a panel

    Hi,

    I am coding data for a data-set in panel format. I am trying to trim the codes and avoid splitting and merging back the data-sets to make codes neater and more readable.

    An example of a panel in the data is:
    year Precep. Sum before 2003
    2000 20
    2001 25
    2002 15
    2003 10 60
    2004 10 60
    2005 15 60
    I am looking for a simple code that produces the third column which contains the summation of values of the second column over the years 2000 to 2002.

    Thanks in advance,
    Navid

  • #2
    You don't say so, but I'm guessing you actually want this for each of the panel members. So let me assume you have a variable named panelvar that identifies the panels.

    Code:
    by panelvar, sort: egen sum_before_2003 = total(Precep*(inrange(year, 2000, 2002)))
    replace sum_before_2003 = . if year < 2003
    By the way, you cannot have a period as part of a variable name in Stata. Nor can variable names have embedded spaces.

    Comment


    • #3
      Thanks, Clyde!

      Yes, I want to do this for each panel member. And, the panel is unbalanced. So, is there a way to make the lower bound of "inrange" flexible as panels may start at different years (for example, 1999)?

      Thanks,
      Navid
      Last edited by Navid Asgari; 01 Mar 2015, 16:28.

      Comment


      • #4
        Yes, but you need to be more specific about what you want. Do you want the sum of all years from whatever the start is until 2003, or do you want the first three years? If the first three years, are they always three consecutive years, or can there be gaps? If there are gaps, do you want the first three years that happen to be there?

        I'll just show you one example here. If you want the sum from whatever the first year is to 2003, it's this:

        Code:
        by panelvar, sort: egen sum_before_2003 = total(Precep*(year < 2003))
        replace sum_before_2003 = . if year < 2003
        By the way, the above code will calculate sum_before_2003 as zero for any panel whose first year is 2003 or later. If you would prefer that to be a missing value, you can just stick -, miss- on the end of the -egen- command.

        Comment


        • #5
          Thanks,

          I am trying this for average values by replacing "total" with mean. But, I get different results from when I use my older,longer command.

          How does mean function treat missing values. Does it count them in the number of cases?

          Comment


          • #6
            It's not that -mean- treats missing values differently from -total-. The issue is that Precep*(year < 2003) evaluates to zero (not missing) for years before 2003. When you include those in a -total-, they don't contribute, but zeroes do alter a mean. So what you need is:

            Code:
            by panelvar, sort: egen mean_before_2003 = mean(cond(year < 2003, Precep, .))
            replace mean_before_2003 = . if year < 2003

            Comment


            • #7
              For a wider discussion in this territory, see http://www.stata-journal.com/article...article=dm0055

              Comment

              Working...
              X