
  • Need help aggregating quarterly data into annual observations

    Dear all,
    I have a panel data set A that contains firm-year-quarter data. Not all firms have all four quarters in each year. I need to merge it with another data set B, which has one observation per firm and year. Hence, I must first aggregate data set A to yearly data.

    Below are the first few observations in data set A. I have not used Stata in recent years, so I do not remember much about date variables.


    Code:
    clear
    input long ID int date double PR
    1004 168 359.55
    1004 169 0
    1004 170 0
    1004 174 0
    1004 175 0
    1004 176 0
    1004 177 60.17
    end

    The date of the first observation displays as 2002q1; the last is 2004q4.

    Thank you,

    R





  • #2
    Okay, you want to aggregate the data in your first dataset from the quarterly level to the annual level so that you can merge it with the second dataset by firm and year. What you probably want to do here is go through your variables one by one (possibly in a loop) and aggregate each one to the annual level (using the bysort prefix) by applying some summary calculation to it.

    How do you actually want to aggregate each variable in the dataset? There are a number of calculations you can use. The mean comes to mind, but it really depends on the underlying measurement (sometimes a sum is more appropriate, for example). Which calculation is most appropriate for your data? Is it just the PR variable, or are there other variables you wish to aggregate? Can you use the same procedure on all of your variables?
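
    The approach described above might look roughly like the following. This is only a sketch under assumptions: it uses the example data from #1, derives the year with yofd(dofq()), and the new variable names (year, mean_PR, PR_avg, PR_total, PR_n) are invented for illustration. Whether a mean or a sum is the right summary is still a question about the data, not about the code.

    ```stata
    clear
    input long ID int date double PR
    1004 168 359.55
    1004 169 0
    1004 170 0
    1004 174 0
    1004 175 0
    1004 176 0
    1004 177 60.17
    end
    format date %tq

    * calendar year of each quarterly date
    generate int year = yofd(dofq(date))

    * Option 1: keep the quarterly rows and attach the annual mean,
    * looping over the variables to be aggregated (only PR here)
    foreach v of varlist PR {
        bysort ID year: egen double mean_`v' = mean(`v')
    }

    * Option 2: reduce the data to one row per firm and year,
    * which is the shape needed for a merge on ID and year
    collapse (mean) PR_avg=PR (sum) PR_total=PR (count) PR_n=PR, by(ID year)
    list, sepby(ID)
    ```

    Option 2 leaves one observation per ID and year, so the result could then be merged 1:1 on those two variables with the annual dataset.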



    • #3
      From your first thread back in 2014 https://www.statalist.org/forums/for...-two-variables it seems that you are Rochelle Zhang.

      Code:
      help datetime
      leads to the information that yofd(dofq()) pulls years out of Stata quarterly dates.

      Code:
      . di yofd(dofq(168))
      2002
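
      Applied to a whole variable rather than a single value, the same functions can create a year variable. A minimal sketch, assuming the variable names ID, date and PR from the example in #1 (only three of its rows are used here); the new names year and qtr are made up for illustration.

      ```stata
      clear
      input long ID int date double PR
      1004 168 359.55
      1004 169 0
      1004 177 60.17
      end

      * show the integers as quarterly dates, e.g. 168 displays as 2002q1
      format date %tq

      * quarterly date -> daily date -> calendar year
      generate int year = yofd(dofq(date))

      * quarter within the year (1 to 4), handy for checking which quarters are present
      generate byte qtr = quarter(dofq(date))

      list, sepby(year)
      ```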



      • #4
        Hi Nick
        Yes, Rochelle Zhang is me. My apologies for the duplicate login. Thank you.

        Hi Daniel

        You brought up some important points that I shall consider. I do have other variables to aggregate.

        The first step I plan to take is to compute the mean of the PR variable by year.

        My sample has 350,000 observations (firm-year-quarter), with about 1,000 missing values for the PR variable.

        If I use the mean function and, say, quarter 1 is missing but quarters 2, 3 and 4 are non-missing, Stata will take the mean of the three quarters, right?

        -Rochelle



        • #5
          Yes, that's right.
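
          A small check of this behaviour, as a sketch only: the firm IDs and PR values below are made up, and year, PR_mean and n_nonmissing are illustrative names. Firm 1 has quarter 1 missing; firm 2 has all four quarters missing.

          ```stata
          clear
          input long ID int date double PR
          1 168 .
          1 169 2
          1 170 4
          1 171 6
          2 168 .
          2 169 .
          2 170 .
          2 171 .
          end
          format date %tq
          generate int year = yofd(dofq(date))

          * mean() uses only the non-missing quarters within each firm-year
          bysort ID year: egen double PR_mean = mean(PR)

          * number of non-missing quarters behind each mean
          bysort ID year: egen byte n_nonmissing = count(PR)

          * firm 1: (2 + 4 + 6) / 3 = 4
          assert PR_mean == 4 if ID == 1
          * firm 2: no non-missing quarters, so the mean is itself missing
          assert missing(PR_mean) if ID == 2

          list, sepby(ID)
          ```

          Keeping a count such as n_nonmissing alongside the mean makes it easy to flag firm-years that rest on fewer than four quarters.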
