
  • Aggregating from micro level data to state level data

    Hi all,

    I have individual-level micro data that I want to aggregate to the state level, by year. Here is an example to make this clear. Assume the data are structured as follows.

    id  state  year  aggregated variable
    a   A      1     A1
    b   A      1     A1
    c   A      1     A1
    a   B      1     B1
    b   B      1     B1
    c   B      1     B1
    a   A      2     A2
    b   A      2     A2
    c   A      2     A2

    As you can see, I have already created the aggregated variable from the individual-level data; it takes the same value for every observation in the same state and year. Because I want to do research at the aggregate level by combining this variable with other state-level data, I would prefer a table that shows each state, year, and aggregated value only once, something like the following:

    state  year  aggregated variable
    A      1     A1
    B      1     B1
    A      2     A2

    How can someone achieve this?

    I would truly appreciate any thoughts about this problem.

    Thank you in advance.

  • #2

    In actual practice, before reducing to a single observation, I would verify that the aggregated variable really is constant within state-year combinations, to avoid losing data or using erroneous values. But the actual data reduction is very simple.
    Code:
    bysort state year (aggregated_variable): assert aggregated_variable[1] == aggregated_variable[_N]
    by state year: keep if _n == 1
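
    As a minimal sketch of how those two lines behave, the toy data from #1 can be typed in directly. The values below are the placeholders from the example, not real data, and the final drop and list steps are additions for illustration only.

    ```stata
    * Toy data mirroring the example in #1 (placeholder values only)
    clear
    input str1 id str1 state year str2 aggregated_variable
    "a" "A" 1 "A1"
    "b" "A" 1 "A1"
    "c" "A" 1 "A1"
    "a" "B" 1 "B1"
    "b" "B" 1 "B1"
    "c" "B" 1 "B1"
    "a" "A" 2 "A2"
    "b" "A" 2 "A2"
    "c" "A" 2 "A2"
    end

    * Sort by the aggregate within each state-year group; if the first and
    * last values match, the variable is constant within that group
    bysort state year (aggregated_variable): assert aggregated_variable[1] == aggregated_variable[_N]

    * Keep one observation per state-year; id no longer carries meaning
    by state year: keep if _n == 1
    drop id
    list state year aggregated_variable, noobs
    ```

    With these data the assert passes silently and three observations remain, one per state-year combination. If the variable were not constant within some group, the assert would stop with an error before any data were dropped.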



    • #3
      Originally posted by Clyde Schechter
      In actual practice, before reducing to a single observation, I would verify that the aggregated variable really is constant within state-year combinations, to avoid losing data or using erroneous values. But the actual data reduction is very simple.
      Code:
      bysort state year (aggregated_variable): assert aggregated_variable[1] == aggregated_variable[_N]
      by state year: keep if _n == 1

      Thanks Clyde!

      I used `bysort` to create the aggregated variable, but of course your suggestion is far better for avoiding mistakes.
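
      For anyone starting from the raw micro data instead, a rough sketch of an alternative is to let `collapse` build the state-year table in one step. The variable `income`, its values, and the choice of a mean are assumptions made up for illustration; the thread does not say which statistic the aggregated variable holds.

      ```stata
      * Hypothetical micro data: income is an invented variable
      clear
      input str1 id str1 state year income
      "a" "A" 1 10
      "b" "A" 1 20
      "c" "A" 1 30
      "a" "B" 1 40
      "b" "B" 1 50
      "c" "B" 1 60
      end

      * Replace the data in memory with one row per state-year, holding
      * the group mean (the mean is an assumed choice of aggregate)
      collapse (mean) mean_income = income, by(state year)
      list, noobs
      ```

      Note that `collapse` discards the individual-level observations, so `preserve` or a saved copy is needed if the micro data are still required afterwards.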

      P
