How can I combine/collapse/contract/aggregate/append my data to merge (groups of) observations, with different conditions for different variables?

rezza cav

Join Date: Jul 2022

Posts: 13
#1


08 Jul 2022, 07:50

In my data (the example is taken from another post) I have multiple observations for the same population ("entity") and the same year ("year"), with different values in the other variables.

1. With "egen tag = tag()" and "egen ndistinct = total(tag)" I can find out how many distinct values of "firm" each combination of "entity" and "year" (i.e. each group) has.
(I will have to do this not only for "firm" but also for "value" and other variables.)

2. Next I would like to know how often each value of "ndistinct" occurs across all combinations of "entity" and "year", counting each combination once.
(This is to get an idea of the variation within the different variables and to decide how to combine them in the most sensible way.)

3. Finally I need to end up with exactly one observation per combination of "entity" and "year", which means combining (merge, collapse, append...?) the values of the other variables. If a variable has a most frequent value within the group, choose that one; if it does not, choose the highest or the first value, or in some cases the mean or the sum (to be defined for each variable).

I am really struggling to find a way to do this. I hope my example is clear.

Original data
Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte entity int year str1 firm float value
1 2010 "A" 15
1 2010 "A"  8
1 2010 "B" 12
1 2011 "B" 25
1 2012 "B"  8
2 2010 "A"  7
2 2011 "A"  5
2 2011 "A" 12
2 2011 "C" 13
2 2012 "A" 19
2 2012 "B" 25
2 2011 "B" 14
2 2012 "C" 18
2 2012 "D" 16
end

sort entity year
list, sepby(entity year)

1. Question
Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte entity int year str1 firm float value
1 2010 "A" 15
1 2010 "A"  8
1 2010 "B" 12
1 2011 "B" 25
1 2012 "B"  8
2 2010 "A"  7
2 2011 "A"  5
2 2011 "A" 12
2 2011 "C" 13
2 2012 "A" 19
2 2012 "B" 25
2 2011 "B" 14
2 2012 "C" 18
2 2012 "D" 16
end

egen tag = tag(firm entity year)
egen ndistinct = total(tag), by(entity year)
sort entity year
list, sepby(entity year)

2. Question -> This is the table I would like to obtain
Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte ndistinct freq
1 3
2 1
3 1
4 1
end

sort ndistinct freq
list, sepby(ndistinct freq)
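One possible way to obtain that table is sketched below. It is not taken from the linked answer. It assumes the example data above, and the variable name grouptag is introduced here purely for illustration. The idea is to count each entity-year combination once, by tagging a single observation per combination before tabulating.

```stata
clear
input byte entity int year str1 firm float value
1 2010 "A" 15
1 2010 "A"  8
1 2010 "B" 12
1 2011 "B" 25
1 2012 "B"  8
2 2010 "A"  7
2 2011 "A"  5
2 2011 "A" 12
2 2011 "C" 13
2 2012 "A" 19
2 2012 "B" 25
2 2011 "B" 14
2 2012 "C" 18
2 2012 "D" 16
end

* Flag one observation per distinct firm within each entity-year group
egen byte tag = tag(firm entity year)
* Number of distinct firms in each entity-year group
egen ndistinct = total(tag), by(entity year)
* Flag one observation per entity-year group (grouptag is an illustrative name)
egen byte grouptag = tag(entity year)

* Frequency table in the Results window, one count per group
tabulate ndistinct if grouptag

* The same table as a dataset: keep one row per group, then count rows
keep if grouptag
contract ndistinct, freq(freq)
list, noobs
```

With the example data the six entity-year groups have 2, 1, 1, 1, 3 and 4 distinct firms, so the result matches the table requested above. Note that keep and contract replace the data in memory, so wrap them in preserve and restore if the original observations are still needed.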

3. Question?

Last edited by rezza cav; 08 Jul 2022, 07:54.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

08 Jul 2022, 11:15

This is one of 6 threads on which this question was posted today. It is answered at https://www.statalist.org/forums/for...iven-condition.
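For readers who cannot follow the link, here is a minimal sketch of the third step (one observation per entity and year). It is not the linked answer. It assumes the example data from post #1, and the rules chosen here (most frequent firm, ties broken by the highest value, plus mean and sum of value) are only illustrations of the per-variable choices described in the question. The names firm_mode, value_mean, value_sum and first are hypothetical.

```stata
clear
input byte entity int year str1 firm float value
1 2010 "A" 15
1 2010 "A"  8
1 2010 "B" 12
1 2011 "B" 25
1 2012 "B"  8
2 2010 "A"  7
2 2011 "A"  5
2 2011 "A" 12
2 2011 "C" 13
2 2012 "A" 19
2 2012 "B" 25
2 2011 "B" 14
2 2012 "C" 18
2 2012 "D" 16
end

* Most frequent firm per group; maxmode resolves ties by taking the
* highest value (for strings, the last in sort order)
bysort entity year: egen firm_mode = mode(firm), maxmode

* Numeric summaries per group; pick whichever rule suits the variable
bysort entity year: egen value_mean = mean(value)
bysort entity year: egen value_sum  = total(value)

* Keep a single observation per entity-year group
egen byte first = tag(entity year)
keep if first
keep entity year firm_mode value_mean value_sum

* Verify that entity and year now uniquely identify observations
isid entity year
list, sepby(entity)
```

Using minmode instead of maxmode would pick the lowest value among tied modes. Building the summaries with egen before dropping observations keeps each variable's rule on its own line, which is convenient when different variables need different rules.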