frequency table in panel counting only id and not id-time

Elena DAg

Join Date: Nov 2014

Posts: 4
#1

16 Feb 2017, 08:23

Dear users,

I'm working on a dataset in panel format: id is the individual, time is the year. I need to describe some characteristics of the dataset in terms of the number of individuals; that is, I need frequency tables that count individuals rather than individual-time observations.

For instance:
id time var1
1 2000 100
1 2001 200
1 2002 300

2 2001 200
2 2002 55

3 2000 150
3 2001 60
3 2002 70
3 2003 80

For example, with tab time

I would like the output to be:

2000 2
2001 3
2002 3
2003 1
...

Then, how can I obtain the mean of var1?

Thank you for any help!
elena

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

16 Feb 2017, 08:43

Elena,
welcome to the list.
Do you mean something like this?

Code:

. use "http://www.stata-press.com/data/r14/nlswork.dta", clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. sort year

. tab year

  interview |
       year |      Freq.     Percent        Cum.
------------+-----------------------------------
         68 |      1,375        4.82        4.82
         69 |      1,232        4.32        9.14
         70 |      1,686        5.91       15.05
         71 |      1,851        6.49       21.53
         72 |      1,693        5.93       27.47
         73 |      1,981        6.94       34.41
         75 |      2,141        7.50       41.91
         77 |      2,171        7.61       49.52
         78 |      1,964        6.88       56.40
         80 |      1,847        6.47       62.88
         82 |      2,085        7.31       70.18
         83 |      1,987        6.96       77.15
         85 |      2,085        7.31       84.45
         87 |      2,164        7.58       92.04
         88 |      2,272        7.96      100.00
------------+-----------------------------------
      Total |     28,534      100.00

Kind regards,
Carlo
(Stata 19.0)

Comment

Elena DAg

Join Date: Nov 2014

Posts: 4
#3

16 Feb 2017, 09:30

Dear Carlo,
thanks a lot for your reply. Actually, I need to change the example:
id time grup var1
1 2011 Gruppo D 0
1 2012 Gruppo D 0
1 2013 Gruppo D 0
2 2006 Gruppo C 1
2 2007 Gruppo C 1
2 2008 Gruppo C 1
2 2009 Gruppo C 1
2 2010 Gruppo C 1
2 2011 Gruppo C 1
2 2012 Gruppo C 1
2 2013 Gruppo C 1
3 2006 Gruppo C 1
3 2007 Gruppo C 1
3 2008 Gruppo C 1
3 2009 Gruppo C 1
3 2010 Gruppo C 1
3 2011 Gruppo C 1
3 2012 Gruppo C 1
3 2013 Gruppo C 1
4 2006 Gruppo D 0
4 2007 Gruppo D 0
4 2008 Gruppo D 0
4 2009 Gruppo D 0
4 2010 Gruppo D 0
4 2011 Gruppo D 0
4 2012 Gruppo D 0
4 2013 Gruppo D 0

(27 observations)

var1 is a 0/1 dummy. If I run
tab var1
the table distributes the 27 observations between 0 and 1,
whereas I need it to distribute the ids: of the 4 ids, 2 should fall in 0 and 2 in 1.
I don't know if I am being clear.

Thank you again for the help
elena
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

16 Feb 2017, 11:27

Code:

help tabstat
help tabsum
Comment
Elena DAg

Join Date: Nov 2014

Posts: 4
#5

16 Feb 2017, 11:55

Nick, thanks for your reply. I tried this:
tabstat var1, by(grup) stat(n mean sd ...)
but it counts observations and not ids.
I really need help. Do you have any other suggestions?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#6

16 Feb 2017, 12:01

So you need to pick out one observation per id:

Code:

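* tag exactly one observation per id
egen flag = tag(id)
* summarize over the tagged observations only (the variable is named grup in #3)
tabstat var1 if flag, by(grup) stat(n mean sd)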

Do read -help egen- and find the -tag()- function for more details.

Important: Note also that tabulating a variable by id when there are multiple observations per id only makes sense if the variable is constant within id. If var1 can take on different values in different observations for the same id, then the results will be neither consistent nor sensible. Your example data in the first post of this thread suggests that var1 does in fact change among observations with the same id. So it isn't clear to me how to sensibly do what you appear to want to do. Perhaps I am misunderstanding what you want here.
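
A quick way to check that condition before tabulating is sketched below. It uses the data from the first post and the variable names id and var1 from this thread; the names varies and first are made up for illustration.

```stata
* Data from post #1, where var1 changes within id
clear
input byte id int time int var1
1 2000 100
1 2001 200
1 2002 300
2 2001 200
2 2002 55
3 2000 150
3 2001 60
3 2002 70
3 2003 80
end

* Within each id, sort by var1 and compare the smallest and largest values
bysort id (var1): gen byte varies = var1[1] != var1[_N]

* Keep one observation per id and count ids by whether var1 varies
egen first = tag(id)
tab varies if first
```

Here all three ids are flagged as varying, so a one-observation-per-id table of var1 would depend on which observation happens to be tagged. Note that missing values sort last, so an id with both missing and nonmissing values of var1 is also flagged.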
Comment
Elena DAg

Join Date: Nov 2014

Posts: 4
#7

16 Feb 2017, 16:02

Clyde, thanks! It works if the variable is, as you said, constant within id. However, var1 could also be a variable that varies among observations within the same id. Is there an alternative to tag for this case?
Thanks a lot!
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#8

16 Feb 2017, 16:30

Well, I don't understand what you want to get when the variable varies among observations with the same id. I can't make sense of it. Can you show an example of what you have (with var1 varying) and what the result you want would look like?
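
One possible reading, which is an assumption and not something the thread confirms, is to reduce the varying variable to a single value per id first (here its within-id mean) and then summarize across ids. The sketch below uses the nlswork data from #2, with ln_wage standing in for var1; id_mean and first are illustrative names.

```stata
* Panel data from post #2: idcode is the individual, ln_wage varies within idcode
webuse nlswork, clear

* One summary value per individual, repeated on each of that individual's rows
egen id_mean = mean(ln_wage), by(idcode)

* Tag one observation per individual so each id is counted once
egen first = tag(idcode)

* n is now the number of individuals, not the number of observations
tabstat id_mean if first, stat(n mean sd)
```

Whether the mean, the first value, or some other per-id summary is appropriate depends on the question being asked.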
Comment
Remksax Wemereg

Join Date: Oct 2019

Posts: 3
#9

02 Oct 2019, 10:36

I am in a similar situation dealing with panel data.
The command
by ID: gen nyear = [_N]
creates, for each observation, the number of observations that its ID has in the panel.
However, I am looking for the number of IDs by how many years they appear in the dataset. How can I do that?
Example
id year var1
1 2000 3
2 2000 5
2 2001 6
3 2001 4
3 2002 6
3 2003 7
4 2000 2
4 2001 4
4 2002 7
4 2003 8
5 2001 6
5 2003 8
I am looking for
# of IDs freq(# of years the ID has data).
1 1
2 2
1 4
1 3
Comment
Nick Cox

Join Date: Mar 2014

Posts: 35724
#10

02 Oct 2019, 11:06

Code:

tab ID

Note. Your example is confusing to me. Identifier 1 occurs 1 time; 2 occurs 2 times in your example. But 3 occurs 3 times, and so after that the data you give and their summary part company.

Last edited by Nick Cox; 02 Oct 2019, 11:08.
Comment
Remksax Wemereg

Join Date: Oct 2019

Posts: 3
#11

02 Oct 2019, 12:07

Thank you very much for the reply.
The code you provided gives the frequency of each ID. I am looking for an aggregate by frequency (how many IDs appear only once, how many appear twice, and so forth).

There are 4 periods in the panel data (2000 to 2003). ID 1 occurs only once; ID 2 occurs twice; ID 3 occurs three times; ID 4 occurs four times; and ID 5 occurs twice.
What I want is the number of IDs for each number of occurrences, like the example I tried above.
Number of occurrences    Number of IDs
1                        1 (ID 1)
2                        2 (ID 2 and ID 5)
3                        1 (ID 3)
4                        1 (ID 4)
Total                    5 (total number of IDs)
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30118
#12

02 Oct 2019, 12:19

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id int year byte var1
1 2000 3
2 2000 5
2 2001 6
3 2001 4
3 2002 6
3 2003 7
4 2000 2
4 2001 4
4 2002 7
4 2003 8
5 2001 6
5 2003 8
end

isid id year, sort
by id: gen n_years_this_id = _N
egen flag = tag(id)
tab n_years if flag

will give you what you want.

In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
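
If you would rather not add helper variables to the dataset, the same table can be sketched with -contract-, which reduces the data to one row per id with a count in _freq. This is an alternative to the code above, not part of it; it is shown on the nlswork data from #2, where the identifier is idcode.

```stata
* Panel data from post #2: one row per idcode-year
webuse nlswork, clear

* Work on a temporary copy so the panel itself is left untouched
preserve

* One row per idcode; _freq holds the number of observations for that idcode
contract idcode

* Number of individuals by how many years they appear
tab _freq

restore
```

As with the -tag()- approach, this counts observations per id, so it equals the number of years only if each id appears at most once per year (which -isid- checks in the code above).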
Comment
Remksax Wemereg

Join Date: Oct 2019

Posts: 3
#13

03 Oct 2019, 11:29

Thanks so much. It gave me what I was looking for.
Comment
