Frequency in a variable

Erik van der Marel

Join Date: Mar 2015

Posts: 6
#1

15 Mar 2015, 09:16

Hi all,
This seems to me a very easy thing to calculate, but I haven't been able to figure it out for hours now. I want to have in a separate variable the count of the number of observations with the same value in a row of another variable. Preferably this would be combined with a "by" prefix. The command "tabulate" gives a table, and the command "tablepc" gives it in relative terms, but I just want the counts.

Isn't there a simple egen function or formula for this thing?

Many thanks!

Best wishes,
Erik van der Marel
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17854
#2

15 Mar 2015, 09:21

Erik:
did you take a look at -help egen-?

Kind regards,
Carlo
(Stata 19.0)
Erik van der Marel

Join Date: Mar 2015

Posts: 6
#3

15 Mar 2015, 09:41

Yep, I did, and couldn't figure it out. I am an experienced user of STATA, but this is a tediously simple issue that I just can't resolve. I know it sounds too ludicrous for words. If you know the simple command, please share it; you would make my day!
Clyde Schechter

Join Date: Apr 2014

Posts: 30357
#4

15 Mar 2015, 10:20

I don't understand your explanation. Perhaps you could show us some sample data (in a code block, of course!) along with hand-calculated results you would like to get from it?
ben earnhart

Join Date: May 2014

Posts: 1027
#5

15 Mar 2015, 12:01

Does something like this get you on the right track? It seems like there ought to be a more graceful way, but I think it does what you want. Obviously, if you only want one value, you can take it out of the loop.

Code:

*=========make fake data
clear
set obs 1000
set seed 1971
gen var1=int(runiform()*8)+1
*=======get counts of each value
forvalues i=1/8 {
    gen var1_`i'=1 if var1==`i'
    egen var1Count_`i'=count(var1_`i')
    drop var1_`i'
}
sum
Nick Cox

Join Date: Mar 2014

Posts: 36058
#6

15 Mar 2015, 17:23

We are all still guessing, but I will start at a simpler end:

Code:

bysort x : gen count = _N
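
A minimal sketch of the egen route asked about in the first post, placed next to the _N version for comparison. It assumes x (the placeholder name used above) is a numeric variable; count_N and count_egen are illustrative names only. The two results can differ where x is missing, because egen's count() counts nonmissing values while _N counts all observations in the by-group.

```stata
* x is the placeholder grouping variable from the line above (assumed numeric).
* count_N and count_egen are illustrative variable names.

* Number of observations in each group of x
bysort x: gen count_N = _N

* Number of nonmissing values of x in each group of x
bysort x: egen count_egen = count(x)
```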

Erik van der Marel

Join Date: Mar 2015
Posts: 6
#7

16 Mar 2015, 06:15

For both Clyde and Nick, here is what I am looking for. Below is a spreadsheet table taken from my STATA file. What I need is an easy way to compute the variable tab. This variable counts the number of times the same value of com_4 occurs within each value of ind. For instance, for ind 111110 there are various values of com_4. The value 3334 occurs only once for ind 111110, whereas the value 3339 occurs 4 times, and the variable tab in this example reflects that. I am looking for an easy way to compute this variable. I also need separate versions of it: one for when _fillin is 0 and one for when _fillin is 1. Please let me know if you have any clue!

com_4	ind	use	_fillin	tab
3334	111110	0.1	0	1
3335	111110	.	1	1
3336	111110	.	1	1
3339	111110	4.6	0	4
3339	111110	5.7	0	4
3339	111110	4.3	0	4
3339	111110	0.7	0	4
3341	111110	0.1	0	1
3342	111110	0	0	2
3342	111110	0	0	2
3343	111110	.	1	1
3344	111110	.	1	1
3345	111110	0.2	0	1
3346	111110	.	1	1

Ben, thanks a lot! I will go ahead and have a careful look at it. But true, an easy egen code or something would be great indeed!

Thanks a bunch guys!


Nick Cox

Join Date: Mar 2014

Posts: 36058
#8

16 Mar 2015, 06:32

This seems to me to be essentially what I suggested:

Code:

bysort com_4: gen tab = _N

The implication is that you want to do this separately by your ind variable, here a constant, but it's still counting subsets:

Code:

bysort ind com_4: gen tab = _N

and doing it separately by _fillin is then just the same idea. For a basic tutorial on by: see e.g. http://www.stata-journal.com/article.html?article=pr0004 and for the spelling "Stata" please see FAQ Advice here, Section 18.
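
As a hedged sketch of "the same idea" applied to _fillin, here are two possible readings of the request. The variable names com_4, ind and _fillin come from the table posted above; tab_fillin, tab0 and tab1 are illustrative names, and which reading is wanted is an assumption.

```stata
* com_4, ind and _fillin are the variables from the table posted above.
* tab_fillin, tab0 and tab1 are illustrative names.

* Reading 1: one variable, counting within each ind/com_4/_fillin combination
bysort ind com_4 _fillin: gen tab_fillin = _N

* Reading 2: two variables, counting the _fillin == 0 rows and the
* _fillin == 1 rows within each ind/com_4 group
bysort ind com_4: egen tab0 = total(_fillin == 0)
bysort ind com_4: egen tab1 = total(_fillin == 1)
```

In the sample data each com_4 value within ind 111110 has a single _fillin value, so under either reading the count for a row's own _fillin value matches the hand-calculated tab.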
Erik van der Marel

Join Date: Mar 2015

Posts: 6
#9

16 Mar 2015, 07:48

Thanks, Nick. I can't believe that this is so simple in the end. You should know that I am quite an experienced user of Stata (yes, I got it, thanks ;-)), but you might not believe it after this query of mine. Anyway, many thanks!
Nick Cox

Join Date: Mar 2014

Posts: 36058
#10

16 Mar 2015, 08:30

I think it can take quite a lot of practice to see that using by: is the answer and to think through to the details. Same goes for many areas of Stata which I should be much more fluent in.