identifying a common observation across the groups

kamalesh pahurkar

Join Date: Nov 2022

Posts: 32
#1

08 Jul 2023, 03:32

How can I identify the observations that are common across groups? I have group codes, and under each group code there are different codes. I want to identify the codes that are common across the groups.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17733
#2

08 Jul 2023, 08:50

Kamalesh:
If I got you right, you may want to consider something along the following lines:

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. egen wanted=count(year) if year==70

Last edited by Carlo Lazzaro; 08 Jul 2023, 08:55.

Kind regards,
Carlo
(Stata 19.0)
Eric Makela

Join Date: Aug 2022

Posts: 45
#3

08 Jul 2023, 15:33

Hi Kamalesh,

This works for creating categorical variables along the lines you describe, even if it is not terribly elegant. The variable 'group3' identifies observations that are common across the first two groups. Note that there are none. I don't wish to speak for Carlo, but I expect he would suggest that a more detailed description of your problem is needed to get a useful answer on Statalist and solve your actual problem.

Code:

sysuse auto
gen wtclass=(weight<2700)+2*(weight>=2700 & weight<3500)+3*(weight>=3500)
gen mpgclass=(mpg<20)+2*(mpg>=20 & mpg<28)+3*(mpg>=28)
egen group1=group(wtclass foreign)
egen group2=group(make mpgclass)
egen group3=group(group1 group2)

Last edited by Eric Makela; 08 Jul 2023, 15:35. Reason: Definitions
kamalesh pahurkar

Join Date: Nov 2022

Posts: 32
#4

10 Jul 2023, 03:34

Hi, I have the following kind of data structure. I just want to keep or identify the product codes that are common across all the company codes.

company code    product code
11              115
11              117
11              119
11              112
12              117
12              115
12              119
16              108
16              117
16              119
16              111
16              115
16              120
14              222
14              117
14              115
14              119

Hemanshu Kumar

Join Date: Mar 2015

Posts: 1478
#5

10 Jul 2023, 03:59

Perhaps this?

Code:

clear
input int(company_code product_code)
11    115
11    117
11    119
11    112
12    117
12    115
12    119
16    108
16    117
16    119
16    111
16    115
16    120
14    222
14    117
14    115
14    119
end

egen int num_companies = count(company_code), by(product_code)
levelsof company_code, local(companies)
local tot_companies: word count `companies'

gen byte common_to_all = (num_companies == `tot_companies')

Then you can do things like

Code:

. tab product_code if common_to_all

product_cod |
          e |      Freq.     Percent        Cum.
------------+-----------------------------------
        115 |          4       33.33       33.33
        117 |          4       33.33       66.67
        119 |          4       33.33      100.00
------------+-----------------------------------
      Total |         12      100.00

This code assumes that each product_code appears only once for a given company_code. This is the case in your example, but if it is not true in the dataset, let me know and we'll find an alternative solution.
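
If that assumption does not hold, one possible workaround (a sketch that is not from the thread) is to tag a single observation per company-product pair before counting, so that repeated pairs are not counted twice. The variable names `pair_tag`, `comp_tag`, `n_comp`, `tot_comp` and `common_all`, and the small dataset with a deliberately duplicated row, are made up for illustration.

```stata
clear
input int(company_code product_code)
11 115
11 115
11 117
12 115
12 117
12 119
end

* Mark exactly one observation per (company, product) pair
egen byte pair_tag = tag(company_code product_code)

* Number of distinct companies in which each product appears
egen int n_comp = total(pair_tag), by(product_code)

* Total number of distinct companies in the data
egen byte comp_tag = tag(company_code)
egen int tot_comp = total(comp_tag)

* 1 if the product appears in every company, 0 otherwise
gen byte common_all = (n_comp == tot_comp)

tab product_code if common_all
```

Because `egen`'s `tag()` flags only one observation per distinct combination, summing the tags counts distinct companies rather than rows, which is why the duplicated 11/115 row does not inflate the count.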

Last edited by Hemanshu Kumar; 10 Jul 2023, 04:03.


kamalesh pahurkar

Join Date: Nov 2022

Posts: 32
#6

10 Jul 2023, 05:25

Hi Hemanshu, thanks for your response. It works. Can you tell me what would change in the syntax if I also had a time variable in the above dataset?
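
The thread does not contain a reply to this question. One possible reading, sketched here under assumptions, is that a product should count as common when it appears in every company observed in the same period. The time variable `year`, the example values, and the names `pair_tag`, `comp_tag`, `tot_companies` and `common_in_year` are hypothetical; `num_companies` is reused from the code in #5.

```stata
clear
input int(company_code product_code year)
11 115 2020
11 117 2020
12 115 2020
11 115 2021
12 115 2021
12 117 2021
end

* Distinct companies carrying each product within each year
egen byte pair_tag = tag(company_code product_code year)
egen int num_companies = total(pair_tag), by(product_code year)

* Distinct companies observed in each year
egen byte comp_tag = tag(company_code year)
egen int tot_companies = total(comp_tag), by(year)

* 1 if the product appears in every company present in that year
gen byte common_in_year = (num_companies == tot_companies)

sort year product_code company_code
list if common_in_year, sepby(year)
```

If "common" should instead mean present in every company in every period, the comparison would need to be made across the whole panel rather than within each year, so the grouping above would have to change accordingly.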