Unique identifier for observation

Ashish Bandhu

Join Date: Oct 2020

Posts: 40
#1

29 Jun 2022, 01:34

Hi everyone,

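I need help generating a new variable caseid for my data set. The data set has a variable hhid, and some values of hhid are repeated across observations. The new variable caseid should number the distinct values of hhid as 1, 2, 3, and so on, so that repeated observations of the same hhid share the same caseid.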

* Example generated by -dataex-. To install: ssc install dataex
clear
input long hhid
1210103088
1210103088
1210105069
1210105069
1210105070
1210105070
1210109083
1210109083
1210109087
1210109087
1210109090
1210109090
1210109100
1210109100
1210110069
1210110069
1210112079
1210112079
1210203073
1210203076
1210203078
1210205049
1210205049
1210205051
1210205051
1210209070
1210210059
1210210059
1210210065
1210212064
1210212064
1220101001
1220101001
1220101002
1220101002
1220101003
1220101003
1220101007
1220101007
1220102001
1220102001
1220102002
1220102002
1220102004
1220102004
1220103001
1220103001
1220103002
1220103002
1220103003
1220103003
1220104001
1220104001
1220104002
1220104003
1220105001
1220105001
1220105002
1220105002
1220105004
1220106004
1220106005
1220106005
1220106006
1220106006
1220106007
1220106007
1220107001
1220107001
1220107002
1220107003
1220107003
1220107004
1220107004
1220107005
1220107005
1220201019
1220201019
1220201039
1220202014
1220202014
1220204028
1220204031
1220204039
1220204039
1220204041
1220204041
1220206027
1220206031
1220206031
1220206033
1220206033
1220206036
1220208025
1220208025
1220208043
1220212025
1220212025
1220212029
1220212029
end

The expected output would be

hhid caseid
1210103088 1
1210103088 1
1210105069 2
1210105069 2
1210105070 3
1210105070 3
1210109083 4
1210109083 4
1210109087 5
1210109087 5
1210109090 6
1210109090 6
1210203073 7

The last one has only one observation.

Thanks

Ashish

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17853

29 Jun 2022, 01:47

Ashish:
do you mean something along the following lines?

Code:

. gen count=1

. collapse (count) count, by(hhid)

. list in 1/10

     +--------------------+
     |       hhid   count |
     |--------------------|
  1. | 1210103088       2 |
  2. | 1210105069       2 |
  3. | 1210105070       2 |
  4. | 1210109083       2 |
  5. | 1210109087       2 |
     |--------------------|
  6. | 1210109090       2 |
  7. | 1210109100       2 |
  8. | 1210110069       2 |
  9. | 1210112079       2 |
 10. | 1210203073       1 |
     +--------------------+

.

If the above is the way to go, save a copy of your original dataset before invoking -collapse-, because -collapse- replaces the data in memory.
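One way to protect the original data is sketched below, assuming the example data from #1 are in memory. -preserve- and -restore- put the original dataset back after -collapse-. The second part is an alternative that keeps every observation; the variable name n_obs is made up for illustration.

```stata
* Sketch only: assumes the example data from #1 (variable hhid) are in memory

* Option 1: collapse inside preserve/restore so the original data come back
preserve
gen count = 1
collapse (count) count, by(hhid)
list in 1/10
restore

* Option 2: keep all observations and record the group size alongside them
* (n_obs is a hypothetical variable name)
bysort hhid: gen n_obs = _N
list hhid n_obs in 1/10
```

Note that neither version produces the running identifier 1, 2, 3 asked for in #1; both only count how many observations share each hhid.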

Kind regards,
Carlo
(Stata 19.0)


Nick Cox

Join Date: Mar 2014

Posts: 36057
#3

29 Jun 2022, 02:41

See FAQ https://www.stata.com/support/faqs/d...p-identifiers/

Code:

egen caseid = group(hhid)
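
As a minimal sketch of what this does, the example below uses a four-observation subset of the data from #1 and also builds the same identifier by hand. The name caseid2 and the -long- storage type are assumptions added for illustration, not part of the suggestion above.

```stata
* Sketch only: a small subset of the example data from #1
clear
input long hhid
1210103088
1210103088
1210105069
1210203073
end

* group() numbers the distinct values of hhid 1, 2, 3, ... in sorted order
egen long caseid = group(hhid)

* Same identifier by hand: flag the first observation of each hhid,
* then take a running sum of the flags (caseid2 is a hypothetical name)
bysort hhid: gen long caseid2 = (_n == 1)
replace caseid2 = sum(caseid2)

* The two identifiers should agree on every observation
assert caseid == caseid2
list, sepby(hhid)
```

Applied to the full example data in #1, hhid 1210203073 is the tenth distinct value, so it would receive caseid 10 rather than 7.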
Ashish Bandhu

Join Date: Oct 2020

Posts: 40
#4

29 Jun 2022, 03:00

Thank you both