  • generate individual identifier with double variables

    Dear all,
    I have two identifier variables: hrhhid identifies households and hrhhid2 identifies individuals within a household. Combining them should give an individual identifier. I tried to tostring hrhhid, add the two together, and group on the result, but this does not work. Any hints?
    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double hrhhid str5 hrhhid2
    100001005030231 "78001"
    100001005030231 "78001"
    100007103071001 "76001"
    100007103071001 "76001"
    100010166540850 "80001"
    100011405001231 "81001"
    100031900039771 "78001"
    100031900039771 "78001"
    100031900039771 "78001"
    100031900039771 "78001"
     10010483970999 "78001"
     10013460980999 "81001"
     10013460980999 "81001"
     10013460980999 "81001"
     10013460980999 "81001"
     10013460980999 "81001"
    100194004012700 "78001"
    100194004012700 "78001"
    100194004012700 "78001"
    100206982337993 "80001"
    100207936359659 "80001"
      1003363106467 "81001"
      1003363106467 "81001"
      1003363106467 "81001"
      1003363106467 "81001"
    100349520129103 "81001"
    100349520129103 "81001"
    100349520129103 "81001"
    100349520129103 "81001"
    100370720770402 "80001"
    100370720770402 "80001"
    100409977507504 "77001"
    100409977507504 "77001"
    100409977507504 "77001"
    100409973562763 "78001"
    100409973562763 "78001"
    100500003091916 "78001"
    100500003091916 "78001"
    100500003091916 "78001"
    100500003091916 "78001"
    100500003091916 "78001"
    100500003091916 "78001"
    100500003091916 "78001"
    100500003091916 "78001"
    100506982329598 "78001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600005887119 "79001"
    100600096585519 "80001"
    100609464814174 "77001"
    100609464814174 "77001"
    100609464814174 "77001"
    100609965629688 "79001"
    100609905849089 "79001"
    100609905849089 "79001"
    100609905849089 "79001"
    100609905849089 "79001"
    100609905849089 "79001"
    100609905849089 "79001"
    100609965629688 "79001"
    100609905849089 "79001"
    100609905849089 "79001"
    100609965629688 "79001"
    100609965629688 "79001"
    100609976208236 "79002"
    100609963211985 "80001"
    100609916635560 "80001"
    100609949970289 "80001"
    100609974034637 "80001"
    100609949970289 "80001"
    100609963211985 "80001"
    100609974034637 "80001"
    100609963211985 "80001"
    100609974197484 "80001"
    100609974197484 "80001"
    100609974034637 "80001"
    100609963211985 "80001"
    100609957734080 "80001"
    100609957734080 "80001"
    100609916635560 "80001"
    100609916635560 "80001"
    100609916635560 "80001"
    100609912942030 "80001"
    100609949970289 "80001"
    100609916635560 "80001"
    100609974197484 "80001"
    100609974034637 "80001"
    100609916635560 "80001"
    100609912942030 "80001"
    100609974034637 "80001"
    100609974034637 "80001"
    end

  • #2
    See https://journals.sagepub.com/doi/pdf...867X0800700407 for a short paper on precisely this problem.



    • #3
      I think there is a precision problem, because the variable I obtained takes the same value for observations that clearly differ on one of the two identifiers.
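A possible cause, offered only as an assumption since the exact code was not posted: a 15-digit identifier loses digits if it is held as a float, or if it is converted to string with a format too narrow for all its digits. The sketch below first shows the effect of float storage on one value from the data example, then builds a combined string identifier with an explicit format. The names hh_s, id_s and id are hypothetical.

```stata
* compare one 15-digit identifier at float and at double precision
display %18.0f float(100001005030231)
display %18.0f 100001005030231

* convert with an explicit format wide enough for every digit
generate str15 hh_s = string(hrhhid, "%15.0f")

* a separator keeps "12" + "345" distinct from "123" + "45"
generate id_s = hh_s + "_" + hrhhid2

* number the distinct combined strings 1, 2, 3, ...
egen long id = group(id_s)
```

Grouping on the two original variables directly avoids the string conversion altogether; the string route is shown only because it is the one described in #1.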



      • #4
        #3 isn't substantiated by any data example or a report of the exact code you used. It's possible that if you have a very large number of combinations of your different identifiers, then you need to consider specifying long or double as a variable type.

        At the same time I note that your first identifier has up to 15 digits while we all should be confident that you do not have a quadrillion households, so you don't need all those details. So, a multistep solution such as

        Code:
        egen long work = group(hrhhid)
        egen long wanted = group(work hrhhid2) 
        compress wanted
        should deal with precision problems. We can't tell if either of your identifiers is informative in any sense, but if so that is a separate question.
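One way to check the result can be sketched as follows, assuming wanted has already been created as above and that neither identifier has missing values (true of the data example, not verified for the full dataset). The names first and npairs are hypothetical.

```stata
* every (hrhhid, hrhhid2) pair should carry a single value of wanted
bysort hrhhid hrhhid2 (wanted): assert wanted[1] == wanted[_N]

* tag one observation per distinct pair and count the pairs
egen byte first = tag(hrhhid hrhhid2)
count if first
local npairs = r(N)

* group() numbers groups 1, 2, ..., so the maximum should equal the pair count
summarize wanted, meanonly
assert r(max) == `npairs'
```

If either assert fails, distinct pairs have been merged or one pair has been split, which would point back to a precision problem.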
