  • Generate grouping variable based on various nominal variables

    Dear community,

    I'm working with household data containing several nominal or ordinal variables, such as household type, income group, and location (see the data example below).
    I now want to sort these households into groups, with one group for each possible combination of variable levels.
    Is there a way to do this automatically, without writing a tedious series of loops?

    Thanks a lot,

    Guest

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(hhtyp2 hheink_gr1 RegioStaR7 bus28)
     3 11 77  3
     2 14 73  3
     2 10 77  3
     1  5 77 95
     3  4 73  3
     3  5 71  2
     3  9 72 95
     4  7 71  3
     4  5 77  3
     4  7 74  1
     4  6 72  1
     3  9 72  3
     2  4 72  1
     3  5 76  3
     2  7 72  1
     2  9 73  3
     3  7 75  1
     4  5 74  3
     3  5 76  1
    95  5 76 95
     3  3 73  1
     2  7 76 95
     3  3 76 95
     2  9 74  5
     2  5 71 95
     3  5 77  2
     3  9 73  1
     2  7 77 95
     3  6 74  3
     3  7 72  2
     2  7 73 95
     3  6 72  1
     3  6 77  2
     4  4 71  1
     3  8 73  1
     2  9 76 95
     3  9 73  1
     3  7 71 95
     3 11 73  1
     4  5 73  2
     3  6 73  4
     3  7 73  1
     4  9 76  3
     3 13 71 95
     3  5 77  1
     3  3 76 95
     3  4 77  6
     3  7 76  1
     3 11 77 95
     4  4 72  2
     4  7 72 95
     4  4 72  1
     3 15 76  3
     2 11 74 95
     4  5 73  2
     3  5 77  4
     4  4 73  2
     4  5 77 95
     4  4 77  4
     2  5 76  1
     2  9 73  1
     4  5 73  1
     2  8 73 95
     4  8 73  1
     4  3 71  2
     4  4 77  2
     3 15 72  2
     3  3 77 95
     4  5 73 95
     2 11 73  1
     3 11 73 95
     3  9 77  5
     3  6 73 95
    end

    Last edited by sladmin; 25 May 2022, 12:09. Reason: anonymize original poster

  • #2
    Code:
    egen group = group(hhtyp2 hheink_gr1 RegioStaR7)
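
    A slightly extended sketch of the same idea, in case it helps. Note that group() numbers only the combinations that actually occur in the data, not every conceivable one. The name hhgroup is just an illustrative choice; the label option labels each group with the value labels of the source variables if they exist, and with the raw values otherwise.

    Code:
    * one integer id per observed combination, labelled with the underlying values
    egen hhgroup = group(hhtyp2 hheink_gr1 RegioStaR7), label
    * ids run from 1 upwards, so the maximum is the number of distinct combinations
    summarize hhgroup, meanonly
    display r(max)
    * number of households in each group
    tabulate hhgroup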



    • #3
      Something like:

      Code:
      egen test = group(var1 var2 var3)
      tabstat var1 var2 var3, by(test)
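
      Applied to the variable names from the data example in #1, this might look as follows. It is only a sketch: the names test and first are arbitrary, and the min/max comparison is one possible way to confirm that each variable is constant within a group.

      Code:
      egen test = group(hhtyp2 hheink_gr1 RegioStaR7)
      * each variable should be constant within a group, so min and max should agree
      tabstat hhtyp2 hheink_gr1 RegioStaR7, by(test) statistics(min max) nototal
      * flag one observation per group and list the combination behind each id
      egen byte first = tag(test)
      sort test
      list test hhtyp2 hheink_gr1 RegioStaR7 if first, noobs
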
      Best wishes




      • #4
        Nice, thanks a lot to you two!



        • #5
          See also https://www.stata-journal.com/articl...article=dm0034 for more detailed discussion.



          • #6
            Thanks a lot!
