  • identifying a common observation across the groups

    How can I identify the observations that are common across groups? I have group codes, and under each group code there are different codes. I want to identify the codes that are common to all the groups.

  • #2
    Kamalesh:
    If I got you right, you may want to consider something along the following lines:
    Code:
    . use "https://www.stata-press.com/data/r17/nlswork.dta"
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    . egen wanted=count( year) if year==70
    Last edited by Carlo Lazzaro; 08 Jul 2023, 08:55.
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Hi Kamalesh,

      This is effective for creating categorical variables for what you describe, even if not terribly elegant. The variable 'group3' identifies observations that are common across the first two groups. Note that there are none. I don't wish to speak for Carlo, but he would likely suggest that a more detailed description of your problem is needed for Statalist members to contribute properly and solve your actual problem.
      Code:
      sysuse auto
      gen wtclass=(weight<2700)+2*(weight>=2700 & weight<3500)+3*(weight>=3500)
      gen mpgclass=(mpg<20)+2*(mpg>=20 & mpg<28)+3*(mpg>=28)
      egen group1=group(wtclass foreign)
      egen group2=group(make mpgclass)
      egen group3=group(group1 group2)
      Last edited by Eric Makela; 08 Jul 2023, 15:35. Reason: Definitions

      • #4
        Hi, I have the following kind of data structure. I just want to keep or identify the product codes that are common across all the company codes.
        company code product code
        11 115
        11 117
        11 119
        11 112
        12 117
        12 115
        12 119
        16 108
        16 117
        16 119
        16 111
        16 115
        16 120
        14 222
        14 117
        14 115
        14 119

        • #5
          Perhaps this?

          Code:
          clear
          input int(company_code product_code)
          11    115
          11    117
          11    119
          11    112
          12    117
          12    115
          12    119
          16    108
          16    117
          16    119
          16    111
          16    115
          16    120
          14    222
          14    117
          14    115
          14    119
          end
          
          egen int num_companies = count(company_code), by(product_code)
          levelsof company_code, local(companies)
          local tot_companies: word count `companies'
          
          gen byte common_to_all = (num_companies == `tot_companies')
          Then you can do things like
          Code:
          . tab product_code if common_to_all
          
          product_cod |
                    e |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                  115 |          4       33.33       33.33
                  117 |          4       33.33       66.67
                  119 |          4       33.33      100.00
          ------------+-----------------------------------
                Total |         12      100.00
          This code assumes that each product_code appears only once for a given company_code. This is the case in your example, but if it is not true in the dataset, let me know and we'll find an alternative solution.
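          If that assumption might not hold, one possible alternative is sketched below. It counts each company-product pair only once by using egen's tag() function, so repeated rows do not inflate the count. The variable names pair_tag, company_tag, n_companies and common_dedup are made up for this illustration, and the sketch has not been run against the real dataset.

          ```stata
          * Sketch only: pair_tag, company_tag, n_companies and common_dedup
          * are illustrative names, not part of the original solution.

          * Mark exactly one row for each distinct company-product pair
          egen byte pair_tag = tag(company_code product_code)

          * Number of distinct companies carrying each product
          egen int n_companies = total(pair_tag), by(product_code)

          * Mark exactly one row per company, then count the companies
          egen byte company_tag = tag(company_code)
          count if company_tag
          local tot_companies = r(N)

          * 1 if the product appears in every company, 0 otherwise
          gen byte common_dedup = (n_companies == `tot_companies')
          ```

          On the example data above, where no pair is repeated, this should flag the same rows as common_to_all.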
          Last edited by Hemanshu Kumar; 10 Jul 2023, 04:03.

          • #6
            Hi Hemanshu, thanks for your response. It works. Can you tell me what would change in the syntax if I also have a time variable in the above dataset?
