  • Drop observations belonging to a company if variable not same value over the years

    Dear users,

    I'm new to Stata and have been trying for a while to filter/clean my dataset.
    I have observations over multiple years for each company. I want to drop a company, together with all of its yearly observations, if my variable big6 does not stay the same across the years (that is, I keep a company only if big6 is 0 in every year or 1 in every year).
    In the example below I would keep '002484' because big6 remains 1 for all the years, but I have to drop '009441' because it changed from 0 to 1 over the years.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str6 gvkey double fyear float big6
    "002484" 1993 1
    "002484" 1991 1
    "002484" 1992 1
    "002484" 1990 1
    "002484" 1988 1
    "002484" 1989 1
    "009441" 1989 0
    "009441" 1993 1
    "009441" 1990 1
    "009441" 1992 1
    end

    (On a side note, I've been able to sort the data by gvkey, but I have not been able to also sort it chronologically by fyear within each gvkey.)

    If anyone can help me out, I would really appreciate it!
    Thank you!

  • #2
    This is an FAQ. See https://www.stata.com/support/faqs/d...ions-in-group/
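
For readers who cannot follow the link, here is a minimal sketch of one standard way to test whether a variable is constant within groups. It is not necessarily the exact code in the FAQ. The variable name differs is hypothetical; the data are the -dataex- example from #1. It also covers the side question in #1, since sort gvkey fyear orders the data chronologically within each company.

```stata
* Example data from #1
clear
input str6 gvkey double fyear float big6
"002484" 1993 1
"002484" 1991 1
"002484" 1992 1
"002484" 1990 1
"002484" 1988 1
"002484" 1989 1
"009441" 1989 0
"009441" 1993 1
"009441" 1990 1
"009441" 1992 1
end

* Sort big6 within each company; if the first and last values
* differ, big6 is not constant for that company.
* Caution: missing values sort last, so a company with any missing
* big6 alongside nonmissing values is also flagged as differing.
bysort gvkey (big6): generate byte differs = big6[1] != big6[_N]

* Drop every observation of the flagged companies
drop if differs
drop differs

* Chronological order within company
sort gvkey fyear
list, sepby(gvkey)
```

With the example data this should leave only the six observations of gvkey 002484.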


    • #3
      I'm sorry I am still really new to Stata. But thanks a lot! This helped me solve it.


      • #4
        Daniel:
        the -egen- function -mean()- is another option, in addition to Andrew's helpful link:

        Code:
        . bysort gvkey (fyear): egen wanted=mean( big6)
        
        . list
        
             +--------------------------------+
             |  gvkey   fyear   big6   wanted |
             |--------------------------------|
          1. | 002484    1988      1        1 |
          2. | 002484    1989      1        1 |
          3. | 002484    1990      1        1 |
          4. | 002484    1991      1        1 |
          5. | 002484    1992      1        1 |
             |--------------------------------|
          6. | 002484    1993      1        1 |
          7. | 009441    1989      0      .75 |
          8. | 009441    1990      1      .75 |
          9. | 009441    1992      1      .75 |
         10. | 009441    1993      1      .75 |
             +--------------------------------+
        
        . drop if big6!=wanted
        (4 observations deleted)
        
        . list
        
             +--------------------------------+
             |  gvkey   fyear   big6   wanted |
             |--------------------------------|
          1. | 002484    1988      1        1 |
          2. | 002484    1989      1        1 |
          3. | 002484    1990      1        1 |
          4. | 002484    1991      1        1 |
          5. | 002484    1992      1        1 |
             |--------------------------------|
          6. | 002484    1993      1        1 |
             +--------------------------------+
        
        Kind regards,
        Carlo
        (Stata 19.0)
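
A note on why the comparison with the group mean works here: big6 is a 0/1 indicator, so in a company where it changes the mean lies strictly between 0 and 1 and no observation can equal it. For a variable with more than two values that is not guaranteed (with values 1, 2, 3 the mean is 2, and the observation equal to 2 would be kept). A more general sketch, under the assumption that the goal is "keep a company only if the variable is constant", compares the group minimum and maximum instead. The names lo and hi are hypothetical.

```stata
* Example data from #1
clear
input str6 gvkey double fyear float big6
"002484" 1993 1
"002484" 1991 1
"002484" 1992 1
"002484" 1990 1
"002484" 1988 1
"002484" 1989 1
"009441" 1989 0
"009441" 1993 1
"009441" 1990 1
"009441" 1992 1
end

* Group minimum and maximum of big6
* (egen's min() and max() ignore missing values)
bysort gvkey: egen lo = min(big6)
bysort gvkey: egen hi = max(big6)

* A company is constant exactly when its minimum equals its maximum
drop if lo != hi
drop lo hi

* Chronological order within company
sort gvkey fyear
list, sepby(gvkey)
```

Unlike the sort-based check, this variant treats a company as constant when its nonmissing values agree, even if some years are missing; which behaviour is wanted depends on the dataset.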


        • #5
          Originally posted by Daniel Chang
          I'm sorry I am still really new to Stata. But thanks a lot! This helped me solve it.
          No need to apologize; we all started somewhere. The statement was not meant as criticism of you, only as information.
