  • Delete whole row

    I have discovered that a whole row is duplicated in my data (the same person appears twice). How do I delete the duplicate rows without ending up with missing values?

  • #2
    Code:
    duplicates drop
    If that doesn't do what you want, please post back with a data example.
    Last edited by Ali Atia; 02 Mar 2021, 06:42.



    • #3
      If you're using duplicates then duplicates drop is intended to do exactly what you want. Here is a silly example in which a duplicate observation is first created, then detected and then removed.

      Code:
      . sysuse auto, clear
      (1978 Automobile Data)
      
      . expand 2 in 42
      (1 observation created)
      
      . duplicates list
      
      Duplicates in terms of all variables
      
        +--------------------------------------------------------------------------------------+
        | obs: | make        | price | mpg | rep78 | headroom | trunk | weight | length | turn |
        |   42 | Plym. Arrow | 4,647 |  28 |     3 |      2.0 |    11 |  3,260 |    170 |   37 |
        |----------------------------+---------------------------------------------------------|
        |          displa~t          |          gear_r~o          |           foreign          |
        |               156          |              3.05          |          Domestic          |
        +--------------------------------------------------------------------------------------+
      
        +--------------------------------------------------------------------------------------+
        | obs: | make        | price | mpg | rep78 | headroom | trunk | weight | length | turn |
        |   75 | Plym. Arrow | 4,647 |  28 |     3 |      2.0 |    11 |  3,260 |    170 |   37 |
        |----------------------------+---------------------------------------------------------|
        |          displa~t          |          gear_r~o          |           foreign          |
        |               156          |              3.05          |          Domestic          |
        +--------------------------------------------------------------------------------------+
      
      . duplicates drop
      
      Duplicates in terms of all variables
      
      (1 observation deleted)
      
      . duplicates list
      
      Duplicates in terms of all variables
      
      (0 observations are duplicates)


      duplicates drop does what its name says: it removes the surplus observations entirely. It doesn't leave holes (missing values) in the dataset.
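
      As a rough check of the "no holes" point, the sketch below counts observations and missing prices before and after the drop. It assumes the same auto dataset and the same duplicated observation as above; the local macro name miss_before is only illustrative.

      Code:
      * Sketch: confirm that duplicates drop shrinks the dataset rather than blanking a row
      sysuse auto, clear
      * remember how many prices are missing before anything is changed
      count if missing(price)
      local miss_before = r(N)
      * create one exact duplicate, as in the example above
      expand 2 in 42
      display _N
      * remove the surplus copy again
      duplicates drop
      display _N
      * the number of missing prices should be unchanged
      count if missing(price)
      assert r(N) == `miss_before'

      The second display should show one observation fewer than the first, and the assert stops with an error only if the drop had somehow introduced missing values.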
      Last edited by Nick Cox; 02 Mar 2021, 07:02.



      • #4
        I'm not sure I understand the code... Rows 178 and 352 have the same values for all the variables, i.e. they are the same person. Isn't it possible to mark the cells and delete them, like in Excel?



        • #5
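          You can generate a variable that flags, for each observation, whether the variables differ from one another (1) or are all identical (0), using -egen diff-. Note that this compares values across variables within a row, not one row against another: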

          Code:
          . clear
          
          . set obs 5
          number of observations (_N) was 0, now 5
          
          . set seed 1234
          
          . gen obsno=_n
          
          . forvalues x=1/3{
            2.         gen x`x' = runiformint(1,3)
            3. }
          
          . list,noobs
          
            +----------------------+
            | obsno   x1   x2   x3 |
            |----------------------|
            |     1    1    1    1 |
            |     2    1    1    2 |
            |     3    1    1    1 |
            |     4    1    1    3 |
            |     5    2    3    3 |
            +----------------------+
          
          . egen diff = diff(*)
          
          . drop if !diff
          (1 observation deleted)
          
          . list,noobs
          
            +-----------------------------+
            | obsno   x1   x2   x3   diff |
            |-----------------------------|
            |     2    1    1    2      1 |
            |     3    1    1    1      1 |
            |     4    1    1    3      1 |
            |     5    2    3    3      1 |
            +-----------------------------+
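
          For comparison, here is a minimal sketch of flagging rows that repeat one another, which is the situation in #1. It types in the same toy values with input rather than relying on the random draws, and it assumes that obsno should be ignored when deciding what counts as a duplicate; the variable name dup is only illustrative.

          Code:
          * Sketch: the same toy data, entered directly
          clear
          input obsno x1 x2 x3
          1 1 1 1
          2 1 1 2
          3 1 1 1
          4 1 1 3
          5 2 3 3
          end
          * dup holds the number of other observations with the same x1-x3 values
          duplicates tag x1 x2 x3, generate(dup)
          list, noobs
          * force is required here because the copies still differ on obsno
          duplicates drop x1 x2 x3, force
          list, noobs

          In this sketch observations 1 and 3 are tagged, and duplicates drop keeps the first of them.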



          • #6
            Perhaps someone who uses MS Excel can answer your question about that. I don't use it routinely and can't comment.

            Otherwise, the example in #3 is one you can run yourself, looking at the data alongside your computations. Beyond that, the help and the manual entry for duplicates are there in support, and (as the original author) I can attempt to answer any specific questions you may have.
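
            For the specific case in #4, a single observation can also be removed by its number. The sketch below is only an illustration on made-up data: the 400 observations and the variable id are hypothetical stand-ins for a dataset in which observation 352 repeats observation 178.

            Code:
            * Hypothetical data: 400 observations in which 352 repeats 178
            clear
            set obs 400
            generate id = _n
            replace id = 178 in 352
            * show the two copies side by side
            duplicates list
            * delete one specific observation by its number
            drop in 352
            * later observations move up, so no empty row remains
            assert _N == 399
            duplicates report

            duplicates drop gives the same result without needing to know the observation numbers, which matters once there are more than a handful of duplicates.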
