Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Deleting the missing values (not entire observations) of a variable

    Suppose I have the following table:
    ID Preferences
    1 2
    1 .
    1 .
    2 .
    2 3
    3 .
    3 .
    3 4
    4 1
    I have other columns before 'ID' and after 'Preferences'. What I want to get is two columns of the following type on the right of the above table:
    ID Preferences
    1 2
    2 3
    3 4
    4 1
    In other words, I want to delete the missing values of the variable 'Preferences'. I don't want to delete observations. I only want to record 'Preferences' with non-missing values. So
    Code:
    drop Preferences if Preferences != .
    would not work in this case. This would delete observations with missing values for Preferences, which is not what I want.

    How can I do this?
    Last edited by Eric Bell; 14 Nov 2021, 12:48.

  • #2
    Stata is not a spreadsheet, so you cannot think like that. An observation defines entries of the same entity. For example, in the auto dataset

    Code:
    . sysuse auto
    (1978 Automobile Data)
    
    . l in 1
    
         +----------------------------------------------------------------------------------------------------------------+
         | make          price   mpg   rep78   headroom   trunk   weight   length   turn   displa~t   gear_r~o    foreign |
         |----------------------------------------------------------------------------------------------------------------|
      1. | AMC Concord   4,099    22       3        2.5      11    2,930      186     40        121       3.58   Domestic |
         +----------------------------------------------------------------------------------------------------------------+
    the first observation belongs to "AMC Concord". I cannot just delete one entry of this observation and move up entries of other cars to align with it.

    Comment


    • #3
      Interesting thing to know.

      My problem is that I have some columns before the above table. So my actual table looks something like:
      ID1 Num_Females ID Preferences
      1 10 1 2
      2 15 1 .
      3 20 1 .
      4 10 2 .
      5 3 2 3
      6 6 3 .
      7 17 3 .
      8 13 3 4
      9 12 4 1
      So you see that ID1 does not have repetitions. You can think of ID1 as a household and ID as an individual member of a household.

      Ultimately, I want a data that would look like this:

      ID1 Num_Females ID Preferences
      1 10 1 2
      2 15 2 3
      3 20 3 4
      4 10 4 1
      5 3 5 [a number depending on the original data]
      6 6 6 [ditto]
      7 17 7 [ditto]
      8 13 8 [ditto]
      9 12 9 [ditto]

      The first two columns here are left untouched. The third column does not have repetitions, so 'ID' would be household, the same as ID1.

      Would it be possible to get the last two columns in Stata?

      Comment


      • #4
        Assuming no duplicates (or preferences are constant within id):

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte(id1 num_females id preferences)
        1 10 1 2
        2 15 1 .
        3 20 1 .
        4 10 2 .
        5  3 2 3
        6  6 3 .
        7 17 3 .
        8 13 3 4
        9 12 4 1
        end
        
        frame put id preferences, into(id)
        drop id preferences
        frame id{
            rename id id1
            bys id1 (preferences): keep if _n== 1
        }
        frlink 1:1 id1, frame(id)
        frget preferences, from(id)
        frame drop id
        Res.:

        Code:
        . l, sep(0)
        
             +--------------------------------+
             | id1   num_fe~s   id   prefer~s |
             |--------------------------------|
          1. |   1         10    1          2 |
          2. |   2         15    2          3 |
          3. |   3         20    3          4 |
          4. |   4         10    4          1 |
          5. |   5          3    .          . |
          6. |   6          6    .          . |
          7. |   7         17    .          . |
          8. |   8         13    .          . |
          9. |   9         12    .          . |
             +--------------------------------+

        Comment


        • #5
          This perhaps builds on your previous discussion at

          https://www.statalist.org/forums/for...for-each-group

          I have followed up on that post.


          Comment

          Working...
          X