Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • missing values when merge

    Hi,
    I have a dataset similar to the following table where I want to calculate the mean values of x1, x2, and x3 by each id. I'm using the following code to get the average value for x1.

    Code:
    collapse avg_x1 = x1, by(id)
    save mean_data, replace
    However, when I merge them to the original dataset (the following table), I get missing values for some id's (id = 2 in this dataset). I'm using the following code to merge them:

    Code:
    use original_dataset, clear
    sort id t
    merge m:1 id using mean_data
    the "_merge" variable corresponding to id=2 says "master only (1)", while its value corresponding to id=1 is "matched (3)".

    I would appreciate any insights.

    I also appreciate it if anyone can also address how I can get all x1, x2, and x3 averages by id, without needing to use -preserve-, -restore-, and -save- command for each variable separately.

    Thanks,
    .
    id x1 x2 x3 t
    1 210 1 1 1
    1 0 0 1 2
    1 465 3 1 3
    1 948 3 1 4
    1 0 0 1 5
    1 1519 2 1 6
    2 0 0 1 1
    2 0 0 1 2
    2 0 0 1 3
    2 475 2 1 4
    2 145 1 1 5
    2 215 1 1 6
    Last edited by Sa Fe; 19 Nov 2021, 16:54.

  • #2
    Use dataex to provide some data so we can replicate your result (see the FAQ). You can also find the mean without collapsing and merging with egen.

    Code:
    egen avg_x1 = mean(x1), by(id)

    Comment


    • #3
      This seems to me to be what you were doing, but it works for me.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte id int x1 byte(x2 x3 t)
      1  210 1 1 1
      1    0 0 1 2
      1  465 3 1 3
      1  948 3 1 4
      1    0 0 1 5
      1 1519 2 1 6
      2    0 0 1 1
      2    0 0 1 2
      2    0 0 1 3
      2  475 2 1 4
      2  145 1 1 5
      2  215 1 1 6
      end
      save original_data, replace
      
      collapse (mean) x1-x3, by(id)
      rename (x1-x3) (avg_=)
      save mean_data, replace
      
      use original_data, clear
      merge m:1 id using mean_data
      list, sepby(id)
      Code:
      . list, sepby(id)
      
           +--------------------------------------------------------------------+
           | id     x1   x2   x3   t    avg_x1    avg_x2   avg_x3        _merge |
           |--------------------------------------------------------------------|
        1. |  1    210    1    1   1   523.667       1.5        1   Matched (3) |
        2. |  1      0    0    1   2   523.667       1.5        1   Matched (3) |
        3. |  1    465    3    1   3   523.667       1.5        1   Matched (3) |
        4. |  1    948    3    1   4   523.667       1.5        1   Matched (3) |
        5. |  1      0    0    1   5   523.667       1.5        1   Matched (3) |
        6. |  1   1519    2    1   6   523.667       1.5        1   Matched (3) |
           |--------------------------------------------------------------------|
        7. |  2      0    0    1   1   139.167   .666667        1   Matched (3) |
        8. |  2      0    0    1   2   139.167   .666667        1   Matched (3) |
        9. |  2      0    0    1   3   139.167   .666667        1   Matched (3) |
       10. |  2    475    2    1   4   139.167   .666667        1   Matched (3) |
       11. |  2    145    1    1   5   139.167   .666667        1   Matched (3) |
       12. |  2    215    1    1   6   139.167   .666667        1   Matched (3) |
           +--------------------------------------------------------------------+

      Comment

      Working...
      X