
  • Filling Missing Values with duplicates, including strings and numeric variables

    There have been quite a few posts about filling in missing values, but none that match my problem.

    My data contain duplicates with respect to two id variables, call them id1 and id2. These duplicates hold separate pieces of information about the same observation: if we check with -duplicates- on id1, id2 and a third variable, call it id3, no duplicates remain.

    This last variable, id3, identifies the source of the information. Depending on the source, some variables other than the ids are blank in a given observation.

    What I need is to merge these duplicates. If observations share the same id1 and id2, I want to fill the blanks with the values from any one of the duplicates (it does not matter which, since these variables are static within a group), and then drop the remaining duplicates and id3.

    I cannot post examples of the data.

    I was thinking of using, for example, the max within each id1 id2 group to fill the blanks, but the only solutions I found fill with the previous, next, first or last value.

    Moreover, the technique of looping over all variables and generating temporary variables with, for example, 'egen max(...)' does not work, because I have string variables (for example, the ids themselves).

  • #2
    Ramiro:
    welcome to this forum.
    The scant number of posts on filling missing data with last, next or other values is probably due to the fact that filling is not the first-choice method for dealing with missing data (see -mi-, for instance).
    That said, even if your data are confidential, you can post a fake excerpt/example of your dataset (see -help dataex-): this would help interested listers to help you out in turn. Thanks.
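    In the meantime, here is a minimal sketch of one common approach, run on made-up data. The variables city and income and all values are invented for illustration; only id1, id2 and id3 come from your description. It assumes that, within each id1 id2 group, all nonmissing values of a variable agree. It relies on sort order: numeric missings sort last and empty strings sort first, so one loop can handle both types.

    ```stata
    * Hypothetical toy data: names and values are invented for illustration
    clear
    input str3 id1 str3 id2 byte id3 str10 city double income
    "A" "x" 1 "Lima"   .
    "A" "x" 2 ""       1200
    "B" "y" 1 ""       .
    "B" "y" 2 "Quito"  800
    "C" "z" 1 "Bogota" 500
    end

    * Every variable except the identifiers
    ds id1 id2 id3, not
    local fillvars `r(varlist)'

    foreach v of local fillvars {
        capture confirm string variable `v'
        if _rc == 0 {
            * Strings: "" sorts first, so the last value in the group
            * is nonmissing whenever the group has a nonmissing value
            bysort id1 id2 (`v'): replace `v' = `v'[_N] if missing(`v')
        }
        else {
            * Numerics: missing sorts last, so the first value in the group
            * is nonmissing whenever the group has a nonmissing value
            bysort id1 id2 (`v'): replace `v' = `v'[1] if missing(`v')
        }
    }

    * Keep one row per id1 id2 pair; id3 is no longer needed
    drop id3
    duplicates drop id1 id2, force
    ```

    If nonmissing values within a group could conflict, -duplicates drop- with the force option keeps an arbitrary row, so it may be worth checking for disagreements before dropping.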
    Kind regards,
    Carlo
    (Stata 18.0 SE)
