Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Filling up missing values in master dataset using a second dataset

    Dear all,

    I am using Stata 16.

    The problem I am facing is the following. I have two datasets (example below) with the exact same variables, the timeframe of the master dataset is 1993-2020 and the timeframe of the second is 1993-2013. What I wish to do is to use the second dataset to fill up missing values in the first.

    example of master dataset:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str14 country int years long emg
    "Austria"  1993      .
    "Austria"  1994      .
    "Austria"  1995      .
    "Austria"  1996  66050
    "Austria"  1997  68585
    "Austria"  1998  64272
    "Austria"  1999  66923
    "Austria"  2000  64472
    "Austria"  2001  72654
    "Austria"  2002  74831
    "Austria"  2003  71996
    "Austria"  2004  71721
    "Austria"  2005  70133
    "Austria"  2006  74432
    "Austria"  2007  49898
    "Austria"  2008  51563
    "Austria"  2009  53244
    "Austria"  2010  51651
    "Austria"  2011  51197
    "Austria"  2012  51812
    "Austria"  2013  54071
    "Austria"  2014  53491
    "Austria"  2015  56689
    "Austria"  2016  64428
    "Austria"  2017  66144
    "Austria"  2018  67212
    "Austria"  2019  68280
    "Austria"  2020  62581
    "Belgium"  1993  44811
    "Belgium"  1994  36530
    "Belgium"  1995  36044
    "Belgium"  1996  36674
    "Belgium"  1997  39320
    "Belgium"  1998  72087
    "Belgium"  1999  74097
    "Belgium"  2000  75320
    "Belgium"  2001  75261
    "Belgium"  2002  75960
    "Belgium"  2003  79399
    "Belgium"  2004  83895
    "Belgium"  2005  86899
    "Belgium"  2006  88163
    "Belgium"  2007  91052
    "Belgium"  2008      .
    "Belgium"  2009      .
    "Belgium"  2010  66013
    "Belgium"  2011  84148
    "Belgium"  2012  93600
    "Belgium"  2013 102657
    "Belgium"  2014  94573
    "Belgium"  2015  89794
    "Belgium"  2016  92471
    "Belgium"  2017  89690
    "Belgium"  2018  88935
    "Belgium"  2019 102936
    "Belgium"  2020  76562
    example of second dataset:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str52 country int years long emg
    "Austria"  1993      .
    "Austria"  1994      .
    "Austria"  1995      .
    "Austria"  1996      .
    "Austria"  1997      .
    "Austria"  1998  64272
    "Austria"  1999  66923
    "Austria"  2000  64472
    "Austria"  2001      .
    "Austria"  2002  74831
    "Austria"  2003  71996
    "Austria"  2004  71721
    "Austria"  2005  70133
    "Austria"  2006  74432
    "Austria"  2007  49898
    "Austria"  2008  51563
    "Austria"  2009  53244
    "Austria"  2010  51651
    "Austria"  2011  51197
    "Austria"  2012  51812
    "Austria"  2013  54071
    "Belgium"  1993      .
    "Belgium"  1994      .
    "Belgium"  1995      .
    "Belgium"  1996      .
    "Belgium"  1997      .
    "Belgium"  1998  72087
    "Belgium"  1999  74097
    "Belgium"  2000  75320
    "Belgium"  2001  75261
    "Belgium"  2002  75960
    "Belgium"  2003  79399
    "Belgium"  2004  83895
    "Belgium"  2005  86899
    "Belgium"  2006  88163
    "Belgium"  2007  91052
    "Belgium"  2008 100275
    "Belgium"  2009      .
    "Belgium"  2010  66013
    "Belgium"  2011  67475
    "Belgium"  2012  74720
    "Belgium"  2013  90800
    The idea is that the second dataset sometimes contains values for emg where the master dataset has missing values, and I want to combine them to have the most complete dataset possible. In the data example you can for instance see that the master dataset has a missing value for "Belgium" 2008 while the second dataset does have a value for "Belgium" 2008.

    Thanks in advance!

    Kind Regards,
    Gert Bolle

  • #2
    this is what the "update" option of -merge- is for; try:
    Code:
    merge 1:1 country years using seconddata, update
    replace "seconddata" with the name of your second data set

    see
    Code:
    h merge

    Comment


    • #3
      Thank you for the fast reply!

      I had the feeling there was a simple solution, but I must have overlooked it. Apologies for the elementary question

      Comment


      • #4
        no apology needed - we all miss things at times

        Comment

        Working...
        X