Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Matchit then Merge help

    Good day. I hope folks are staying safe.

    I am working on merging 1:1 two data sets both with county-level data for the years 2012-2016. They both have a string and numeric variable I created called n_county that consists of state name and county name, however in dataset a (master dataset), the observations are lowercase, and in the using data set b, the variable observations are a mix of caps and lowercase. When I tried to merge on n_county and year, it did not work and instead there were no matches and the variables were appended and increase the number of variables and observations. My next step was to create a unique string variable in each to use the matchit command. I provided below, my matchit code, my index results, the dataex for the matchit, master file, and using file. I am a bit stumped.

    Code:
    matchit id1 n_county1 using "dataset b", idu(id2) txtu(n_county2) sim(token) t(0) //name of using dataset is just for example
    The following results appeared, which I am still trying to figure out what it means.

    Code:
    Indexing USING file.
    0%
    20%
    40%
    60%
    80%
    Done!
    Computing results
            Percent completed ...   (search space saved by index so far)
            20%               ...   (100%)
            40%               ...   (100%)
            60%               ...   (100%)
            80%               ...   (100%)
            Done!
    Total search space saved by index: 100%
    I then tried to merge following the instructions from the -help matchit- but it did not work. I keep getting the following error message:


    Code:
    . merge m:1 id1 n_county1 using "/Volumes/GoogleDrive/My Drive/Data/dissertation/pt5_ahrf18_merge/ahrf18_mergecounty.dta", generat
    > e (_mergecty)
    variables id1 n_county1 do not uniquely identify observations in the using data
    r(459);
    Here is the dataex of the 5 variables from the dataset created by matchit
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte similscore str4(n_county2 n_county1) int(id2 id1)
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "1" "1" 1 1
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "2" "2" 2 2
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "3" "3" 3 3
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    1 "4" "4" 4 4
    end
    dataex for the master file

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float n_county str4 n_county1 float id1 str20 statename str25 countyname long fips_code str5 mod_fips byte fipsstatecode int fipscountycode byte year
    1 "1" 1 "Alabama" "Autauga" 1001 "01001" 1 1 15
    1 "1" 1 "Alabama" "Autauga" 1001 "01001" 1 1 14
    1 "1" 1 "Alabama" "Autauga" 1001 "01001" 1 1 12
    1 "1" 1 "Alabama" "Autauga" 1001 "01001" 1 1 16
    1 "1" 1 "Alabama" "Autauga" 1001 "01001" 1 1 13
    2 "2" 2 "Alabama" "Baldwin" 1003 "01003" 1 3 15
    2 "2" 2 "Alabama" "Baldwin" 1003 "01003" 1 3 13
    2 "2" 2 "Alabama" "Baldwin" 1003 "01003" 1 3 16
    2 "2" 2 "Alabama" "Baldwin" 1003 "01003" 1 3 12
    2 "2" 2 "Alabama" "Baldwin" 1003 "01003" 1 3 14
    3 "3" 3 "Alabama" "Barbour" 1005 "01005" 1 5 12
    3 "3" 3 "Alabama" "Barbour" 1005 "01005" 1 5 14
    3 "3" 3 "Alabama" "Barbour" 1005 "01005" 1 5 16
    3 "3" 3 "Alabama" "Barbour" 1005 "01005" 1 5 13
    3 "3" 3 "Alabama" "Barbour" 1005 "01005" 1 5 15
    4 "4" 4 "Alabama" "Bibb"    1007 "01007" 1 7 13
    4 "4" 4 "Alabama" "Bibb"    1007 "01007" 1 7 14
    4 "4" 4 "Alabama" "Bibb"    1007 "01007" 1 7 15
    4 "4" 4 "Alabama" "Bibb"    1007 "01007" 1 7 16
    4 "4" 4 "Alabama" "Bibb"    1007 "01007" 1 7 12
    5 "5" 5 "Alabama" "Blount"  1009 "01009" 1 9 15
    5 "5" 5 "Alabama" "Blount"  1009 "01009" 1 9 12
    5 "5" 5 "Alabama" "Blount"  1009 "01009" 1 9 16
    5 "5" 5 "Alabama" "Blount"  1009 "01009" 1 9 13
    5 "5" 5 "Alabama" "Blount"  1009 "01009" 1 9 14
    end
    label values n_county n_county
    label def n_county 1 "Alabama Autauga", modify
    label def n_county 2 "Alabama Baldwin", modify
    label def n_county 3 "Alabama Barbour", modify
    label def n_county 4 "Alabama Bibb", modify
    label def n_county 5 "Alabama Blount", modify
    label values id1 id1
    label def id1 1 "Alabama Autauga", modify
    label def id1 2 "Alabama Baldwin", modify
    label def id1 3 "Alabama Barbour", modify
    label def id1 4 "Alabama Bibb", modify
    label def id1 5 "Alabama Blount", modify
    variables for the using file

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float n_county str4 n_county2 float id2 str20 statename str43 countyname long fips_code byte fipsstatecode int(fipscountycode year)
    1 "1" 1 "ALABAMA" "Autauga County" 1001 1 1 2013
    1 "1" 1 "ALABAMA" "Autauga County" 1001 1 1 2014
    1 "1" 1 "ALABAMA" "Autauga County" 1001 1 1 2016
    1 "1" 1 "ALABAMA" "Autauga County" 1001 1 1 2015
    1 "1" 1 "ALABAMA" "Autauga County" 1001 1 1 2012
    2 "2" 2 "ALABAMA" "Baldwin County" 1003 1 3 2014
    2 "2" 2 "ALABAMA" "Baldwin County" 1003 1 3 2012
    2 "2" 2 "ALABAMA" "Baldwin County" 1003 1 3 2016
    2 "2" 2 "ALABAMA" "Baldwin County" 1003 1 3 2015
    2 "2" 2 "ALABAMA" "Baldwin County" 1003 1 3 2013
    3 "3" 3 "ALABAMA" "Barbour County" 1005 1 5 2015
    3 "3" 3 "ALABAMA" "Barbour County" 1005 1 5 2012
    3 "3" 3 "ALABAMA" "Barbour County" 1005 1 5 2016
    3 "3" 3 "ALABAMA" "Barbour County" 1005 1 5 2013
    3 "3" 3 "ALABAMA" "Barbour County" 1005 1 5 2014
    4 "4" 4 "ALABAMA" "Bibb County"    1007 1 7 2014
    4 "4" 4 "ALABAMA" "Bibb County"    1007 1 7 2013
    4 "4" 4 "ALABAMA" "Bibb County"    1007 1 7 2015
    4 "4" 4 "ALABAMA" "Bibb County"    1007 1 7 2012
    4 "4" 4 "ALABAMA" "Bibb County"    1007 1 7 2016
    5 "5" 5 "ALABAMA" "Blount County"  1009 1 9 2016
    5 "5" 5 "ALABAMA" "Blount County"  1009 1 9 2012
    5 "5" 5 "ALABAMA" "Blount County"  1009 1 9 2014
    5 "5" 5 "ALABAMA" "Blount County"  1009 1 9 2013
    5 "5" 5 "ALABAMA" "Blount County"  1009 1 9 2015
    end
    label values n_county n_county
    label def n_county 1 "ALABAMA Autauga County", modify
    label def n_county 2 "ALABAMA Baldwin County", modify
    label def n_county 3 "ALABAMA Barbour County", modify
    label def n_county 4 "ALABAMA Bibb County", modify
    label def n_county 5 "ALABAMA Blount County", modify
    label values id2 id2
    label def id2 1 "ALABAMA Autauga County", modify
    label def id2 2 "ALABAMA Baldwin County", modify
    label def id2 3 "ALABAMA Barbour County", modify
    label def id2 4 "ALABAMA Bibb County", modify
    label def id2 5 "ALABAMA Blount County", modify

    Any help is appreciated.

    Rene
    using Stata 12.1 on Mac OS Catalina 10.15.4

  • #2
    I am hoping clarifying my question would assist in getting any feedback on this inquiry. I am trying to merge two county level panel data sets. I tried merging on year and unique variable I created n_county (group(state name and county name)), however the error message is that those vars do not accurately identify unique values in the datasets. Even after trying to change the case, I realized the case sensitivity and an additional word that caused them not to match. It seemed quicker to use the matchit command. However I am still having trouble with "unique" variables.

    Any help please would be appreciated.

    Rene
    using Stata 12.1 on Mac OS Catalina 10.15.4

    Comment


    • #3
      Try to use m:m merge.

      Code:
      merge m:m id1 n_county1 using "/Volumes/GoogleDrive/My Drive/Data/dissertation/pt5_ahrf18_merge/ahrf18_mergecounty.dta", generate (_mergecty)

      Comment

      Working...
      X