Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merge Difficulties

    Hello,

    I'm having difficulty merging together two data sets that contain population estimates for metropolitan areas. What I want to do is combine these two into a panel. Both sets have the same FIPS codes ( the "metarea" variable in the datasets) and string variable for the metropolitan area's name ("metro"). When I run the merge
    Code:
     merge 1:1 year metarea metro using "C:\Users\ageis\Documents\Research\Managerial Data\IPUMS\1990.dta"
    none of the observations match despite what I should expect: both have the same number of observations and the same identifying code "metarea."

    The first data set or master file looks like

    Code:
    * Example generated by -dataex-. 
    clear
    input float year int metarea str39 metro double population
    1980   4 "Abilene, TX"                              111420
    1980   8 "Akron, OH"                                663780
    1980  16 "Albany, NY"                               740740
    1980  20 "Albuquerque, NM"                          456780
    1980  22 "Alexandria, LA"                           135880
    1980  24 "Allentown, PA"                            501300
    1980  28 "Altoona, PA"                              134620
    1980  32 "Amarillo, TX"                             172560
    1980  38 "Anchorage, AK"                            175600
    1980  44 "Ann Arbor, MI"                            266940
    1980  45 "Anniston, AL"                             120400
    1980  46 "Appleton, WI"                             291660
    1980  52 "Atlanta, GA"                             1781240
    1980  56 "Atlantic City, NJ"                        194040
    1980  60 "Augusta, GA"                              289920
    1980  64 "Austin, TX"                               537560
    1980  68 "Bakersfield, CA"                          402320
    1980  72 "Baltimore, MD"                           2171980
    1980  76 "Baton Rouge, LA"                          476640
    1980  84 "Beaumont, TX"                             374120
    1980  86 "Bellingham, WA"                           106340
    1980  87 "Benton Harbor, MI"                        174360
    1980  88 "Billings, MT"                             107660
    1980  92 "Biloxi-Gulfport, MS"                      158460
    1980  96 "Binghamton, NY"                           213660
    1980 100 "Birmingham, AL"                           778940
    1980 104 "Bloomington, IL"                          118820
    1980 108 "Boise, ID"                                173480
    1980 112 "Boston, MA"                              2588460
    1980 115 "Bremerton, WA"                            146880
    1980 116 "Bridgeport, CT"                           251280
    1980 124 "Brownsville, TX"                          212720
    1980 128 "Buffalo, NY"                             1239820
    1980 132 "Canton, OH"                               377720
    1980 136 "Cedar Rapids, IA"                         170720
    1980 140 "Champaign-Urbana-Rantoul, IL"             168900
    1980 144 "Charleston, SC"                           431000
    1980 152 "Charlotte, NC"                            748260
    1980 156 "Chattanooga, TN"                          392200
    1980 160 "Chicago"                                 7775160
    1980 162 "Chico, CA"                                143480
    1980 164 "Cincinnati, OH"                          1239180
    1980 168 "Cleveland, OH"                           1899200
    1980 172 "Colorado Springs, CO"                     312640
    1980 174 "Columbia, MO"                              99940
    1980 176 "Columbia, SC"                             408200
    1980 184 "Columbus, OH"                            1094920
    1980 188 "Corpus Christi, TX"                       271280
    1980 192 "Dallas-Fort Worth, TX"                   2975260
    1980 195 "Daville, VA"                              112160
    1980 196 "Davenport, IA"                            327680
    1980 200 "Dayton-Springfield, OH"                  1014740
    1980 202 "Daytona Beach, FL"                        258720
    1980 204 "Decatur, IL"                              130860
    1980 208 "Denver-Boulder, CO"                      1593660
    1980 212 "Des Moines, IA"                           303180
    1980 216 "Detroit"                                 4286580
    1980 224 "Duluth, MN"                               223280
    1980 229 "Eau Claire, WI"                           130960
    1980 231 "El Paso"                                  482640
    1980 232 "Elkhart-Goshen, IN"                       136940
    1980 236 "Erie, PA"                                 279720
    1980 240 "Eugene, OR"                               275060
    1980 256 "Fayetteville, NC"                         249620
    1980 258 "Fayetteville, AR"                          99660
    1980 264 "Flint, MI"                                451880
    1980 267 "Fort Collins, CO"                         149100
    1980 268 "Fort Lauderdale, FL"                     1018340
    1980 270 "Fort Myers-Cape Coral, FL"                206980
    1980 276 "Fort Wayne, IN"                           294260
    1980 284 "Fresno"                                   515540
    1980 290 "Gainesville, FL"                          151740
    1980 292 "Galveston, TX"                            196260
    1980 300 "Grand Rapids, MI"                         599780
    1980 306 "Greeley, CO"                              120940
    1980 308 "Green Bay, WI"                            176080
    1980 312 "Greensboro-Winston Salem-High Point, NC"  675060
    1980 316 "Greenville-Spartenburg-Andersen, SC"      624500
    1980 318 "Hagerston, MD"                            113300
    1980 320 "Hamilton, OH"                             258320
    1980 324 "Harrisburg, PA"                           446400
    1980 328 "Hartford, CT"                             654880
    1980 329 "Hickory, NC"                              105500
    1980 332 "Honolulu, HI"                             767760
    1980 336 "Houston, TX"                             2744040
    1980 348 "Indianapolis, IN"                        1167200
    1980 352 "Jackson, MI"                              150020
    1980 356 "Jackson, MS"                              252880
    1980 359 "Jacksonville, FL"                         689120
    1980 360 "Jacksonville, NC"                         113660
    1980 362 "Janesville, WI"                           138960
    1980 366 "Johnson City, TN"                         142940
    1980 368 "Johnstown, PA"                            184000
    1980 371 "Joplin, MO"                               127380
    1980 372 "Kalamazoo, MI"                            211620
    1980 376 "Kansas City"                             1330200
    1980 380 "Kenosha, WI"                              123100
    1980 381 "Kileen-Temple, TX"                        217580
    1980 384 "Knoxville, TN"                            475920
    1980 388 "Lafayette, LA"                            151680
    end
    and the second looks like
    Code:
    * Example generated by -dataex-. 
    input float year int metarea str39 metro double population
    1990   4 "Abilene, TX"                              120493
    1990   8 "Akron, OH"                                656736
    1990  16 "Albany, NY"                               778477
    1990  20 "Albuquerque, NM"                          479994
    1990  22 "Alexandria, LA"                           131619
    1990  24 "Allentown, PA"                            540986
    1990  28 "Altoona, PA"                              129438
    1990  32 "Amarillo, TX"                             185058
    1990  38 "Anchorage, AK"                            229284
    1990  44 "Ann Arbor, MI"                            285984
    1990  45 "Anniston, AL"                             116087
    1990  46 "Appleton, WI"                             315840
    1990  52 "Atlanta, GA"                             2423588
    1990  56 "Atlantic City, NJ"                        223047
    1990  60 "Augusta, GA"                              395821
    1990  64 "Austin, TX"                               716419
    1990  68 "Bakersfield, CA"                          541891
    1990  72 "Baltimore, MD"                           2344910
    1990  76 "Baton Rouge, LA"                          511194
    1990  84 "Beaumont, TX"                             360990
    1990  86 "Bellingham, WA"                           127473
    1990  87 "Benton Harbor, MI"                        162709
    1990  88 "Billings, MT"                             112309
    1990  92 "Biloxi-Gulfport, MS"                      198258
    1990  96 "Binghamton, NY"                           213173
    1990 100 "Birmingham, AL"                           802311
    1990 104 "Bloomington, IL"                          130194
    1990 108 "Boise, ID"                                204164
    1990 112 "Boston, MA"                              3322865
    1990 115 "Bremerton, WA"                            187875
    1990 116 "Bridgeport, CT"                           256380
    1990 124 "Brownsville, TX"                          260248
    1990 128 "Buffalo, NY"                             1185626
    1990 132 "Canton, OH"                               363917
    1990 136 "Cedar Rapids, IA"                         169360
    1990 140 "Champaign-Urbana-Rantoul, IL"             172344
    1990 144 "Charleston, SC"                           504729
    1990 152 "Charlotte, NC"                           1023861
    1990 156 "Chattanooga, TN"                          395661
    1990 160 "Chicago"                                 7872098
    1990 162 "Chico, CA"                                183044
    1990 164 "Cincinnati, OH"                          1269294
    1990 168 "Cleveland, OH"                           1746883
    1990 172 "Colorado Springs, CO"                     394239
    1990 174 "Columbia, MO"                             113830
    1990 176 "Columbia, SC"                             453195
    1990 184 "Columbus, OH"                            1326013
    1990 188 "Corpus Christi, TX"                       291267
    1990 192 "Dallas-Fort Worth, TX"                   3719803
    1990 195 "Daville, VA"                              105966
    1990 196 "Davenport, IA"                            299802
    1990 200 "Dayton-Springfield, OH"                   857036
    1990 202 "Daytona Beach, FL"                        370603
    1990 204 "Decatur, IL"                              118368
    1990 208 "Denver-Boulder, CO"                      1782405
    1990 212 "Des Moines, IA"                           322090
    1990 216 "Detroit"                                 4299540
    1990 224 "Duluth, MN"                               198048
    1990 229 "Eau Claire, WI"                           139108
    1990 231 "El Paso"                                  588226
    1990 232 "Elkhart-Goshen, IN"                       156866
    1990 236 "Erie, PA"                                 276466
    1990 240 "Eugene, OR"                               285194
    1990 256 "Fayetteville, NC"                         273828
    1990 258 "Fayetteville, AR"                         112299
    1990 264 "Flint, MI"                                430338
    1990 267 "Fort Collins, CO"                         185727
    1990 268 "Fort Lauderdale, FL"                     1255555
    1990 270 "Fort Myers-Cape Coral, FL"                335345
    1990 276 "Fort Wayne, IN"                           303625
    1990 284 "Fresno"                                   669453
    1990 290 "Gainesville, FL"                          180606
    1990 292 "Galveston, TX"                            217638
    1990 300 "Grand Rapids, MI"                         689548
    1990 306 "Greeley, CO"                              129531
    1990 308 "Green Bay, WI"                            196267
    1990 312 "Greensboro-Winston Salem-High Point, NC"  848427
    1990 316 "Greenville-Spartenburg-Andersen, SC"      694875
    1990 318 "Hagerston, MD"                            120535
    1990 320 "Hamilton, OH"                             292121
    1990 324 "Harrisburg, PA"                           585716
    1990 328 "Hartford, CT"                             604473
    1990 329 "Hickory, NC"                              118539
    1990 332 "Honolulu, HI"                             831642
    1990 336 "Houston, TX"                             3402893
    1990 348 "Indianapolis, IN"                        1250275
    1990 352 "Jackson, MI"                              151312
    1990 356 "Jackson, MS"                              252807
    1990 359 "Jacksonville, FL"                         670766
    1990 360 "Jacksonville, NC"                         148374
    1990 362 "Janesville, WI"                           141346
    1990 366 "Johnson City, TN"                         284994
    1990 368 "Johnstown, PA"                            239012
    1990 371 "Joplin, MO"                               133881
    1990 372 "Kalamazoo, MI"                            220183
    1990 376 "Kansas City"                             1475121
    1990 380 "Kenosha, WI"                              129567
    1990 381 "Kileen-Temple, TX"                        189601
    1990 384 "Knoxville, TN"                            601743
    1990 388 "Lafayette, LA"                            165423
    end
    I'm not sure of the reason for this. Thanks for any help on this!

  • #2
    The problem is that you are telling the merge command that observations must match on year, metarea, and metro. But in your master dataset the year is 1980 and in your using dataset the year is 1990. Nothing matches.

    You state that you want a panel. In Stata that would be two observations per metropolitan area, one from 1980 and one from 1990, and suggests using append rather than merge.

    If you really want one observation per metropolitan area, I would start with the append and then reshape wide to transform your data into what you want.

    For details about panel data in Stata, see the Stata Longitudinal-Data/Panel-Data Reference Manual PDF included in your Stata installation and accessible from Stata's Help menu.

    Comment


    • #3
      No, they don't match at all. The first data set contains only observations from 1980, and the second contains only observations from 1990. But your -merge- command wants the data sets to also agree on year.

      It's not clear to me what result you actually hope to get. Let's imagine for the moment that both data sets were 1980 data and the -merge- went through. The two data sets have different values for population. So do you want to keep the one from data set 1, or the one from data set 2? You don't say, and I don't get the sense that you have even thought about the issue.

      Which leads me to think that maybe what you really want to do is combine these so that for each metropolitan area you have two separate observations: one for 1980 and the other for 1980. In that case, it isn't a -merge- you need, it's an -append-.

      Added: Crossed with #2, which raises the same issues.

      Comment


      • #4
        Thank you William and Clyde for your responses.

        I realize now that I was foolishly trying to merge when I ought to have used append. And yes, what I wanted to do was combine the two so that I could have separate observations per each metropolitan area by time or in the sense of what I want the years 1980 and 1990.

        Comment

        Working...
        X