  • Variables from using dataset missing post merge

    Dear STATAlist,

    I have a merge problem. My master dataset contains data on children, including the ID of the household in which they live (BSIntId). Some households contain more than one child. I have a using dataset that has a line for each household plus household level data - distance to clinic, etc.

    . merge m:1 BSIntId using "arealunits.dta"
    (note: variable BSIntId was int, now long to accommodate using data's values)

        Result                           # of obs.
        -----------------------------------------
        not matched                        15,366
            from master                         1  (_merge==1)
            from using                     15,365  (_merge==2)

        matched                             1,257  (_merge==3)
        -----------------------------------------

    This is what I expect to see - there are 1258 children and one child has missing BSIntId in the master dataset. However, when I look at the 'merged' dataset, it contains none of the variables from the using dataset. Any suggestions?

    Thanks,
    Tom

  • #2
    Could you clarify whether the problem is (a) that when you describe after the merge, the variables from the using dataset do not appear in the listing, or (b) that when you list after the merge, the variables from the using dataset appear but contain only missing values? Thanks!
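
    A minimal sketch of one way to tell cases (a) and (b) apart. It assumes the merge has just been run so that _merge is in memory, and distclinic is a hypothetical name standing in for any variable that should have come from arealunits.dta.

    ```stata
    * Run immediately after the merge, while _merge is still in memory.
    * distclinic is a hypothetical name; substitute a real variable from arealunits.dta.
    capture confirm variable distclinic
    if _rc {
        * case (a): the variable never arrived in the merged data
        display "distclinic is not in the data in memory"
    }
    else {
        * case (b): the variable exists; count matched rows where it is missing
        count if _merge == 3 & missing(distclinic)
    }
    ```

    If the count equals the number of matched observations, the variable is present but empty, which points to situation (b) rather than (a).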


    • #3
      Hi William,

      Thanks for your message. The variables are not there either way. I also can't see them in the variables panel on the right, nor when I browse and look directly at the dataset.

      Best wishes,
      Tom


      • #4
        That's very odd!

        First, are you sure those variables are there in the arealunits.dta file? Try opening that file and verify that the variables are actually present there. Maybe that data set isn't what you think it is.

        If they're really there in arealunits.dta and don't come in with the -merge-, then I would suspect that your Stata installation is somehow corrupted, or perhaps that the .ado files and executable are out of synch. First I would -update all, force- and re-try. If that doesn't solve the problem, I would uninstall Stata and re-install it, -update all- again, and then re-try. And if that doesn't solve it, I would contact Stata technical support, sending them a description of the problem, the code leading up to it, and the offending data sets.
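
        The first checks above might look something like the following sketch. It only inspects things and changes nothing; update query needs an internet connection.

        ```stata
        * List the variables stored in the using file without loading it
        describe using "arealunits.dta"

        * Show which merge.ado Stata is picking up, with its version stamp
        which merge

        * Report the installed executable and ado-file dates and whether updates exist
        update query
        ```

        If the executable and ado-file dates reported by update query look far apart, that would be consistent with the two being out of step.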


        • #5
          One possibility is that there are multiple versions of the using dataset, i.e. with the same name but in different directories. Try

          Code:
          * display the variables in the data in memory
          describe
          
          merge m:1 BSIntId using "arealunits.dta"
          
          describe using "arealunits.dta"
          
          * compare with the variables after the merge
          describe
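
          To rule out a same-named file in another directory, one option is to spell out the full path. This is a sketch that assumes the intended copy sits in the current working directory and that the master dataset has been freshly loaded (no _merge variable yet). The local macro usingfile is a name introduced here for illustration, and the lines should be run together, for example from a do-file.

          ```stata
          * Show the directory that a relative filename resolves against
          pwd

          * Build the full path that a bare "arealunits.dta" refers to
          local usingfile "`c(pwd)'/arealunits.dta"

          * Stop with an error if no such file exists at that path
          confirm file "`usingfile'"
          display "`usingfile'"

          * Merge with the explicit path so there is no doubt about which copy is read
          merge m:1 BSIntId using "`usingfile'"
          ```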


          • #6
            Hi Tom,

            I recently experienced some glitches with Stata related to merging new variables to a master data set based on a common ID. My version of the problem was that Stata would report perfect matching and the variables would be added but all as "missing".
            I tried multiple alternatives and as far as I remember one of the following two solutions worked for me.

            1) Adding the update and replace options to your merge command
            Code:
            merge m:1 BSIntId using "arealunits.dta", update replace
            2) The using and the master data sets that I was combining were located on a remote shared drive. I copied them to my desktop and successfully completed the merging.

            I hope one of these will work for you
            Patrick




            • #7
              Dear all,

              thanks for the advice. I have done the following...

              1. Last night I did 'update all, force' as suggested by Clyde. That hasn't immediately fixed the problem though I wonder if things have changed because...

              2. I did as Robert suggested and...

              - pre-merge I only saw the master data, as expected
              - post merge with 'describe using "arealunits.dta"' and 'describe', both list variables from the using dataset. However, these variables don't appear on the variable panel or when I use 'list', 'tab', etc.

              3. I did as Patrick suggested but it didn't fix the problem...

              . merge m:1 BSIntId using "arealunits.dta", update replace
              (note: variable BSIntId was int, now long to accommodate using data's values)

                  Result                           # of obs.
                  -----------------------------------------
                  not matched                        15,366
                      from master                         1  (_merge==1)
                      from using                     15,365  (_merge==2)

                  matched                             1,257
                      not updated                     1,257  (_merge==3)
                      missing updated                     0  (_merge==4)
                      nonmissing conflict                 0  (_merge==5)
                  -----------------------------------------

              Note I have the datasets on my laptop and am not accessing them via a remote server.

              Why might variables appear with 'describe' but not otherwise be present?

              Best wishes,
              Tom


              • #8
                Well, it appears that either your installation of Stata is corrupted, or the dataset arealunits.dta is corrupted, or you have uncovered some kind of bug in Stata.

                What happens in arealunits.dta by itself: -use arealunits, clear-? Do the variables all show up in the variables window? Do -list- and -tab- find them? -describe-? Do you have a friend or colleague with Stata who can try this on a different computer to see if the problem is reproducible in other installations?
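
                As a sketch, the checks on arealunits.dta by itself could be run like this. Note that clear discards whatever is currently in memory, so save the master data first if it has unsaved changes.

                ```stata
                * Caution: clear drops the data currently in memory
                use "arealunits.dta", clear

                * The household-level variables should all be listed here
                describe

                * For an m:1 merge, BSIntId must uniquely identify rows in this file
                isid BSIntId

                * Look at the first few rows of every variable
                list in 1/5
                ```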

                I would still try doing a clean re-install of Stata: uninstall it, re-install it, and then do -update all-. If that doesn't solve the problem, I think you have to take it up with technical support.


                • #9
                  One interesting thing is that the results of the merge ..., update replace in step 3 show that the master file was probably already the result of the merge in step 2: the line for _merge==3 shows that none of the matched data was updated, which is what you would see if the using data were identical to data already in the master file. This suggests that the merge in step 2 actually added the variables to the master and most likely populated them with the data from arealunits.dta.

                  I agree with Clyde that this problem seems likely to require a reinstallation of Stata. I'd be interested, though, in pushing back to first principles in the FAQ linked to at the top of this page. In particular, we really don't know anything about your installation or your data (FAQ sections 11 and 12). The following would be informative. (But from the output of about, all that matters is the first three lines - delete your license information for your privacy's sake.)

                  Code:
                  about
                  describe using arealunits.dta, short
                  describe using [your master file].dta, short

                  If you don't have large numbers of variables in either file, omitting the short option would give even more information.

                  And if while looking at the FAQ you could review the instructions for posting code and output in code blocks like the one above, that would be helpful.

                  Or just cross your fingers and re-install. Be sure you have your license information handy!
