Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Issues when merging two panel datasets with unequal number of rows

    Good evening to everyone,

    This is my first post on this forum, so I am looking forward to collaborate with the Stata community.

    I am currently encountering a slight problem; I would like to merge two panel data sets, whose structure is the following:

    Master dataset:
    Id t X1
    1 2000 8
    2 2000 2543
    3 2001 24325468
    3 2002 34582467
    3 2003 423568
    8 2005 585858
    10 2010 634648
    and User Dataset:
    Id t X0
    1 2008 1313
    2 2000 5869
    3 2010 89677
    8 2001 8754657
    8 2005 5858
    18 2010 589697979
    Note that some observations (characterized by a couple (Id;t)) of the Master dataset do not appear in the User dataset, and vice versa. My aim is to get Stata to merge the two tables such that every observation, both from the User dataset and from the Master dataset appear in the final table, with the relevant values for x0 and x1 (and a dot if no value is announced for either x0 or x1).

    I have tried to follow the procedure given in the Stata documentation, but the option "merge 1:1 id t using User dataset" does not provide the right output, since the values for the column x_0 are misplaced; I have tried to reiterate the procedure with some small tables such as those provided here, and it does work; however, when trying to achieve the same result with my true datasets, which have much more rows, I do not seem to be able to achieve the intended table.

    Would someone be so kind as to help me with this issue?

    Many thanks in advance

    GD

  • #2
    the values for the column x_0 are misplaced;
    I don't understand what that means. Can you show an example of the problematic outcome, and also explain (or show) what you want it to look like?

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment


    • #3
      Thank you for your prompt answer, and for the suggestion to use dataex.

      My master dataset resembles the following one:
      ----------------------- copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long Codefirm int(year sic) str2 CURRENCY long Cashandshortterminvestment
      1 1979 7999 "E"     .
      1 1980 7999 "E"     .
      1 1981 7999 "E"     .
      1 1982 7999 "E"     .
      1 1983 7999 "E"     .
      1 1984 7999 "E"     .
      1 1985 7999 "E"     .
      1 1986 7999 "E"     .
      1 1987 7999 "E"     .
      1 1988 7999 "E"     .
      1 1989 7999 "E"     .
      1 1990 7999 "E"     .
      1 1991 7999 "E"     .
      1 1992 7999 "E"     .
      1 1993 7999 "E"     .
      1 1994 7999 "E"     .
      1 1995 7999 "E"     .
      1 1996 7999 "E" 18482
      1 1997 7999 "E"  7532
      1 1998 7999 "E" 15697
      1 1999 7999 "E" 24095
      1 2000 7999 "E" 17879
      1 2001 7999 "E"   368
      1 2002 7999 "E"     .
      1 2003 7999 "E"     .
      1 2004 7999 "E"     .
      1 2005 7999 "E"     .
      1 2006 7999 "E"     .
      1 2007 7999 "E"     .
      1 2008 7999 "E"     .
      1 2009 7999 "E"     .
      1 2010 7999 "E"     .
      1 2011 7999 "E"     .
      1 2012 7999 "E"     .
      1 2013 7999 "E"     .
      1 2014 7999 "E"     .
      1 2015 7999 "E"     .
      1 2016 7999 "E"     .
      1 2017 7999 "E"     .
      1 2018 7999 "E"     .
      1 2019 7999 "E"     .
      2 1979 5045 "E"     .
      2 1980 5045 "E"     .
      2 1981 5045 "E"     .
      2 1982 5045 "E"     .
      2 1983 5045 "E"     .
      2 1984 5045 "E"     .
      2 1985 5045 "E"     .
      2 1986 5045 "E"     .
      2 1987 5045 "E"   215
      2 1988 5045 "E"  2276
      2 1989 5045 "E"  1856
      2 1990 5045 "E"  4571
      2 1991 5045 "E"  5079
      2 1992 5045 "E"  4821
      2 1993 5045 "E"  1508
      2 1994 5045 "E"   991
      2 1995 5045 "E" 22684
      2 1996 5045 "E" 18797
      2 1997 5045 "E" 23934
      2 1998 5045 "E"     .
      2 1999 5045 "E"     .
      2 2000 5045 "E"     .
      2 2001 5045 "E"     .
      2 2002 5045 "E"     .
      2 2003 5045 "E"     .
      2 2004 5045 "E"     .
      2 2005 5045 "E"     .
      2 2006 5045 "E"     .
      2 2007 5045 "E"     .
      2 2008 5045 "E"     .
      2 2009 5045 "E"     .
      2 2010 5045 "E"     .
      2 2011 5045 "E"     .
      2 2012 5045 "E"     .
      2 2013 5045 "E"     .
      2 2014 5045 "E"     .
      2 2015 5045 "E"     .
      2 2016 5045 "E"     .
      2 2017 5045 "E"     .
      2 2018 5045 "E"     .
      2 2019 5045 "E"     .
      3 1979 4812 "E"     .
      3 1980 4812 "E"     .
      3 1981 4812 "E"     .
      3 1982 4812 "E"     .
      3 1983 4812 "E"     .
      3 1984 4812 "E"     .
      3 1985 4812 "E"     .
      3 1986 4812 "E"     .
      3 1987 4812 "E"     .
      3 1988 4812 "E"     .
      3 1989 4812 "E"     .
      3 1990 4812 "E"     .
      3 1991 4812 "E"     .
      3 1992 4812 "E"     .
      3 1993 4812 "E"     .
      3 1994 4812 "E"     .
      3 1995 4812 "E"     .
      3 1996 4812 "E"     .
      end
      label values Codefirm Codefirm
      label def Codefirm 1 "142585", modify
      label def Codefirm 2 "143217", modify
      label def Codefirm 3 "258493", modify





      My "using dataset" goes something like that:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int(Codefirm year) long Cash
      1 1979        .
      1 1980        .
      1 1981        .
      1 1982        .
      1 1983        .
      1 1984        .
      1 1985        .
      1 1986        .
      1 1987      200
      1 1988     2276
      1 1989     1856
      1 1990     4571
      1 1991     5079
      1 1992     4821
      1 1993     1508
      1 1994      991
      1 1995    22684
      1 1996    18797
      1 1997    23934
      1 1998        .
      1 1999        .
      1 2000        .
      1 2001        .
      1 2002        .
      1 2003        .
      1 2004        .
      1 2005        .
      1 2006        .
      1 2007        .
      1 2008        .
      1 2009        .
      1 2010        .
      1 2011        .
      1 2012        .
      1 2013        .
      1 2014        .
      1 2015        .
      1 2016        .
      1 2017        .
      1 2018        .
      1 2019        .
      2 1979        .
      2 1980        .
      2 1981        .
      2 1982        .
      2 1983        .
      2 1984        .
      2 1985        .
      2 1986        .
      2 1987        .
      2 1988        .
      2 1989        .
      2 1990        .
      2 1991        .
      2 1992        .
      2 1993        .
      2 1994        .
      2 1995        .
      2 1996        .
      2 1997        .
      2 1998        .
      2 1999   194000
      2 2000 11612000
      2 2001   754000
      2 2002   828000
      2 2003        .
      2 2004        .
      2 2005        .
      2 2006        .
      2 2007        .
      2 2008        .
      2 2009        .
      2 2010        .
      2 2011        .
      2 2012        .
      2 2013        .
      2 2014        .
      2 2015        .
      2 2016        .
      2 2017        .
      2 2018        .
      2 2019        .
      3 1979        .
      3 1980        .
      3 1981        .
      3 1982        .
      3 1983        .
      3 1984        .
      3 1985        .
      3 1986        .
      3 1987        .
      3 1988        .
      3 1989        .
      3 1990        .
      3 1991        .
      3 1992        .
      3 1993        .
      3 1994        .
      3 1995        .
      3 1996        .
      end
      label values Codefirm Codefirm
      label def Codefirm 1 "143217", modify
      label def Codefirm 2 "258493", modify
      label def Codefirm 3 "259361", modify



      My purpose is to merge them such that all the observations identified by a couple (Codefirm, year) appear in a single dataset. Therefore, I have read the Stata documentation and found that "merge" could be a solution. I tried the following code:

      Code:
       merge 1:1 Codefirm year using "Using dataset.dta"
      However, this does not work since I instead obtain the following dataset, which is not what I want:


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long Codefirm int year long(Cashandshortterminvestment Cash) byte _merge
      1 1979     .        . 3
      1 1980     .        . 3
      1 1981     .        . 3
      1 1982     .        . 3
      1 1983     .        . 3
      1 1984     .        . 3
      1 1985     .        . 3
      1 1986     .        . 3
      1 1987     .      200 3
      1 1988     .     2276 3
      1 1989     .     1856 3
      1 1990     .     4571 3
      1 1991     .     5079 3
      1 1992     .     4821 3
      1 1993     .     1508 3
      1 1994     .      991 3
      1 1995     .    22684 3
      1 1996 18482    18797 3
      1 1997  7532    23934 3
      1 1998 15697        . 3
      1 1999 24095        . 3
      1 2000 17879        . 3
      1 2001   368        . 3
      1 2002     .        . 3
      1 2003     .        . 3
      1 2004     .        . 3
      1 2005     .        . 3
      1 2006     .        . 3
      1 2007     .        . 3
      1 2008     .        . 3
      1 2009     .        . 3
      1 2010     .        . 3
      1 2011     .        . 3
      1 2012     .        . 3
      1 2013     .        . 3
      1 2014     .        . 3
      1 2015     .        . 3
      1 2016     .        . 3
      1 2017     .        . 3
      1 2018     .        . 3
      1 2019     .        . 3
      2 1979     .        . 3
      2 1980     .        . 3
      2 1981     .        . 3
      2 1982     .        . 3
      2 1983     .        . 3
      2 1984     .        . 3
      2 1985     .        . 3
      2 1986     .        . 3
      2 1987   215        . 3
      2 1988  2276        . 3
      2 1989  1856        . 3
      2 1990  4571        . 3
      2 1991  5079        . 3
      2 1992  4821        . 3
      2 1993  1508        . 3
      2 1994   991        . 3
      2 1995 22684        . 3
      2 1996 18797        . 3
      2 1997 23934        . 3
      2 1998     .        . 3
      2 1999     .   194000 3
      2 2000     . 11612000 3
      2 2001     .   754000 3
      2 2002     .   828000 3
      2 2003     .        . 3
      2 2004     .        . 3
      2 2005     .        . 3
      2 2006     .        . 3
      2 2007     .        . 3
      2 2008     .        . 3
      2 2009     .        . 3
      2 2010     .        . 3
      2 2011     .        . 3
      2 2012     .        . 3
      2 2013     .        . 3
      2 2014     .        . 3
      2 2015     .        . 3
      2 2016     .        . 3
      2 2017     .        . 3
      2 2018     .        . 3
      2 2019     .        . 3
      3 1979     .        . 3
      3 1980     .        . 3
      3 1981     .        . 3
      3 1982     .        . 3
      3 1983     .        . 3
      3 1984     .        . 3
      3 1985     .        . 3
      3 1986     .        . 3
      3 1987     .        . 3
      3 1988     .        . 3
      3 1989     .        . 3
      3 1990     .        . 3
      3 1991     .        . 3
      3 1992     .        . 3
      3 1993     .        . 3
      3 1994     .        . 3
      3 1995     .        . 3
      3 1996     .        . 3
      end
      label values Codefirm Codefirm
      label def Codefirm 1 "142585", modify
      label def Codefirm 2 "143217", modify
      label def Codefirm 3 "258493", modify
      label values _merge _merge
      label def _merge 3 "matched (3)", modify
      I do not see what the problem might be. What is more, I have tried to run the same code with simpler datasets, and it worked. But as soon as I try the same procedure as the one in this post, something goes wrong, while I am not able to identify what.

      Thank you in advance for your help

      GD
      Last edited by Guillaume Dreyer; 11 Dec 2022, 14:45.

      Comment


      • #4
        Well, yes, that is what -merge- gives you. And it is probably what people typically have in mind when they think of combining two data sets like this.

        But you still haven't explained what result you want. Until you explain what you do want, nobody will be able to show you how to get it.
        Last edited by Clyde Schechter; 11 Dec 2022, 15:19.

        Comment


        • #5
          In fact, the result is not what one might expect, since the idea is to add a variable for every observation. In the example above, for the observation (143125; 1987), the result shoul be 215 in the column "Cashandshortterminvestment" and 200 in the colum "Cash", since those results are those that we find respectively in Master dataset.dta and in Using dataset.dta. However, after having used the -merge- code, we find the correct result for "Cashandshortterminvestment", but not for "Cash". In other words, what is expected is:

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input long Codefirm int(year sic) str1 CURRENCY int(Cashandshortterminvestment Cash)
          142585 1979 7999 "E"     .     .
          142585 1980 7999 "E"     .     .
          142585 1981 7999 "E"     .     .
          142585 1982 7999 "E"     .     .
          142585 1983 7999 "E"     .     .
          142585 1984 7999 "E"     .     .
          142585 1985 7999 "E"     .     .
          142585 1986 7999 "E"     .     .
          142585 1987 7999 "E"     .     .
          142585 1988 7999 "E"     .     .
          142585 1989 7999 "E"     .     .
          142585 1990 7999 "E"     .     .
          142585 1991 7999 "E"     .     .
          142585 1992 7999 "E"     .     .
          142585 1993 7999 "E"     .     .
          142585 1994 7999 "E"     .     .
          142585 1995 7999 "E"     .     .
          142585 1996 7999 "E" 18482     .
          142585 1997 7999 "E"  7532     .
          142585 1998 7999 "E" 15697     .
          142585 1999 7999 "E" 24095     .
          142585 2000 7999 "E" 17879     .
          142585 2001 7999 "E"   368     .
          142585 2002 7999 "E"     .     .
          142585 2003 7999 "E"     .     .
          142585 2004 7999 "E"     .     .
          142585 2005 7999 "E"     .     .
          142585 2006 7999 "E"     .     .
          142585 2007 7999 "E"     .     .
          142585 2008 7999 "E"     .     .
          142585 2009 7999 "E"     .     .
          142585 2010 7999 "E"     .     .
          142585 2011 7999 "E"     .     .
          142585 2012 7999 "E"     .     .
          142585 2013 7999 "E"     .     .
          142585 2014 7999 "E"     .     .
          142585 2015 7999 "E"     .     .
          142585 2016 7999 "E"     .     .
          142585 2017 7999 "E"     .     .
          142585 2018 7999 "E"     .     .
          142585 2019 7999 "E"     .     .
          143217 1979 5045 "E"     .     .
          143217 1980 5045 "E"     .     .
          143217 1981 5045 "E"     .     .
          143217 1982 5045 "E"     .     .
          143217 1983 5045 "E"     .     .
          143217 1984 5045 "E"     .     .
          143217 1985 5045 "E"     .     .
          143217 1986 5045 "E"     .     .
          143217 1987 5045 "E"   215   200
          143217 1988 5045 "E"  2276  2276
          143217 1989 5045 "E"  1856  1856
          143217 1990 5045 "E"  4571  4571
          143217 1991 5045 "E"  5079  5079
          143217 1992 5045 "E"  4821  4821
          143217 1993 5045 "E"  1508  1508
          143217 1994 5045 "E"   991   991
          143217 1995 5045 "E" 22684 22684
          143217 1996 5045 "E" 18797 18797
          143217 1997 5045 "E" 23934 23934
          143217 1998 5045 "E"     .     .
          143217 1999 5045 "E"     .     .
          143217 2000 5045 "E"     .     .
          143217 2001 5045 "E"     .     .
          143217 2002 5045 "E"     .     .
          143217 2003 5045 "E"     .     .
          143217 2004 5045 "E"     .     .
          143217 2005 5045 "E"     .     .
          143217 2006 5045 "E"     .     .
          143217 2007 5045 "E"     .     .
          143217 2008 5045 "E"     .     .
          143217 2009 5045 "E"     .     .
          143217 2010 5045 "E"     .     .
          143217 2011 5045 "E"     .     .
          143217 2012 5045 "E"     .     .
          143217 2013 5045 "E"     .     .
          143217 2014 5045 "E"     .     .
          143217 2015 5045 "E"     .     .
          143217 2016 5045 "E"     .     .
          143217 2017 5045 "E"     .     .
          143217 2018 5045 "E"     .     .
          143217 2019 5045 "E"     .     .
          258493 1979 4812 "E"     .     .
          258493 1980 4812 "E"     .     .
          258493 1981 4812 "E"     .     .
          258493 1982 4812 "E"     .     .
          258493 1983 4812 "E"     .     .
          258493 1984 4812 "E"     .     .
          258493 1985 4812 "E"     .     .
          258493 1986 4812 "E"     .     .
          258493 1987 4812 "E"     .     .
          258493 1988 4812 "E"     .     .
          258493 1989 4812 "E"     .     .
          258493 1990 4812 "E"     .     .
          258493 1991 4812 "E"     .     .
          258493 1992 4812 "E"     .     .
          258493 1993 4812 "E"     .     .
          258493 1994 4812 "E"     .     .
          258493 1995 4812 "E"     .     .
          258493 1996 4812 "E"     .     .
          end
          ,

          and not what we actually get.

          In short, I want to add the variable "Cash" to the observations of my Master dataset while also keeping the observations from my "Using dataset".

          I am not sure why -merge- does not provide the expected result; it seems that the variables in the second column ("Cash") are randomly distributed to the observations. For instance, for the observation (Codefirm=1142585;year=1987), the variable "Cash" takes the value 200. However, in the "Using dataset", you can see that this exact same observation takes the value 0 for the variable "Cash". I do not understand why this is so.

          Last edited by Guillaume Dreyer; 11 Dec 2022, 15:46. Reason: Precision about the question

          Comment


          • #6
            Aha, I see the problem. These data sets are not compatible for merging in their current form. The variable Codefirm is set up differently in the two data sets. In fact, Codefirm never takes the value 143127 in either the master or the using data sets. The values of Codefirm in those data sets are numbers like 1, 2, 3. They look like 143217 or similar numbers when you list or browse the data sets because they have value labels attached to them. But the value labels in the two data sets are different! 143217 is the label for Codefirm 1 in the master data set and for Codefirm 2 in the using data set. When Stata does the -merge- it does not pay attention to the labels--those are cosmetic only. Rather it matches codefirm 1 with codefirm 1, and codefirm 2 with codefirm 2, etc. Because the labels are different, this results in mismatches when you are expecting them to match based on the labels.

            It was probably a bad idea for whoever create these data sets to structure the Codefirm variable in this way. I'm guessing it was done with -encode-. And -encode-ing two different data sets with the "same" variable is always hazardous, because when -encode- works on one data set, it has no way to know about the values of the "same" variable in any other data set. So mismatches of this type occur.

            Anyway, the solution is to undo the encoding and go back to a variable that reflects the labels instead of the underlying codes.:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input long Codefirm int(year sic) str2 CURRENCY long Cashandshortterminvestment
            1 1979 7999 "E"     .
            1 1980 7999 "E"     .
            1 1981 7999 "E"     .
            1 1982 7999 "E"     .
            1 1983 7999 "E"     .
            1 1984 7999 "E"     .
            1 1985 7999 "E"     .
            1 1986 7999 "E"     .
            1 1987 7999 "E"     .
            1 1988 7999 "E"     .
            1 1989 7999 "E"     .
            1 1990 7999 "E"     .
            1 1991 7999 "E"     .
            1 1992 7999 "E"     .
            1 1993 7999 "E"     .
            1 1994 7999 "E"     .
            1 1995 7999 "E"     .
            1 1996 7999 "E" 18482
            1 1997 7999 "E"  7532
            1 1998 7999 "E" 15697
            1 1999 7999 "E" 24095
            1 2000 7999 "E" 17879
            1 2001 7999 "E"   368
            1 2002 7999 "E"     .
            1 2003 7999 "E"     .
            1 2004 7999 "E"     .
            1 2005 7999 "E"     .
            1 2006 7999 "E"     .
            1 2007 7999 "E"     .
            1 2008 7999 "E"     .
            1 2009 7999 "E"     .
            1 2010 7999 "E"     .
            1 2011 7999 "E"     .
            1 2012 7999 "E"     .
            1 2013 7999 "E"     .
            1 2014 7999 "E"     .
            1 2015 7999 "E"     .
            1 2016 7999 "E"     .
            1 2017 7999 "E"     .
            1 2018 7999 "E"     .
            1 2019 7999 "E"     .
            2 1979 5045 "E"     .
            2 1980 5045 "E"     .
            2 1981 5045 "E"     .
            2 1982 5045 "E"     .
            2 1983 5045 "E"     .
            2 1984 5045 "E"     .
            2 1985 5045 "E"     .
            2 1986 5045 "E"     .
            2 1987 5045 "E"   215
            2 1988 5045 "E"  2276
            2 1989 5045 "E"  1856
            2 1990 5045 "E"  4571
            2 1991 5045 "E"  5079
            2 1992 5045 "E"  4821
            2 1993 5045 "E"  1508
            2 1994 5045 "E"   991
            2 1995 5045 "E" 22684
            2 1996 5045 "E" 18797
            2 1997 5045 "E" 23934
            2 1998 5045 "E"     .
            2 1999 5045 "E"     .
            2 2000 5045 "E"     .
            2 2001 5045 "E"     .
            2 2002 5045 "E"     .
            2 2003 5045 "E"     .
            2 2004 5045 "E"     .
            2 2005 5045 "E"     .
            2 2006 5045 "E"     .
            2 2007 5045 "E"     .
            2 2008 5045 "E"     .
            2 2009 5045 "E"     .
            2 2010 5045 "E"     .
            2 2011 5045 "E"     .
            2 2012 5045 "E"     .
            2 2013 5045 "E"     .
            2 2014 5045 "E"     .
            2 2015 5045 "E"     .
            2 2016 5045 "E"     .
            2 2017 5045 "E"     .
            2 2018 5045 "E"     .
            2 2019 5045 "E"     .
            3 1979 4812 "E"     .
            3 1980 4812 "E"     .
            3 1981 4812 "E"     .
            3 1982 4812 "E"     .
            3 1983 4812 "E"     .
            3 1984 4812 "E"     .
            3 1985 4812 "E"     .
            3 1986 4812 "E"     .
            3 1987 4812 "E"     .
            3 1988 4812 "E"     .
            3 1989 4812 "E"     .
            3 1990 4812 "E"     .
            3 1991 4812 "E"     .
            3 1992 4812 "E"     .
            3 1993 4812 "E"     .
            3 1994 4812 "E"     .
            3 1995 4812 "E"     .
            3 1996 4812 "E"     .
            end
            label values Codefirm Codefirm
            label def Codefirm 1 "142585", modify
            label def Codefirm 2 "143217", modify
            label def Codefirm 3 "258493", modify
            tempfile master
            save `master'
            
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(Codefirm year) long Cash
            1 1979        .
            1 1980        .
            1 1981        .
            1 1982        .
            1 1983        .
            1 1984        .
            1 1985        .
            1 1986        .
            1 1987      200
            1 1988     2276
            1 1989     1856
            1 1990     4571
            1 1991     5079
            1 1992     4821
            1 1993     1508
            1 1994      991
            1 1995    22684
            1 1996    18797
            1 1997    23934
            1 1998        .
            1 1999        .
            1 2000        .
            1 2001        .
            1 2002        .
            1 2003        .
            1 2004        .
            1 2005        .
            1 2006        .
            1 2007        .
            1 2008        .
            1 2009        .
            1 2010        .
            1 2011        .
            1 2012        .
            1 2013        .
            1 2014        .
            1 2015        .
            1 2016        .
            1 2017        .
            1 2018        .
            1 2019        .
            2 1979        .
            2 1980        .
            2 1981        .
            2 1982        .
            2 1983        .
            2 1984        .
            2 1985        .
            2 1986        .
            2 1987        .
            2 1988        .
            2 1989        .
            2 1990        .
            2 1991        .
            2 1992        .
            2 1993        .
            2 1994        .
            2 1995        .
            2 1996        .
            2 1997        .
            2 1998        .
            2 1999   194000
            2 2000 11612000
            2 2001   754000
            2 2002   828000
            2 2003        .
            2 2004        .
            2 2005        .
            2 2006        .
            2 2007        .
            2 2008        .
            2 2009        .
            2 2010        .
            2 2011        .
            2 2012        .
            2 2013        .
            2 2014        .
            2 2015        .
            2 2016        .
            2 2017        .
            2 2018        .
            2 2019        .
            3 1979        .
            3 1980        .
            3 1981        .
            3 1982        .
            3 1983        .
            3 1984        .
            3 1985        .
            3 1986        .
            3 1987        .
            3 1988        .
            3 1989        .
            3 1990        .
            3 1991        .
            3 1992        .
            3 1993        .
            3 1994        .
            3 1995        .
            3 1996        .
            end
            label values Codefirm Codefirm
            label def Codefirm 1 "143217", modify
            label def Codefirm 2 "258493", modify
            label def Codefirm 3 "259361", modify
            tempfile using
            save `using'
            
            //  UNLABEL THE CODEFIRM VARIABLE SO THAT MATCHING BASED ON WHAT IS CURRENTLY THEIR
            //  LABELS IS POSSIBLE
            use `using', clear
            decode Codefirm, gen(codefirm)
            drop Codefirm
            tempfile holding
            save `holding'
            
            use `master', clear
            decode Codefirm, gen(codefirm)
            drop Codefirm
            
            //  NOW MERGE THEM
            merge 1:1 codefirm year using `holding'
            This now produces the kind of results you were expecting. The string variable codefirm replaces the original Codefirm variables which were conflictingly labeled. If you need a numeric Codefirm variable (and if there are no other data sets you plan to combine with this one down the line) it is now safe to -encode codefirm, gen(Codefirm)- for that purpose.

            To avoid repetitions of this problem in the future I recommend you get the -precombine- program, by Mark Chatfield, available at SSC. Use it before combining data sets (whether by -merge-, -append-, or -joinby-) and it will alert you of this kind, and other kinds, of incompatability between them.

            For primary prevention of this kind of problem, avoid -encode-ing "the same" variable in two different data sets. Only -encode- after the data sets have been combined.
            Last edited by Clyde Schechter; 11 Dec 2022, 16:01.

            Comment


            • #7
              Many thanks for your kind explanations and please have a nice day

              GD

              Comment

              Working...
              X