Combining two datasets with non unique identifiers

Archita Sarmah

Join Date: May 2015
Posts: 6


09 Aug 2015, 06:31

Dear Members,

I have a problem combining two datasets that do not share a unique identifier. My datasets contain information on around 250 firms, their competitors, and the competitors’ competitors over a period of 34 years.

Dataset 1 has information on firm-competitor dyads, market and year. A simple representative form of the dataset is:

SampleFirm	Market1	Year	CompetitorFirm
Firm A	M1	t1	Firm C
Firm A	M1	t1	Firm D

Dataset 2 has information on competitor-competitor’s competitor dyads, market and year for the same time frame. A simple representative form of this dataset is:

CompetitorFirm	Market2	Year	Competitor’sCompetitorFirm
Firm C	M3	t1	Firm L
Firm C	M4	t1	Firm M
Firm D	M5	t1	Firm N
Firm D	M6	t1	Firm O

For analyses, I seek a dataset that links the Sample Firm to their Competitors and the Competitors’ Competitors. The final dataset would ideally be in the following form:

SampleFirm	Market1	Year	CompetitorFirm	Competitor’sCompetitorFirm	Market2
Firm A	M1	t1	Firm C	Firm L	M3
Firm A	M1	t1	Firm C	Firm M	M4
Firm A	M1	t1	Firm D	Firm N	M5
Firm A	M1	t1	Firm D	Firm O	M6

This would require merging Dataset 1 and Dataset 2. However, the issue is that there is no unique identifier linking the two datasets. I tried an m:m merge on CompetitorFirm and Year, but the matching was not correct. Could any of the members please suggest another way to obtain the output shown in the final dataset?

Thanks,
Archita

Last edited by Archita Sarmah; 09 Aug 2015, 06:45.


William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

09 Aug 2015, 07:39

I think that instead of merge m:m you want joinby CompetitorFirm Year. See help joinby for details, or the full documentation for merge and joinby in the Stata Data Management Reference Manual PDF, available from the PDF Documentation item of Stata's Help menu.
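To illustrate the difference, here is a minimal sketch on made-up data. FirmB is a hypothetical addition, not part of the sample in post #1, included so that FirmC appears more than once in dataset 1. Within each CFirm-Year group, merge m:m pairs observations by position (first with first, second with second), whereas joinby forms every pairwise combination.

```stata
* Hypothetical data: FirmB is added so that FirmC occurs twice in dataset 1
clear
input str8 (Firm  Market1 Year CFirm)
FirmA  M1  t1  FirmC
FirmB  M2  t1  FirmC
end
tempfile d1
save `d1'

clear
input str8 (CFirm Market2 Year CCFirm)
FirmC  M3  t1  FirmL
FirmC  M4  t1  FirmM
end
tempfile d2
save `d2'

* merge m:m matches row by row within each CFirm-Year group
use `d1', clear
merge m:m CFirm Year using `d2'
count
local n_merge = r(N)

* joinby forms all pairwise combinations within each CFirm-Year group
use `d1', clear
joinby CFirm Year using `d2'
count
local n_joinby = r(N)

display "merge m:m rows: `n_merge'   joinby rows: `n_joinby'"
```

With two observations per group on each side, merge m:m should leave two rows while joinby leaves four, which is the linkage the desired output in post #1 calls for.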
wbuchanan

Join Date: Mar 2014

Posts: 1362
#3

09 Aug 2015, 09:42

I'm not sure that joinby is the correct solution here. Should the competitors' competitors be joined only in cases where they are competing in the same market? What would be the reason to compare an outcome of firm A in market X at time T to the performance of firm B in market Y at time T? (I'm assuming that "markets" refers to, say, agribusiness vs. consumer technologies.) If the markets should match, you would likely want to add the market variable to your merge command. In either case, you will get the most help if you provide either an example that others can reproduce or the exact code you used and the exact response you received from Stata.
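If the same-market restriction were wanted, one way to sketch it is to give the market variable the same name in both datasets and add it to the key list. The data below are hypothetical: the variable name Market and the market values are assumptions chosen so that some rows match, and joinby is used here in place of merge.

```stata
* Hypothetical data: both files use a common variable named Market
clear
input str8 (Firm  Market Year CFirm)
FirmA  M1  t1  FirmC
FirmA  M1  t1  FirmD
end
tempfile d1
save `d1'

clear
input str8 (CFirm Market Year CCFirm)
FirmC  M1  t1  FirmL
FirmC  M4  t1  FirmM
FirmD  M1  t1  FirmN
FirmD  M6  t1  FirmO
end
tempfile d2
save `d2'

* Only pairs that agree on CFirm, Year, and Market are kept
use `d1', clear
joinby CFirm Year Market using `d2'
list, clean noobs
```

By default joinby drops observations without a match, so only the FirmC-FirmL and FirmD-FirmN rows should remain in this sketch.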

William Lisowski

Join Date: Dec 2014
Posts: 10150
#4

09 Aug 2015, 14:35

The following code applies joinby to the sample data provided in post #1 and produces the sample output displayed in post #1.

Code:

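* Dataset 1: sample firm-competitor dyads
clear
input str8 (Firm  Market1 Year CFirm)
FirmA  M1  t1  FirmC
FirmA  M1  t1  FirmD
end
tempfile d1
save `d1'

* Dataset 2: competitor-competitor's competitor dyads
clear
input str8 (CFirm Market2 Year CCFirm)
FirmC  M3  t1  FirmL
FirmC  M4  t1  FirmM
FirmD  M5  t1  FirmN
FirmD  M6  t1  FirmO
end
tempfile d2
save `d2'

* Form all pairwise combinations within each CFirm-Year group
use `d1', clear
joinby CFirm Year using `d2'
list, clean noobs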

Code:

     Firm   Market1   Year   CFirm   Market2   CCFirm  
    FirmA        M1     t1   FirmC        M3    FirmL  
    FirmA        M1     t1   FirmC        M4    FirmM  
    FirmA        M1     t1   FirmD        M5    FirmN  
    FirmA        M1     t1   FirmD        M6    FirmO
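As a rough check on a result like this, the number of rows joinby produces should equal the sum, over CFirm-Year groups present in both files, of the product of the group sizes. The sketch below recomputes that figure for the sample data; the names n1, n2, pairs, c2, and expected are illustrative choices, not anything from the original code.

```stata
* Rebuild the sample data from post #1
clear
input str8 (Firm  Market1 Year CFirm)
FirmA  M1  t1  FirmC
FirmA  M1  t1  FirmD
end
tempfile d1
save `d1'

clear
input str8 (CFirm Market2 Year CCFirm)
FirmC  M3  t1  FirmL
FirmC  M4  t1  FirmM
FirmD  M5  t1  FirmN
FirmD  M6  t1  FirmO
end
tempfile d2
save `d2'

* Group sizes in dataset 2
contract CFirm Year, freq(n2)
tempfile c2
save `c2'

* Group sizes in dataset 1, combined with those of dataset 2
use `d1', clear
contract CFirm Year, freq(n1)
merge 1:1 CFirm Year using `c2', keep(match) nogenerate

* Expected joinby rows = sum of n1*n2 over the matched groups
generate long pairs = n1 * n2
quietly summarize pairs
local expected = r(sum)

* Compare with the actual joinby result
use `d1', clear
joinby CFirm Year using `d2'
assert _N == `expected'
display "expected rows: `expected'   actual rows: " _N
```

For the sample data each of the two groups contributes 1 x 2 pairs, so both figures should be 4.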


Archita Sarmah

Join Date: May 2015

Posts: 6
#5

10 Aug 2015, 02:13

@William Lisowski: Thanks a lot for your help. I too ran the joinby command for a small chunk of my data. It is working well.
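Before scaling up from a small chunk, it may be worth checking whether any competitor has no entries in dataset 2, because joinby silently drops unmatched observations by default. The sketch below uses joinby's unmatched(master) option on hypothetical data, in which FirmE is a made-up competitor with no rows in dataset 2.

```stata
* Hypothetical data: FirmE has no competitors listed in dataset 2
clear
input str8 (Firm  Market1 Year CFirm)
FirmA  M1  t1  FirmC
FirmA  M1  t1  FirmE
end
tempfile d1
save `d1'

clear
input str8 (CFirm Market2 Year CCFirm)
FirmC  M3  t1  FirmL
FirmC  M4  t1  FirmM
end
tempfile d2
save `d2'

* Keep unmatched dataset 1 rows; _merge records where each row came from
use `d1', clear
joinby CFirm Year using `d2', unmatched(master)
tabulate _merge

* Rows with _merge == 1 are dyads that found no match in dataset 2
list if _merge == 1, clean noobs
```

Here the FirmA-FirmE dyad should survive with _merge equal to 1 and empty Market2 and CCFirm values, alongside the two matched FirmC rows.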

@WBuchanan: Thanks a lot for your comment. For the purposes of the theory that I seek to test, the competitors' competitors need not be joined only in those cases where they are competing in the same market. However, this is not the case for the competitors of the focal firm. They have to be present in the same market as the focal firm in a given year.

Best,
Archita