vlookup analogues in Stata

Marco Tacchi

Join Date: Feb 2019

Posts: 24
#1

vlookup analogues in Stata

07 Sep 2023, 09:48

Hello,

My raw data on subsidiaries and their parent companies is structured in two columns like this:

Code:

Company Parent Company1 Company2 Company1 Company3 Company2 Company3

As you could see, Company3 is actually the ultimate parent, while Company2 is an intermediate parent, with Company1 being the lowest in corporate hierarchy. This is unfortunately how my dataset was delivered to me.

What I ultimately would like to have is:

Code:

Company Parent UltimateParent Company1 Company2 Company3 Company1 Company3 Company3 Company2 Company3 Company3

So I could remove row 2 by eliminating duplicates later.

I know that this can be solved by manually removing row 2 and then using vlookup in Stata. However my raw dataset is not small. vlookup also does not work in the raw setup, giving the following error: "there are multiple observations with different Parent within Company."

Thank you for your help.
Tags: None
Mike Lacy

Join Date: Apr 2014

Posts: 2421
#2

07 Sep 2023, 10:57

While the user-written module -vlookup- might be helpful to you, I think the more important issue here is what the logical structure is that links companies and parents in your data set. I can't construct exactly what your rules are that define company-parent relationships from the limited example you show. I'd also want to know something about how "deep" this parent/company linkage can be. For example, I can imagine company10 being the parent of company9, which is the parent of company8, which is the parent of ..., company1. I'd presume you know what the limits are on this kind of thing.

So, if you want to get better help, from anyone here, I'd suggest that you present: 1) a more extensive data example (using the -dataex- command, as described in section 12.2 of the StataList FAQ) and 2) a rigorous description of what defines parent/company relationships here.

My guess is that commands such as -joinby- or -cross- or -merge- are likely to be important for your situation, but it's hard to tell for sure.
1 like
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2405
#3

07 Sep 2023, 11:25

I started writing a response and then abandoned it. Mike has summarized the issues quite nicely.

I would suggest that -merge- is the way to go here, but your nesting structure has issues. For however many levels of nesting you have, the relationship must always be 1:1, where one company is nested within a maximum of 1 company. In your example data, Company1 is nested under both Company2 and Company3. This type of crossed structure should be avoided and will be difficult to program around, otherwise.
Comment
Marco Tacchi

Join Date: Feb 2019

Posts: 24
#4

07 Sep 2023, 13:34

Dear Mike and Leonardo,

Thank you for your prompt replies. -merge- looks like a reasonable solution and I will try to find a way to work around it.
Comment

Announcement

vlookup analogues in Stata

Comment

Comment

Comment