Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • vlookup analogues in Stata

    Hello,

    My raw data on subsidiaries and their parent companies is structured in two columns like this:
    Code:
    Company    Parent
    Company1    Company2
    Company1    Company3
    Company2    Company3
    As you could see, Company3 is actually the ultimate parent, while Company2 is an intermediate parent, with Company1 being the lowest in corporate hierarchy. This is unfortunately how my dataset was delivered to me.

    What I ultimately would like to have is:
    Code:
    Company    Parent    UltimateParent
    Company1    Company2    Company3
    Company1    Company3    Company3
    Company2    Company3    Company3
    So I could remove row 2 by eliminating duplicates later.

    I know that this can be solved by manually removing row 2 and then using vlookup in Stata. However my raw dataset is not small. vlookup also does not work in the raw setup, giving the following error: "there are multiple observations with different Parent within Company."

    Thank you for your help.

  • #2
    While the user-written module -vlookup- might be helpful to you, I think the more important issue here is what the logical structure is that links companies and parents in your data set. I can't construct exactly what your rules are that define company-parent relationships from the limited example you show. I'd also want to know something about how "deep" this parent/company linkage can be. For example, I can imagine company10 being the parent of company9, which is the parent of company8, which is the parent of ..., company1. I'd presume you know what the limits are on this kind of thing.

    So, if you want to get better help, from anyone here, I'd suggest that you present: 1) a more extensive data example (using the -dataex- command, as described in section 12.2 of the StataList FAQ) and 2) a rigorous description of what defines parent/company relationships here.

    My guess is that commands such as -joinby- or -cross- or -merge- are likely to be important for your situation, but it's hard to tell for sure.

    Comment


    • #3
      I started writing a response and then abandoned it. Mike has summarized the issues quite nicely.

      I would suggest that -merge- is the way to go here, but your nesting structure has issues. For however many levels of nesting you have, the relationship must always be 1:1, where one company is nested within a maximum of 1 company. In your example data, Company1 is nested under both Company2 and Company3. This type of crossed structure should be avoided and will be difficult to program around, otherwise.

      Comment


      • #4
        Dear Mike and Leonardo,

        Thank you for your prompt replies. -merge- looks like a reasonable solution and I will try to find a way to work around it.

        Comment

        Working...
        X