Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Combine values

    Hi,

    I am using Stata v.14. I hope someone can help me with this.

    The following is my data example.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str2 parent_id str3 subsidiary_id str2 country int(year sales) byte tag
    "AA" "AA1" "US" 2000 100 0
    "AA" "AA1" "US" 2001 110 0
    "AA" "AA1" "US" 2002 120 0
    "AA" "AA1" "US" 2003 130 0
    "BB" "BB1" "GB" 2000 200 1
    "BB" "BB1" "GB" 2001 210 1
    "BB" "BB1" "GB" 2002 220 1
    "BB" "BB1" "GB" 2003 230 1
    "BB" "BB2" "GB" 2000 240 1
    "BB" "BB2" "GB" 2001 250 1
    "BB" "BB2" "GB" 2002 260 1
    "BB" "BB2" "GB" 2003 270 1
    "CC" "CC1" "GB" 2000 300 0
    "CC" "CC1" "GB" 2001 310 1
    "CC" "CC1" "GB" 2002 320 1
    "CC" "CC1" "GB" 2003 330 1
    "CC" "CC2" "GB" 2001 340 1
    "CC" "CC2" "GB" 2002 350 1
    "CC" "CC2" "GB" 2003 360 1
    "DD" "DD1" "CA" 2000 400 2
    "DD" "DD1" "CA" 2001 410 2
    "DD" "DD1" "CA" 2002 420 2
    "DD" "DD1" "CA" 2003 430 2
    "DD" "DD2" "CA" 2000 440 2
    "DD" "DD2" "CA" 2001 450 2
    "DD" "DD2" "CA" 2002 460 2
    "DD" "DD2" "CA" 2003 470 2
    "DD" "DD3" "CA" 2000 480 2
    "DD" "DD3" "CA" 2001 490 2
    "DD" "DD3" "CA" 2002 500 2
    "DD" "DD3" "CA" 2003 510 2
    "EE" "EE1" "CA" 2000 600 1
    "EE" "EE1" "CA" 2001 610 2
    "EE" "EE1" "CA" 2002 620 2
    "EE" "EE1" "CA" 2003 630 0
    "EE" "EE2" "CA" 2001 640 2
    "EE" "EE2" "CA" 2002 650 2
    "EE" "EE3" "CA" 2000 660 1
    "EE" "EE3" "CA" 2001 670 2
    "EE" "EE3" "CA" 2002 680 2
    end
    ​​​​​​​
    • parent_id: parent company id
    • subsidiary_id: subsidiary company id
    • tag: it is generated by using "duplicates tag parent_id year, gen(tag)"
    "tag" 1 or 2 implies that more than one subsidiary company is under one parent company in a given year. If "tag" equals 1 or 2, I would like to add sales of these subsidiaries in the given year.

    What I need to create is the following:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str2(parent_id country) int(year sales)
    "AA" "US" 2000  100
    "AA" "US" 2001  110
    "AA" "US" 2002  120
    "AA" "US" 2003  130
    "BB" "GB" 2000  440
    "BB" "GB" 2001  460
    "BB" "GB" 2002  480
    "BB" "GB" 2003  500
    "CC" "GB" 2000  300
    "CC" "GB" 2001  650
    "CC" "GB" 2002  670
    "CC" "GB" 2003  690
    "DD" "CA" 2000 1320
    "DD" "CA" 2001 1350
    "DD" "CA" 2002 1380
    "DD" "CA" 2003 1410
    "EE" "CA" 2000 1260
    "EE" "CA" 2001 1920
    "EE" "CA" 2002 1950
    "EE" "CA" 2003  630
    end
    Thank you so much in advance.

  • #2
    In your example data, for any given parent company, all of the subsidiaries have the same value for country. The code below verifies that this is always true in the data set. (If it isn't, the code will break at the -assert- command. Also, if it isn't true, you need to specify how to decide which subsidiary's country should be used in the end result.)

    Code:
    by parent_id (country), sort: assert country[1] == country[_N]
    collapse (sum) sales (first) country, by(parent_id year)
    order country, after(parent_id)
    Note, by the way, that the variable tag is not needed for this.

    Comment


    • #3
      Hi Clyde,

      I had to fix my data because the code broke at the assert command. Then, it works perfectly. Thank you for your help.

      Thanks again.

      Comment

      Working...
      X