Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • "Horizontal Collapse"

    Hey Statalisters,

    after some hours of searching i did not find anything online that could solve my problem.

    I am looking for some kind of "horizontal collapse".

    My data looks as follows. There are various identifying variables upfront, what follows are up to ten combinations per firm of a segment code with another corresponding variable (here sic and sales). The row reports these values for 2 firms (am and tm), i.e. there are up to 20 observations per row (10* (sics + corresponding sales) for 2 firms).
    ID am_sics1 am_sales1 am_sics2 am_sales2 tm_sics1 tm_sales1 tm_sics2 tm_sales2 ...
    1 11111 2323232 1213 12323 232323 232343 11111 434343 ...
    2 ... ... ... ... ... ... ... ... ...
    I would now like to collapse the data horizontally, so that if between the combinations any sic codes are duplicates, the sales will be summarized (added on top of each other) and only one sic code plus the added sales remain. Obviously all other unique sic codes and their respective sales number should be kept the way they are. For the "collapsed" variables and the unique ones new variable combinations should be created, that show the combined firm.

    The result should be something like this for the above stated example
    ID new_sics1 new_sales1 new_sics2 new_sales2 new_sics3 new_sales3 ...
    1 11111 2757575 1213 12323 232323 232343 ...
    2 ... ... ... ... ... ... ...

    I would be very thankful for any hints or tips!

    Best regards,
    Lennart

  • #2
    I don't know any short way of doing this with this wide layout except adding variables and then dropping what is now redundant. But you should not drop variables if you still want their values in some observations.

    A slightly deeper point is that this would be easier with a long layout, because then it would be a collapse.

    Comment


    • #3
      Your first solution is probably what im looking for. I still need the observation for the firms titled with "am_..." but then i also need the combination of the two as "new_...".

      How would i go about this?

      Thanks, Nick

      Comment


      • #4
        Nick's first solution may be what you're looking for. But his second solution is probably better. This is a no-brainer if you -reshape long-. And it will leave you with a long layout data set that will probably be more suitable for whatever subsequent analyses you want to do as well. Wide layout data is difficult to work with in Stata; only a few commands are optimized for working with it.

        If, in the end, you really need a wide-layout, you can always -reshape wide- after you've done everything else.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input byte ID int am_sics1 long am_sales1 int(am_sics2 am_sales2) long(tm_sics1 tm_sales1) int tm_sics2 long tm_sales2
        1 11111 2323232 1213 12323 232323 232343 11111 434343
        end
        
        //    RENAME VARIABLES TO FACILITATE RESHAPE
        rename am_* *_am
        rename tm_* *_tm
        
        //    GO LONG
        reshape long sics sales, i(ID) j(_j) string
        
        //    SEPARATE FIRM FROM 1,2,...
        split _j, parse("_") gen(a)
        rename a2 firm
        rename a1 seq
        
        //    SUM SALES OVER SIC
        collapse (sum) sales, by(ID sics)
        
        //    IF REALLY NEEDED, GO BACK TO WIDE
        by ID: gen _j = _n
        reshape wide sales sics, i(ID) j(_j)
        In addition, for the future, please always post example data using the -dataex- command, as I have done here. Importing the table you showed into Stata in order to try out and test the code took longer than actually developing the solution. Install -dataex- by running -ssc install dataex-. Then run -help datatex- to read the instructions for using it. When you use -dataex-, those who want to help you can import your example data into Stata with a simple copy/paste operation, and the result is a 100% faithful replica of what you are working with.

        Comment

        Working...
        X