Transpose dataset

Oystein Linnestad

Join Date: Oct 2016

Posts: 6
#1

Transpose dataset

28 May 2018, 03:03

Dear Statalist

I have two files I’d like to merge.

File 1 is in wide format containing accounting data for 300 000 companies (legal entities). Company identifier is variable ID_company, and 1 record = 1 company (no duplicates). The dataset does not contain the variable Employee. File 1 contains no information about subunits.

File 2 is in long format containing all subunits for the companies in file 1 and the number of employees. Identifier for subunits is ID_subunit and the number of employee for each subunits is in the variable Employee. File 2 contains no duplicates on subunits.

Example: File 2 in long format

ID_company ID_subunit Employee

998877665 766554433 5

998877665 655443322 12

998877665 544332211 23

The value of employees is recorded per subunit and not per company. This explains the data structure as mentioned.

The problem arises in merging these files. It is possible to merge these files, and keep the records from using dataset. But then more problems arises later in the analysis because we need to keep the information about number of employees per subunit (we later split income and costs to geographical location for each subunit and we need Employee to do this – it’s roughly the same procedure as in national accounts).

Does anyone have a solution in STATA as how to transpose File 2 from long to wide format with different number of subunits per company? With same data structure in File 1 and File 2, then it is easier to merge the files and use the variable Employee in the analysis that follows.

I appreciate any thoughts.

Example: File 2 in wide format

ID_company ID_subunit_1 Employee_1 ID_subunit_2 Employee_2 ID_subunit_3 Employee_3

998877665 766554433 5 655443322 12 544332211 23

Thanks,

Oystein Linnestad
Norway
Tags: None

Maarten Buis

Join Date: Mar 2014
Posts: 3447

28 May 2018, 03:46

Code:

. clear

. input long compid long unitid empl

           compid        unitid       empl
  1. 998877665       766554433       5
  2. 998877665       655443322       12
  3. 998877665       544332211       23
  4. 998877666       655443322       12
  5. 998877666       544332211       23
  6. end

.
. list, sepby(compid)

     +------------------------------+
     |    compid      unitid   empl |
     |------------------------------|
  1. | 998877665   766554433      5 |
  2. | 998877665   655443322     12 |
  3. | 998877665   544332211     23 |
     |------------------------------|
  4. | 998877666   655443322     12 |
  5. | 998877666   544332211     23 |
     +------------------------------+

.
. bys compid (unitid) : gen order = _n

. reshape wide unitid empl, i(compid) j(order)
(note: j = 1 2 3)

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                        5   ->       2
Number of variables                   4   ->       7
j variable (3 values)             order   ->   (dropped)
xij variables:
                                 unitid   ->   unitid1 unitid2 unitid3
                                   empl   ->   empl1 empl2 empl3
-----------------------------------------------------------------------------

.
. l

     +-----------------------------------------------------------------------+
     |    compid     unitid1   empl1     unitid2   empl2     unitid3   empl3 |
     |-----------------------------------------------------------------------|
  1. | 998877665   544332211      23   655443322      12   766554433       5 |
  2. | 998877666   544332211      23   655443322      12           .       . |
     +-----------------------------------------------------------------------+

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------

Announcement

Comment

ID_company	ID_subunit_1	Employee_1	ID_subunit_2	Employee_2	ID_subunit_3	Employee_3
998877665	766554433	5	655443322	12	544332211	23