It is common to have panels with two or more group dimensions, for example companies and workers.
How can I efficiently reshape such data to wide format, e.g. one set of variables for all workers in a company? Ideally, I would put both identifiers in j(), but this is not allowed.
Below is the code I usually use, and I presume it is very inefficient: egen with group() and merge both take forever on large datasets. How can I accomplish this task more efficiently?
Code:
clear
input pid t eid
1 1 1
1 1 2
1 2 2
1 3 1
end
gegen i = group(pid t)
preserve
keep i eid
gduplicates drop
bysort i : gen j = _n
reshape wide eid, i(i) j(j)
save temp.dta, replace
restore
keep pid t i
gduplicates drop
merge 1:1 i using temp.dta
drop _merge i
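For comparison, here is a minimal sketch of a shorter route on the same example data. It relies on the fact that i() in reshape accepts a list of variables (only j() is limited to one), so the group identifier, the temporary file and the merge may not be needed. It assumes that only pid, t and eid are to be kept, as in the code above, and it has not been timed on large data.

```stata
clear
input pid t eid
1 1 1
1 1 2
1 2 2
1 3 1
end

* keep only the variables that enter the reshape (assumption: nothing else is needed)
keep pid t eid

* drop repeated pid-t-eid rows, the same job gduplicates drop does above
bysort pid t eid : keep if _n == 1

* number the workers within each pid-t cell, ordered by eid
bysort pid t (eid) : gen j = _n

* i() takes both identifiers directly; the result has eid1, eid2, ...
reshape wide eid, i(pid t) j(j)
```

The counter j is assigned in eid order within each pid-t cell here, which may differ from the order produced by the original code.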