Reshaping long

Federico Nutarelli

Join Date: Sep 2018

Posts: 430
#1

30 Oct 2018, 03:37

Hi all!

I have a problem reshaping a dataset to long form. The data look like this, where _03, _06, _09 and _12 denote the quarters of a year:
idproduct Other_string_variables sales_03_2004 sales_06_2004 ... sales_12_2015
1 "NAMES" 89 90 12

My aim is to reshape long and obtain something like this:

idproduct quarters sales
1 sales_03_2004 89
1 sales_06_2004 90
1 sales_09_2004 112
1 ....
1 ....
1 sales_12_2015 12

Any idea?

P.S. I have renamed all the variables salesq12004, salesq22004, ... and then used the code:

Code:

reshape long sales, i(idproduct) j(quarter) string
gen quarter = quarterly(quarter, "QY")
format quarter %tq
gen Year = year(dofq(quarter))
list

But it displays the following error:

variable id does not uniquely identify the observations
Your data are currently wide. You are performing a
reshape long. You specified i(idproduct) and
j(quarter). In the current wide form, variable
idproduct should uniquely identify the observations.
Remember this picture:

Many thanks!

Federico

Last edited by Federico Nutarelli; 30 Oct 2018, 04:28.
Federico Nutarelli

Join Date: Sep 2018

Posts: 430
#2

30 Oct 2018, 04:49

EDIT: the latter problem may be due to the fact that idproduct is obtained by grouping on the name of the product, but I have no clue how to go on from here.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

30 Oct 2018, 07:55

The problem you are encountering is that the reshape long command wants to be able to identify the original "wide observation" from which each "long observation" came, so that a subsequent reshape wide could return your "long layout" data to its original "wide layout". The error message tells you that in your wide layout you have products that have more than one observation, so that idproduct does not uniquely identify wide observations, and it would not be possible to put the long observations back together into wide observations.
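The duplication described above can be checked before attempting the reshape. The following is only a minimal sketch on made-up toy data (the values and the variable dup are assumptions for illustration, not taken from the original dataset); it uses duplicates and isid to show which values of idproduct occur on more than one wide observation.

```stata
* Toy wide data, invented for illustration: product 1 appears on two rows
clear
input idproduct sales_03_2004 sales_06_2004
1 89 90
1 70 75
2 10 12
end

* Summarise how many rows share each value of idproduct
duplicates report idproduct

* Flag the offending rows: dup is 0 for unique rows, positive otherwise
duplicates tag idproduct, generate(dup)
list if dup > 0

* isid exits with an error when idproduct is not a unique key;
* capture lets the do-file continue and leaves the return code in _rc
capture isid idproduct
display _rc
```

A nonzero return code from isid means reshape long with i(idproduct) will fail in the same way.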

There are two approaches to solving this problem.

If perhaps there is another variable that you haven't told us about - say, the store in which the product is sold - which when taken together with the product uniquely identifies observations, you could include it in the i() option, for example

Code:

reshape long sales, i(idproduct store) j(quarter) string

The other approach is to just make up a unique id and use it in the i() option, for example

Code:

generate obsnum = _n
reshape long sales, i(obsnum) j(quarter) string

This won't be a problem, because reshape long will copy any variables that aren't being reshaped - idproduct in this case - to the long observations. So your long layout will have an extra variable - obsnum - as well as idproduct, and actually, if you later discover an unusual value - say, negative sales - you'll be able to use obsnum to find the wide observation from which it came in your original data.
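To make the second approach concrete, here is a hedged sketch on made-up toy data. It assumes the variables still carry the original names (sales_03_2004 and so on), so that j(quarter) holds strings such as "_03_2004"; the variable name qdate is an assumption of this sketch. The date is stored in a new variable because the string variable quarter already exists after the reshape, so a generate that reuses that name would fail.

```stata
* Toy wide data, invented for illustration: idproduct is not unique
clear
input idproduct sales_03_2004 sales_06_2004
1 89 90
1 70 75
2 10 12
end

* Made-up unique identifier for each wide observation
generate obsnum = _n
reshape long sales, i(obsnum) j(quarter) string

* quarter now holds strings such as "_03_2004" (month, then year).
* Build a daily date from month and year, then convert it to a quarterly date.
generate qdate = qofd(mdy(real(substr(quarter, 2, 2)), 1, real(substr(quarter, 5, 4))))
format qdate %tq

list, sepby(obsnum)
```

If the variables have been renamed first (for example to salesq12004), the substr() positions would need to change accordingly.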

Last edited by William Lisowski; 30 Oct 2018, 07:59.
Federico Nutarelli

Join Date: Sep 2018

Posts: 430
#4

30 Oct 2018, 08:42

Yes, in the end I adopted the second approach and everything worked.
Many, many thanks William, you are always very kind and precise in answering my questions!

Now that I have reshaped the data, I would like to generate a counter that takes the value 1 the first time, by(idproduct), that sales is positive.
Should I go for this solution? https://www.stata.com/support/faqs/d...t-occurrences/

Many thanks again!
William Lisowski

Join Date: Dec 2014

Posts: 10150
#5

30 Oct 2018, 09:23

Yes, any advice authored by Nick Cox (15,890 posts on Statalist as I write this) is advice I would not attempt to improve upon.
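For readers who want something concrete, one common idiom for flagging the first positive value within a group is sketched below. This is an illustration on made-up toy data, not a copy of the linked FAQ; the variables qdate, pos and first are assumptions of this sketch, and it presumes a numeric quarterly date so that sorting is chronological.

```stata
* Toy long data, invented for illustration (176 is 2004q1 in %tq terms)
clear
input idproduct qdate sales
1 176 0
1 177 0
1 178 5
1 179 7
2 176 3
2 177 0
end
format qdate %tq

* Missing values count as larger than any number in Stata,
* so exclude them explicitly when testing for positive sales
generate byte pos = sales > 0 & !missing(sales)

* Within each product, in time order, the running sum of pos equals 1
* from the first positive observation until the next positive one;
* requiring pos as well picks out the first positive observation only
bysort idproduct (qdate): generate byte first = pos & (sum(pos) == 1)

list, sepby(idproduct)
```

If a product has several observations for the same quarter, as can happen here because idproduct was not unique in the wide layout, the order within a quarter is arbitrary, so sorting on obsnum as well may be worth considering.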
Federico Nutarelli

Join Date: Sep 2018

Posts: 430
#6

30 Oct 2018, 11:55

Well I saw!
Many thanks again!