Dear all,
First off, many thanks for this forum; I have found solutions here to several problems I ran into when using Stata.
Unfortunately, I could not find one for my current problem.
I'm using Stata 12, with a dataset containing the following information (leaving out unnecessary variables):
product_id
client_id
invoice_id
revenue
quantity
from
to
The variables from and to are dates (e.g. 01jan2011) marking the period of exploitation of a product (a license).
So each invoice records which product is licensed to which client_id for which period, the number of units sold and the amount invoiced.
Most periods of exploitation are exact calendar quarters (e.g. 01jan2011 to 31mar2011); these account for around 80% of the observations. I would like to reshape the dataset into quarterly observations.
However, I face three data management problems:
1. Some invoices cover several quarters of exploitation.
2. Some invoices span one quarter, but do not start on a quarter boundary (e.g. 21jan2011 to 20apr2011).
3. For invoices spanning several quarters, I need to spread revenue and quantity over the quarterly observations that are to be generated.
This is what I have done so far:
1. Several quarters:
I just expand these observations, with the allocated values to be calculated later on:
    expand to-from+1, gen(expand)
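Since from and to are daily dates, to-from+1 counts days, so the command above creates one copy per day rather than one per quarter. Below is a minimal sketch of what I think I actually need, assuming one observation per calendar quarter touched by the invoice. The toy data and the helper names from_s, to_s, nq and q are made up for illustration.

```stata
* Toy data; the values are made up for illustration
clear
input invoice_id str9 from_s str9 to_s revenue
1 "01jan2011" "31mar2011" 300
2 "01jan2011" "30sep2011" 900
end
gen from = date(from_s, "DMY")
gen to   = date(to_s, "DMY")
format from to %td

* Number of calendar quarters touched by each invoice
gen nq = qofd(to) - qofd(from) + 1

* One copy per quarter (invoices with nq == 1 are left as they are)
expand nq

* Label each copy with its quarter: first copy = quarter of from, and so on
bysort invoice_id: gen q = qofd(from) + _n - 1
format q %tq
```

Invoice 2 ends up with three observations (2011q1 to 2011q3), each still carrying the full revenue of 900 until it is allocated.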
2. Mismatched quarters:
I have no idea how to attach a mismatched period to the closest calendar quarter. Could you help me out with this?
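One rule I can think of is to define "closest" as the calendar quarter that contains most of the invoice's days. Here is a rough sketch under that assumption; the helper variables nq, q, qstart, qend and overlap are hypothetical names, and it should only be applied to invoices that are about one quarter long.

```stata
* Toy invoice that spans one quarter's worth of days but is shifted
clear
input invoice_id str9 from_s str9 to_s
1 "21jan2011" "20apr2011"
end
gen from = date(from_s, "DMY")
gen to   = date(to_s, "DMY")
format from to %td

* One observation per calendar quarter touched by the invoice
gen nq = qofd(to) - qofd(from) + 1
expand nq
bysort invoice_id: gen q = qofd(from) + _n - 1
format q %tq

* First and last day of each candidate quarter
gen qstart = dofq(q)
gen qend   = dofq(q + 1) - 1
format qstart qend %td

* Days of the invoice period that fall inside the quarter
gen overlap = min(to, qend) - max(from, qstart) + 1

* Keep the quarter with the largest overlap (ties go to the later quarter)
bysort invoice_id (overlap q): keep if _n == _N
```

For the example invoice, 70 days fall in 2011q1 and 20 days in 2011q2, so it would be attached to 2011q1.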
3. Impute quantity and revenue:
The easy but wrong way is to divide revenue and quantity equally over the number of quarters an invoice spans.
A better way would be to use information on the evolution of sales over time to allocate the values (e.g. sales are not spread equally over time, nor do they decline linearly; they decline exponentially). The questions are: 1) how to use information on the evolution of sales over time to allocate sales across quarters when an invoice spans multiple quarters, and 2) which information to use?
On another dataset I imputed missing values for observations at period t using panel regression and multiple imputation. Here, however, I know the values: they do not have to be imputed, just allocated correctly to the quarters they span.
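To make the allocation idea concrete, here is a minimal sketch of two weighting schemes: shares proportional to the days falling in each quarter, and shares that additionally decline exponentially from one quarter to the next. The decay rate delta = 0.3 is purely an assumed placeholder (in practice it would have to come from somewhere, for instance the clean quarterly invoices), and all helper names are hypothetical.

```stata
* Toy invoice covering three quarters
clear
input invoice_id str9 from_s str9 to_s revenue
1 "01jan2011" "30sep2011" 900
end
gen from = date(from_s, "DMY")
gen to   = date(to_s, "DMY")
format from to %td

* One observation per calendar quarter touched by the invoice
gen nq = qofd(to) - qofd(from) + 1
expand nq
bysort invoice_id: gen q = qofd(from) + _n - 1
format q %tq

* Days of the invoice period that fall inside each quarter
gen overlap = min(to, dofq(q + 1) - 1) - max(from, dofq(q)) + 1

* Scheme A: shares proportional to days in the quarter
bysort invoice_id: egen totdays = total(overlap)
gen rev_days = revenue * overlap / totdays

* Scheme B: days weighted by exponential decay per quarter elapsed
* delta is an assumed value, not estimated from any data
scalar delta = 0.3
bysort invoice_id (q): gen w = overlap * exp(-scalar(delta) * (_n - 1))
bysort invoice_id: egen totw = total(w)
gen rev_exp = revenue * w / totw
```

In both schemes the shares sum to one within an invoice, so the allocated amounts add back up to the invoiced revenue; quantity could be handled the same way.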
The goal is to do panel regressions on the evolution of sales and the effects of the Coase conjecture and cannibalization of varieties.
Many thanks in advance,
Glenn