Hi,
I am trying to use a discrete choice model for my paper and I have the following two questions. Feel free to share your opinion.
Concern 1.
I am examining whether a customer's initial interest in a product affects whether they buy it. The variables are:
- customer_ID : unique ID for each customer
- product_ID : unique ID for each product
- Initial_int : binary, 1 if the customer said he/she was interested in the product
- purchase : dependent variable, 1 if the customer purchased the product
- max_purchase : total number of products the customer purchased (control)
- customer_income : customer income (control)
customer_ID | product_ID | purchase | max_purchase | Initial_int | customer_income |
1 | A | 1 | 2 | 1 | 70 |
1 | B | 0 | 2 | 0 | 70 |
1 | C | 1 | 2 | 0 | 70 |
1 | D | 0 | 2 | 0 | 70 |
1 | E | 0 | 2 | 0 | 70 |
2 | A | 1 | 2 | 1 | 90 |
2 | B | 1 | 2 | 0 | 90 |
2 | C | 0 | 2 | 1 | 90 |
2 | D | 0 | 2 | 0 | 90 |
2 | E | 0 | 2 | 0 | 90 |
3 | A | 0 | 1 | 0 | 70 |
3 | B | 1 | 1 | 0 | 70 |
3 | C | 0 | 1 | 1 | 70 |
3 | D | 0 | 1 | 0 | 70 |
3 | E | 0 | 1 | 0 | 70 |
4 | A | 1 | 3 | 1 | 100 |
4 | B | 1 | 3 | 0 | 100 |
4 | C | 1 | 3 | 1 | 100 |
4 | D | 0 | 3 | 0 | 100 |
4 | E | 0 | 3 | 0 | 100 |
For this analysis I did the following:
cmset customer_ID product_ID
cmclogit purchase initial_int, casevar(max_purchase customer_income)
The problem is that the majority of customers bought more than one product.
According to page 13 here (https://www.stata.com/manuals/cmcmclogit.pdf#cmcmclogit), it seems that I have to create a separate set of observations (one choice set) for each purchase.
For example, for customer_ID 1, I would add another set of five observations for the second purchase and tell the sets apart with the variable "purchase_number":
customer_ID | product_ID | Initial_int | purchase | purchase_number |
1 | A | 1 | 1 | 1 |
1 | B | 0 | 0 | 1 |
1 | C | 0 | 0 | 1 |
1 | D | 0 | 0 | 1 |
1 | E | 0 | 0 | 1 |
1 | A | 1 | 0 | 2 |
1 | B | 0 | 0 | 2 |
1 | C | 0 | 1 | 2 |
1 | D | 0 | 0 | 2 |
1 | E | 0 | 0 | 2 |
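To make the restructuring concrete, here is a rough sketch of how I think these extra observations could be generated from the original data. It is only a sketch under several assumptions: the names buy_rank, chosen, case_id, and product are hypothetical; max_purchase is assumed to be constant within each customer; and because I do not observe the order of purchases, the order is assigned arbitrarily (alphabetically by product_ID). I have not confirmed that this is the right way to set up the model.

```stata
* Sketch only: assumes the long-format data from the first listing is in memory.
* buy_rank, chosen, case_id, and product are hypothetical variable names.

* Running count of purchases within customer; on purchased rows this is the
* (arbitrarily ordered) purchase occasion on which the product was bought.
bysort customer_ID (product_ID): gen buy_rank = sum(purchase)

* One copy of each customer-product row per purchase occasion.
expand max_purchase
bysort customer_ID product_ID: gen purchase_number = _n

* A product is the chosen alternative only on its own purchase occasion.
gen chosen = purchase == 1 & buy_rank == purchase_number

* Each customer-by-occasion combination becomes one case (choice set).
egen case_id = group(customer_ID purchase_number)

* Numeric copy of the alternatives variable for cmset.
encode product_ID, gen(product)
cmset case_id product
```

After this, each case has exactly one chosen alternative, so the model would be fit with chosen in place of purchase. Since several cases would then come from the same customer, I assume the standard errors would also need to account for that.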
Is adding new sets of observations the only way? Are there any options or commands that would make this process simpler? Any feedback would be helpful!
Concern 2.
Also, for clarification: in the command below, do I include additional control variables inside the casevar() parentheses or before the comma?
- cmclogit purchase initial_int, casevar (max_purchase customer_income CONTROLS HERE? )
- cmclogit purchase initial_int CONTROLS HERE?, casevar (max_purchase customer_income)