  • Problem with mixlogit or conditional logit implementation of a discrete choice experiment with multiple choice scenarios

    Hi! I am new to the Statalist forum and to Stata, but I am working hard to get used to the software.

    I am implementing a discrete choice experiment to model the choice among alternative cassava planting materials. In my questionnaire, I presented each respondent with 16 choice sets, each containing 2 alternatives plus an opt-out option. The explanatory variables are the 11 attributes of cassava planting material, whose levels were varied randomly between the 2 alternatives. I am fitting a conditional logit model to these data. In my data set, some explanatory variables are dummy variables, while others are categorical variables with up to 3 categories.
    Since each choice experiment has 2 alternative options and an opt-out option, each choice set has 11 rows and each respondent was presented with 16 choice sets.



    (A sample of the original attributes and of the data is shown below.)

    Attributes of the cassava planting material

    Cassava stem attributes     | Alternative A             | Alternative B               | Alternative C
    Yield                       | Low (<20 Tons/Ha)         | Moderate (20-30 Tons/Ha)    | High (>30 Tons/Ha)
    Disease tolerance           | Susceptible               | Tolerant                    |
    Raw taste                   | Bitter                    | Sweet                       |
    Cooked taste                | Bitter                    | Sweet                       |
    Mealiness                   | Hard                      | Mealy                       | Watery
    Maturity                    | Late (>18 months)         | Intermediate (13-18 months) | Early (6-12 months)
    Seed availability           | Scarce                    | Available                   | Plenty
    In soil longevity           | Short term (Up to 1 year) | Long term (Above 1 year)    |
    Shelf-life of stakes        | Low shelf-life            | High shelf-life             |
    Suitability in crop systems | Suitable                  | Not suitable                |
    Price                       | 10,000                    | 27,000                      | 40,000
    Which one would you choose? | 1. Yes 0. No              | 1. Yes 0. No                | 1. Yes 0. No

    Choice set

    Attribute                                  | Alternative 1             | Alternative 2               | Alternative 3
    Yield                                      | Low (<20 tons/ha)         | Low (<20 tons/ha)           | Opt out
    Disease tolerance                          | Susceptible               | Susceptible                 |
    Raw taste                                  | Bitter                    | Bitter                      |
    Cooked taste                               | Sweet                     | Sweet                       |
    Mealiness                                  | Mealy                     | Mealy                       |
    Maturity                                   | Late (>18 months)         | Intermediate (13-18 months) |
    Seed availability                          | Available                 | Plenty                      |
    In soil longevity                          | Short term (Up to 1 year) | Short term (Up to 1 year)   |
    Shelf-life of stakes                       | Short term                | Short term                  |
    Suitability in crop system                 | Suitable                  | Suitable                    |
    Price                                      | 10000                     | 27000                       |
    Question: Which alternative do you prefer? | 1. Yes 0. No              | 1. Yes 0. No                |

    Sample data

    Respondent  Choice_set  Choice  Yield_new  Disease_tol  Rawtaste  Cookedtaste  Mealiness  Maturity_1
    1           1           1       3          1            1         1            2          3
    1           1           0       3          1            1         1            1          2
    1           2           0       1          0            0         1            3          3
    1           2           1       1          1            0         1            3          3
    1           3           1       3          1            1         1            2          3
    1           3           0       3          1            1         1            1          2
    1           4           0       2          0            1         0            1          2
    1           4           1       3          0            1         1            1          2
    1           5           0       3          0            1         1            1          3
    1           5           1       3          0            1         1            3          3
    1           6           1       2          1            0         1            3          3

    For clogit to work, I tried selecting the choice set as the grouping variable, but Stata shows "variable Choiceset has replicate levels for one or more cases".
    clogit y Yield Disease_tol Rtaste Ctaste Mealines Maturity Seed_avail InSoil_long Shelflife Suit_crop_sys nprice , group(Choice_set )


    I want to establish which attributes, and which levels, the respondents prefer, and later determine the willingness-to-pay and potential demand.
    Can someone clarify this for me?


  • #2
    You asked here why you are not getting a response. See section 17 in the FAQ for possible reasons. Perhaps your question is too vague. You can also increase the probability of a response by posting your data in a format that allows easy import into Stata. Please consider using dataex from SSC for this purpose.



    • #3
      Stephen

      It’s hard to know exactly what’s causing the error based on the information you have provided, but my guess is that it is related to the definition of the Choice_set variable. Note that this variable needs to be a unique identifier for each choice set in the sample (i.e. it needs to take different values for the choice sets of each individual). Try running the following code:

      Code:
      egen newcsid = group(Respondent Choice_set)
      clogit y Yield Disease_tol Rtaste Ctaste Mealines Maturity Seed_avail InSoil_long Shelflife Suit_crop_sys nprice , group(newcsid)
      A couple of other comments:

      - Based on your follow-up post (http://www.statalist.org/forums/foru...implementation) the dependent variable y in the code above should probably be replaced with choice

      - I am not sure why you say that “Since each choice experiment has 2 alternative options and an opt-out option, each choice set has 11 rows…” In the example it looks like you have two rows of data for each choice set, which is correct if you exclude the opt-out.

      - Note that the FAQ asks you to tell us exactly what you typed. The error message you report is incompatible with the syntax as the spelling of Choice_set (Choiceset) differs between the two. This might seem pedantic, but the devil is always in the detail when trying to spot errors in code, so it pays to be precise.
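
      To see why the combined identifier matters, here is a minimal sketch on made-up data. The toy values and the variable nrows are purely illustrative and not taken from your file. Choice_set restarts at 1 for each respondent, so only the combination of Respondent and Choice_set identifies a choice set:

      ```stata
      * Toy data (made up for illustration): 2 respondents, 2 choice sets each,
      * 2 alternatives per set. Choice_set restarts at 1 for every respondent.
      clear
      input Respondent Choice_set Choice
      1 1 1
      1 1 0
      1 2 0
      1 2 1
      2 1 0
      2 1 1
      2 2 1
      2 2 0
      end

      * Choice_set alone takes only 2 distinct values here, so it cannot
      * distinguish the choice sets of different respondents
      egen newcsid = group(Respondent Choice_set)

      * Each value of newcsid should now cover exactly one choice set;
      * nrows is the number of rows (alternatives) in that set
      bysort newcsid: gen nrows = _N
      tabulate nrows
      ```

      If tabulate shows more than one value of nrows in your real data, some choice sets have missing or duplicated rows and are worth inspecting.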

      Arne



      • #4
        Arne

        Thanks very much for the instructions. I have run the command you provided and it has worked.

        But I get these notes in the output window:
        note: multiple positive outcomes within groups encountered.
        note: 2021 groups (4042 obs) dropped because of all positive or all negative outcomes
        This seems to be an error I have been struggling with. How can I eliminate it? Does it have a bearing on my results?

        The choice sets contained an opt-out option: "respondent will not buy if there are only the 2 alternatives available".
        The opt-out option has no attribute levels, but I want to include it in the data. Below is sample data with 2 alternatives (Alt 1 & 2) and the opt-out option (Alt 3).
        Is it correct to include zeros for the attribute levels of the opt-out option?

        input float Choicesetid byte(Respondent Choiceset Alt Choice Yield Distol Rtas Ctas Mealns Maturity Seedava Soillong Shelflife Suitcropsys nprice)
        1 1 1 1 1 3 1 1 1 2 3 2 1 1 1 -2
        1 1 1 2 0 3 1 1 1 1 2 3 1 1 0 -2
        1 1 1 3 0 0 0 0 0 0 0 0 0 0 0 0
        2 1 2 1 0 1 0 0 1 3 3 2 0 0 0 -1
        2 1 2 2 1 1 1 0 1 3 3 2 0 1 0 -2
        2 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0
        3 1 3 1 1 3 1 1 1 2 3 1 1 0 0 -2
        3 1 3 2 0 3 1 1 1 1 2 2 1 0 1 -2
        3 1 3 3 0 0 0 0 0 0 0 0 0 0 0 0
        4 1 4 1 0 2 0 1 0 1 2 1 1 0 1 -2
        4 1 4 2 1 3 0 1 1 1 2 3 1 0 1 -2
        4 1 4 3 0 0 0 0 0 0 0 0 0 0 0 0
        end

        Thanks
        Stephen




        • #5
          This suggests that there are other problems with your data setup, as presumably the respondents were asked to choose a single alternative. The messages suggest that in some choice sets more than one alternative was chosen and in others all/none of the alternatives were chosen. This is not something we can help you with – you need to carefully check your data, in particular those choice sets where the respondent is coded to have chosen more than one or no alternatives. This would be a good start:

          Code:
          bysort Respondent Choice_set: egen nchoice = total(choice)
          browse if nchoice != 1
          Regarding the opt-out see this thread: http://www.statalist.org/forums/foru...ice-experiment
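
          Before browsing, it may help to count how many choice sets are affected. The following is only a sketch on made-up data; the toy values and the variable oneper are illustrative, and your own variable names may differ in spelling or case:

          ```stata
          * Toy data (made up): the second choice set has two alternatives
          * marked as chosen, the third has none
          clear
          input Respondent Choice_set choice
          1 1 1
          1 1 0
          1 2 1
          1 2 1
          2 1 0
          2 1 0
          end

          * Number of alternatives marked as chosen in each choice set
          bysort Respondent Choice_set: egen nchoice = total(choice)

          * Flag one row per choice set so that sets, not rows, are counted
          egen oneper = tag(Respondent Choice_set)
          tabulate nchoice if oneper

          * List the rows of every choice set without exactly one chosen alternative
          list Respondent Choice_set choice if nchoice != 1, sepby(Respondent Choice_set)
          ```

          In a correctly coded data set, nchoice should equal 1 for every choice set.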

          Good luck.
          Arne



          • #6
            Thanks Arne

            Let me try to organize the data and perform the analysis.

            I am puzzled: after using clogit to determine the preferred attributes and levels, how can I estimate willingness-to-pay (WTP) for the attributes and, consequently, potential demand for the product?

            I appreciate your efforts.

            ​Stephen



            • #7
              Dear Arne
              From your example from
              choice speed cost group id
              0 5 3 1 1
              1 8 4 1 1
              0 6 3 1 1
              0 3 2 2 1
              0 2 2 2 1
              1 5 4 2 1
              0 6 4 2 1

              . mixlogit choice cost, group(group) id(id) rand(speed)
              and

              .use http://fmwww.bc.edu/repec/bocode/t/traindata.dta
              .gen mprice = -price
              .mixlogitwtp y contract local wknown, group(gid) id(pid) price(mprice) rand(tod seasonal) nrep(500)

              I tried to fit a mixed logit model, but it takes forever to generate the results and I have to break and start again. The same thing happens again and again.
              global randvars " yield distol rtas ctas mealns maturity seedava soillong shelflife suitcropsys"
              mixlogit choice price1, rand($randvars) group(Choicesetid) nrep(500)


              Is there something I am not getting right?

              Example generated by -dataex-. To install: ssc install dataex
              clear
              input float Choicesetid byte(Rid Choiceset Alt Choice Yield Distol Rtas Ctas Mealns Maturity Seedava Soillong Shelflife Suitcropsys nprice)
              1 1 1 1 1 3 1 1 1 2 3 2 1 1 1 -2
              1 1 1 2 0 3 1 1 1 1 2 3 1 1 0 -2
              1 1 1 3 0 0 0 0 0 0 0 0 0 0 0 0
              2 1 2 1 0 1 0 0 1 3 3 2 0 0 0 -1
              2 1 2 2 1 1 1 0 1 3 3 2 0 1 0 -2
              2 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0
              3 1 3 1 1 3 1 1 1 2 3 1 1 0 0 -2
              3 1 3 2 0 3 1 1 1 1 2 2 1 0 1 -2
              3 1 3 3 0 0 0 0 0 0 0 0 0 0 0 0
              4 1 4 1 0 2 0 1 0 1 2 1 1 0 1 -2
              4 1 4 2 1 3 0 1 1 1 2 3 1 0 1 -2
              4 1 4 3 0 0 0 0 0 0 0 0 0 0 0 0
              end

              The example dataset includes the opt-out option.

              I would follow this by estimating willingness to pay using -mixlogitwtp-

              .global randvars " yield distol rtas ctas mealns maturity seedava soillong shelflife suitcropsys"
              . mixlogit y price1, rand($randvars) group(alt ) id(choiceset) nrep(500)


              I was not sure what to include in the id( ) and group( ) after generating the unique newcsid


              I appreciate your assistance!

              Stephen Angudubo



              • #8
                Is Rid a respondent identifier? If so, that should be your id variable, and the newcsid variable should be your group variable. I recommend that you read http://www.stata-journal.com/sjpdf.h...iclenum=st0133 and the mixlogit help file carefully. As general advice, try to build up the complexity of your model gradually by starting with a small number of random coefficients.
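
                As a rough sketch of what building up gradually can look like, the lines below use the traindata example quoted in your post rather than your own data, and assume mixlogit is installed from SSC. In that dataset pid identifies respondents and gid identifies choice sets, which corresponds to id(Rid) and group(newcsid) in your case. The split between fixed and random coefficients is only an illustration:

                ```stata
                * Requires the user-written command: ssc install mixlogit
                use http://fmwww.bc.edu/repec/bocode/t/traindata.dta, clear

                * Step 1: one random coefficient (tod); all other attributes fixed.
                * group() takes the choice-set identifier, id() the respondent identifier.
                mixlogit y price contract local wknown seasonal, group(gid) id(pid) rand(tod) nrep(50)

                * Step 2: once step 1 converges, move one more attribute into rand()
                mixlogit y price contract local wknown, group(gid) id(pid) rand(tod seasonal) nrep(50)
                ```

                A small nrep() keeps each run quick while you are exploring; it can be increased for the final model.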
                Last edited by Arne Risa Hole; 17 Sep 2015, 02:31.



                • #9
                  Thanks Arne, you have been helpful.

                  It worked with about six attributes, but when I introduced the seventh attribute, the iterations ran endlessly.
                  I tried specifying -iter() without any number in the brackets, and it produced the estimates.

                  Is introducing -iter() in this way right or wrong? How can I easily overcome these endless iterations as I introduce more attributes?

                  Thanks



                  • #10
                    Non-convergence (endless iterations) could be an indication that there is not enough variation in the data to identify all of the parameters in the model. Specifying the iter() option does not solve the problem, as that simply tells Stata to stop the iterations after a certain number whether convergence has been achieved or not. The best solution is typically to simplify the model.
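
                    One way to tell whether a run stopped because it converged or because it hit the iteration cap is to inspect e(converged) afterwards. The sketch below uses the traindata example quoted earlier in this thread rather than the cassava data. It assumes mixlogit is installed from SSC and that it leaves the standard ml result e(converged) in memory, which I have not checked for every version:

                    ```stata
                    * Requires the user-written command: ssc install mixlogit
                    use http://fmwww.bc.edu/repec/bocode/t/traindata.dta, clear

                    * Cap the optimizer at 5 iterations (deliberately low, for illustration only)
                    mixlogit y price contract local wknown, group(gid) id(pid) rand(tod seasonal) nrep(50) iterate(5)

                    * Assumed to be 1 if the optimizer reported convergence and 0 if it
                    * stopped at the cap; estimates from a non-converged run should not
                    * be interpreted
                    display "converged = " e(converged)
                    ```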



                    • #11
                      Dear Arne

                      Thanks for your response. Does simplifying involve reducing the attributes with three or four levels (0,1,2,3) to two levels (1,0)? Sorry to ask, but an example would help me understand what simplifying the model means.

                      Thanks

                      Stephen



                      • #12
                        By simplifying the model I mean having fewer random coefficients.



                        • #13
                          Dear Arne
                          Thanks a lot, it has worked. I appreciate your advice.
                          Discussing the results is becoming an issue. For example, the attribute yield had 3 levels, coded as follows: 1=High, 2=Low, 3=Moderate. How do
                          I include these levels so that the output can be interpreted meaningfully?

                          Following the example of including the opt-out option (Regarding the opt-out see this thread: http://www.statalist.org/forums/foru...ice-experiment), the yield level for the opt-out option is zero (0).
                          The codes for the levels of yield become: 0=None (opt-out option), 1=High, 2=Low, and 3=Moderate. I need to cater for all of these in the analysis. Can you enlighten me on this?

                          Thanks for your cooperation!

                          Stephen

