
  • Cmp simultaneous regressions: invalid syntax error.

    I am trying to figure out exactly how the cmp program works.

    However, I cannot get even the basic model to run, and I don't know where it goes wrong.

    My model is:
    1. irrigation_dummy = croptype_dummy + ps1 + ps2 + ps3 + ps4
    2. croptype_dummy = irrigation_dummy + ps1 + ps2 + ps3 + ps4
    irrigation_dummy has 2 categories (probit) and croptype_dummy has 6 categories (mprobit). Both dependent variables are therefore discrete!

    I thought the Stata code should be this:
    cmp (irrigation_dummy =croptype_dummy# ps1 ps2 ps3 ps4) (croptype_dummy = irrigation_dummy# ps1 ps2 ps3 ps4), indicators($cmp_probit $cmp_mprobit) qui tech(dfp)
    But it does not work; I get the error below. I am sure that my variable names are spelled correctly, so I do not know what I am doing wrong.

    Equation croptype_dummy not found.
    invalid syntax

    error . . . . . . . . . . . . . . . . . . . . . . . . Return code 111
    __________ not found;
    no variables defined;
    The variable does not exist. You may have mistyped the
    variable's name.
    variables out of order;
    You specified a varlist containing varname1-varname2, yet
    varname1 occurs after varname2. Reverse the order of the
    variables if you did not make some other typographical error.
    Remember, varname1-varname2 is taken by Stata to mean varname1,
    varname2, and all the variables in dataset order in between.
    Type describe to see the order of the variables in your dataset.
    __________ not found in using data;
    You specified a varlist with merge, but the variables on which
    you wish to merge are not found in the using dataset, so the
    merge is not possible.
    __________ ambiguous abbreviation;
    You typed an ambiguous abbreviation for a variable in your data.
    The abbreviation could refer to more than one variable. Use a
    nonambiguous abbreviation, or if you intend all the variables
    implied by the ambiguous abbreviation, append a `*' to the end
    of the abbreviation.
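
    Before digging into the equation syntax, the simple causes listed under r(111) can be ruled out with a few checks. This is a minimal sketch, assuming the variable names from the post and that cmp is already installed. cmp setup is the cmp subcommand that defines the $cmp_* indicator macros used in indicators():

    ```stata
    * Confirm that every variable named in the model exists.
    * An r(111) at this step would point to a genuine naming problem.
    describe irrigation_dummy croptype_dummy ps1 ps2 ps3 ps4

    * Inspect the categories of the two dependent variables.
    tabulate irrigation_dummy
    tabulate croptype_dummy

    * Define cmp's indicator macros, then check that they are non-empty.
    cmp setup
    display "$cmp_probit $cmp_mprobit"
    ```

    If all of these run cleanly, the variable names are not the problem and the error is more likely to come from how the equations refer to each other.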

  • #2
    Since croptype_dummy has 6 possible outcomes, your multinomial probit model for it has 5 equations, for the utility of each alternative relative to the base alternative. When you add a # suffix you are referring to a latent linear variable. So which of the 5 croptype_dummy equations do you mean "croptype_dummy#" to refer to?



    • #3
      Hi David! Thank you for intervening.

      I thought that the lowest value of the dependent variable automatically becomes the base alternative and that the equations for all the other values would be displayed?

      So basically, in my case: a farmer has to choose both whether or not to irrigate and which crop type to grow. These decisions are interdependent. For instance, a farmer will not choose rice if he cannot irrigate. So if a farmer chooses rice, there is a high probability that he will also choose irrigation. The logic also goes in the other direction: if a farmer is already irrigating, he might be more interested in crops that require irrigation, given that the irrigation investments are already in place.

      So in my case, I am not really interested in just one equation. I would like to see all the different crop types (that is, the probability of choosing each crop type compared to the baseline). That would mean that the first equation models the probability that irrigation_dummy equals 1 (irrigated instead of rainfed) as depending on the probability of choosing a certain crop type (rather than the baseline crop type).

      I am not sure if this answers your question?

      Thanks a lot in any case!



      • #4
        Yes, the lowest value of croptype automatically defines the base alternative.

        If you want irrigation to depend on all the relative utility equations, then you'll need to include separate references to all of them. Honestly, I'll be interested to see if this works, because this is a very complicated set-up that I did not think about or design for.

        First run it without the cross-references:
        Code:
        cmp (irrigation_dummy =ps1 ps2 ps3 ps4) (croptype_dummy = ps1 ps2 ps3 ps4, iia), indicators($cmp_probit $cmp_mprobit) qui tech(dfp) iter(1)
        This will tell you what names cmp gives to all the mprobit equations; they can't all be the same, so cmp has to make them up.

        Then run the real model, referring to those equation names, with # suffixes, not to croptype_dummy#.

        Also, for tractability, you should start by imposing the IIA assumption, with the iia equation option.

        Possibly to get this to work, you'll need to switch to cmp's application-specific mprobit syntax.
        --David
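
        The second step described above could look roughly like the following. This is only a sketch: the equation names stored in the local macro mp_eqs are hypothetical placeholders, to be replaced with whatever names the first run actually reports, and, as noted above, it is not established that cmp can estimate a system like this.

        ```stata
        * Run this block in one go (e.g. from a do-file) so the local macros persist.
        * Define the indicator macros if this has not been done in the session yet.
        cmp setup

        * HYPOTHETICAL equation names: replace with the names printed by the first run.
        local mp_eqs _outcome_2_2 _outcome_2_3 _outcome_2_4 _outcome_2_5 _outcome_2_6

        * Build the list of latent-variable references by appending # to each name.
        local refs
        foreach eq of local mp_eqs {
            local refs `refs' `eq'#
        }
        display "`refs'"

        * Same model as the first run, plus the cross-references, keeping the iia option.
        cmp (irrigation_dummy = `refs' ps1 ps2 ps3 ps4) (croptype_dummy = irrigation_dummy# ps1 ps2 ps3 ps4, iia), indicators($cmp_probit $cmp_mprobit) qui tech(dfp)
        ```

        Building the reference list in a loop just avoids typing each name with its # suffix by hand; writing the references out directly in the first equation is equivalent.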



        • #5
          Hi David!

          That is disappointing. I had pinned quite some hope on cmp. It is a powerful program. Any other suggestions? (I am not sure what you mean by "switch to cmp's application-specific mprobit syntax".)

          Anyhow, I ran the model you suggested and obtained the names of the different equations (they were called _outcome_2_3, etc.).

          Code:
          cmp (irrigation_dummy =_outcome_2_3# ps1 ps2 ps3 ps4) (croptype_dummy = irrigation_dummy# ps1 ps2 ps3 ps4), indicators($cmp_probit $cmp_mprobit) qui tech(dfp)
          The error I received is below. (I used only a sample of the data and a smaller number of crop types to speed up the model estimation.)

          cmp_lnL(): 3200 conformability error
          <istmt>: - function returned error
          Mata run-time error
          Mata run-time error
          r(3200);
          A matrix, vector, or scalar has the wrong number of rows and/or
          columns for what is required. Adding a 2 x 3 matrix to a 1 x 4
          would result in this error.



          • #6
            Hi David!

            I was thinking: would it help if I treated irrigation not as a dummy but as a continuous variable running from 0 to 100%? There is a way for me to calculate this. In that case I would have only one discrete variable (crop type). I can also drop some crop types so that I keep only the main ones (for instance 2 or, preferably, 3).

            Thanks a lot!
