  • How to eliminate special cases within a regression command

    Hi Stata folks, some of my variables have values for which the respondents said "don't know". I'm running a regression and want to exclude the cases in which the respondents said "don't know". I know I could just do drop if "variable" == "don't know", but I'd like to save time and have the regression command exclude such cases the way it excludes missing values. Here's a shot at what I'm trying to do:
    probit q25c i.q101 q91 q6b if democracy == 0 || drop if any observation == "don't know"

    The "if democracy == 0" qualifier just limits the sample to non-democracies.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(q6b q25a q91 q101) float(Idmission50 democracy)
    3 0 . 1 1 1
    4 1 . 2 1 1
    3 1 . 2 1 1
    2 0 . 2 1 1
    1 0 . 1 1 1
    4 0 . 1 1 1
    3 0 . 2 1 1
    4 0 . 2 1 1
    3 0 . 1 1 1
    4 0 4 2 1 1
    4 0 . 1 1 1
    2 2 . 2 1 1
    4 0 4 1 1 1
    2 0 3 2 1 1
    2 2 . 2 1 1
    4 0 . 1 1 1
    3 0 4 2 1 1
    4 2 . 1 1 1
    1 0 . 1 1 1
    2 0 . 2 1 1
    3 0 . 2 1 1
    9 2 . 1 1 1
    3 1 . 2 1 1
    3 0 . 1 1 1
    3 0 . 2 1 1
    4 0 3 2 1 1
    4 0 3 2 1 1
    3 0 . 2 1 1
    5 0 . 1 1 1
    1 0 4 2 1 1
    4 0 3 2 1 1
    5 2 4 1 1 1
    1 0 4 2 1 1
    3 0 3 2 1 1
    3 1 . 2 1 1
    2 0 4 2 1 1
    3 0 . 1 1 1
    3 0 . 2 1 1
    2 0 . 1 1 1
    3 0 . 1 1 1
    3 2 . 1 1 1
    4 1 . 1 1 1
    3 0 . 2 1 1
    3 2 . 1 1 1
    3 0 4 2 1 1
    3 0 . 2 1 1
    2 0 . 1 1 1
    4 0 3 1 1 1
    4 2 . 1 1 1
    3 0 . 2 1 1
    4 0 3 1 1 1
    4 0 . 2 1 1
    4 0 . 2 1 1
    3 0 . 2 1 1
    3 0 . 1 1 1
    3 0 . 2 1 1
    3 2 1 2 1 1
    2 2 . 1 1 1
    2 2 . 1 1 1
    4 2 3 1 1 1
    3 1 . 2 1 1
    3 1 . 1 1 1
    3 0 . 2 1 1
    3 1 1 1 1 1
    4 1 3 1 1 1
    3 1 . 1 1 1
    4 0 . 2 1 1
    2 0 4 1 1 1
    4 0 1 1 1 1
    4 0 4 1 1 1
    2 0 4 2 1 1
    3 1 . 1 1 1
    4 0 4 1 1 1
    4 0 4 2 1 1
    3 0 4 1 1 1
    3 0 4 2 1 1
    4 0 4 1 1 1
    4 2 . 1 1 1
    3 0 . 2 1 1
    3 1 . 2 1 1
    4 0 . 1 1 1
    4 0 . 1 1 1
    3 2 . 1 1 1
    3 0 . 1 1 1
    2 0 4 2 1 1
    3 0 1 1 1 1
    4 1 . 2 1 1
    4 0 3 1 1 1
    3 0 1 1 1 1
    4 1 . 1 1 0
    3 0 . 1 1 0
    2 0 . 2 0 0
    2 0 . 1 0 0
    3 0 . 2 0 0
    3 3 4 2 0 0
    2 1 . 2 0 0
    4 0 4 1 0 0
    3 0 . 2 0 0
    4 0 4 2 1 0
    3 3 1 1 1 0
    end
    label values q6b LABI
    label def LABI 1 "much worse", modify
    label def LABI 2 "worse", modify
    label def LABI 3 "same", modify
    label def LABI 4 "better", modify
    label def LABI 5 "much better", modify
    label def LABI 9 "don't know", modify
    label values q25a LABP
    label def LABP 0 "never", modify
    label def LABP 1 "only once", modify
    label def LABP 2 "a few times", modify
    label def LABP 3 "often", modify
    label values q91 q91
    label def q91 1 "not at all important", modify
    label def q91 3 "somewhat important", modify
    label def q91 4 "very important", modify
    label values q101 LABAR
    label def LABAR 1 "male", modify
    label def LABAR 2 "female", modify

  • #2
    Recode the "don't know" value(s) on the variable(s) to some missing value code before you run the regression command. You need not save the data file with these recoded values. Is there some reason you don't want to do this?
    Code:
    recode TheListOfVariablesYouWant (9 = .a)
    probit ....
    See: -help missing values-; -help recode-.
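    If the recoded values should not persist even in memory, one option is to wrap the recode in -preserve- and -restore-. This is only a sketch: it assumes that 9 is the "don't know" code in every listed variable (the example data confirm that only for q6b) and that the outcome q25c exists in the full dataset, since it is not part of the -dataex- example.
    Code:
    preserve                          // snapshot the data currently in memory
    recode q6b q91 q101 (9 = .a)      // 9 -> extended missing .a (assumed "don't know" code)
    probit q25c i.q101 q91 q6b if democracy == 0
    restore                           // bring back the original, unrecoded values
    Because estimation commands drop observations with missing values (including extended ones such as .a), the recoded cases are excluded without an extra "if" condition.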


    • #3
      not sure where the confusion comes from as you appear to understand the "if" qualifier but try this:

      Code:
      . probit q25c i.q101 q91 q6b if democracy == 0 & q6b!="don't know":LABI
      this can easily be extended to other variables in exactly the same way (but be sure to reference the actual value label for that variable)

      note that your regression cannot be estimated on your example data as the apparent outcome variable was not included in your data example

      edit: crossed with #2
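      For readers unfamiliar with the "text":labelname syntax used above, here is a small self-contained sketch. The three-observation dataset is made up for illustration; only the label LABI and its codes come from the example in #1.
      Code:
      clear
      input byte q6b
      3
      9
      4
      end
      label define LABI 1 "much worse" 2 "worse" 3 "same" 4 "better" 5 "much better" 9 "don't know"
      label values q6b LABI
      * look up the numeric code that LABI assigns to the text "don't know"
      display "don't know":LABI
      * the same lookup inside an -if- qualifier; equivalent here to q6b != 9
      count if q6b != "don't know":LABI
      The -display- line shows 9, the code defined for "don't know" in LABI, and the -count- line reports the 2 observations that are not coded 9.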


      • #4
        Thanks for the responses. Mike Lacy, the reason is that I'm exploring multiple variables for a model. I want to be able to quickly run an analysis on a variable, without having to recode it first, until I'm certain it's a useful variable. Rich Goldstein, the confusion is that, instead of specifying the condition for just one variable as you did for q6b, I want the condition to apply to all variables. So instead of saying 'if q6b != "don't know"' I would like to say 'if all the variables in the model != "don't know"' or something of that sort. By the way, what's the purpose of the ":LABI"? I looked it up but couldn't find anything.


        • #5
          Your goal is still unclear to me: do you want to exclude an observation only if all variables are equal to "don't know", or do you want to exclude any observation for which any of the variables is equal to "don't know"? I had assumed the latter and told you to just extend the "if" conditions using "&"s. As for the reference to "LABI": that is the name of the value label for that variable and, to me, it is clearer to exclude based on something easily readable than to refer to the numeric code for a particular value (which I might not remember). Note that the example you show has "don't know" only for that one variable; "don't know" doesn't even exist in the labels defined for the other variables.
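          If the "any variable" reading is the intended one, the chain of "&" conditions can also be built by a loop instead of typed by hand. The following is a rough sketch, not a tested recipe: it assumes the -dataex- example from #1 is in memory, and the marker variable dk is a hypothetical name introduced here.
          Code:
          generate byte dk = 0
          foreach v of varlist q6b q91 q101 {
              * name of the value label attached to this variable (empty if none)
              local lbl : value label `v'
              if "`lbl'" != "" {
                  * flag observations whose value carries the "don't know" label
                  replace dk = 1 if `v' == "don't know":`lbl'
              }
          }
          count if democracy == 0 & !dk
          * q25c is not in the example data, so the model line is left commented out
          * probit q25c i.q101 q91 q6b if democracy == 0 & !dk
          One caveat, stated as my understanding rather than a guarantee: when a label contains no "don't know" entry, the lookup evaluates to missing, so dk then also flags observations that are missing on that variable. That should not change the estimation sample, because the regression would drop those observations anyway.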


          • #6
            I agree with Rich here about clarity. I'm thinking that what you want should be very easy (one-liner, probably), so I suspect you might be missing something useful and easy.


            • #7
              Thanks, folks. I went ahead with the initial suggestion and just used the & operator.
