
  • Keep observations if within range

    Hi everyone,

    I'm trying to manipulate one of my variables such that it only keeps observations that are within a certain date range. I keep on getting an "invalid syntax" error when I enter the command that I think I should be using, but I'm not sure where I have gone wrong with the syntax. I have had a look at the Stata manual and at some online resources but still don't quite know where I'm going wrong, as the code looks fine to me. Here is the command I used:

    Code:
    keep EBreceived if year>=2008 & month>=7
    Any advice would be appreciated.

  • #2
    You can't do this. It's not just invalid syntax; it makes no sense. You can keep or drop a variable in its entirety, but you cannot do so only in some observations, as this command attempts to do. Perhaps what you really mean is that you would like to replace values of EBreceived with missing values if year >= 2008 & month >= 7. If so, the code for that is:

    Code:
    replace EBreceived = . if year >= 2008 & month >= 7
    As an aside, while you can blunder along doing things like this with separate year and month variables, it will probably make your life easier to combine them into a single month-year variable.

    Code:
    gen monthly_date = mofd(mdy(month, 1, year))
    format monthly_date %tm
    Then the code for eliminating values of EBreceived that are out of bounds becomes:

    Code:
    replace EBreceived = . if monthly_date >= tm(2008m7)
    Similarly, any other commands involving date arithmetic or comparisons will be much easier this way.
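
    A minimal sketch of why the combined date helps, using a made-up four-row dataset (the values and the names cond_separate and cond_combined are illustrative assumptions, not from the original data). Testing year and month separately is true only for July to December of each year from 2008 on, so January 2009 fails it, whereas the monthly-date comparison treats everything from 2008m7 onward alike:

    ```stata
    clear
    input year month
    2008 6
    2008 7
    2009 1
    2009 12
    end

    * Single monthly date, built as in the code above
    gen monthly_date = mofd(mdy(month, 1, year))
    format monthly_date %tm

    * Condition on separate year and month variables
    gen byte cond_separate = year >= 2008 & month >= 7

    * Condition on the combined monthly date
    gen byte cond_combined = monthly_date >= tm(2008m7)

    * The 2009m1 row is where the two conditions disagree
    list, sepby(year)
    ```

    Which of the two conditions is wanted depends on what "date range" means in the original question; the sketch only shows that they are not interchangeable.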



    • #3
      Since Claire James wanted to keep the values that satisfy the specified condition, the replace should apply to the opposite condition:

      Code:
      replace EBreceived = . if !(year >= 2008 & month >= 7)
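
      If the aim is instead to drop whole observations outside the range, as the thread title suggests, keep if does that, whereas replace only blanks out one variable and leaves every row in place. A hedged sketch on a made-up dataset; the values are invented, and EB_in_range is a hypothetical name used so that the original variable is left untouched:

      ```stata
      clear
      input year month EBreceived
      2007 9 10
      2008 3 20
      2008 8 30
      2009 11 40
      end

      * Option 1: set the variable to missing outside the condition; all rows stay
      gen EB_in_range = EBreceived
      replace EB_in_range = . if !(year >= 2008 & month >= 7)

      * Option 2: drop the rows that fail the condition
      * (preserve/restore is used here only so the sketch does not lose the data)
      preserve
      keep if year >= 2008 & month >= 7
      count
      restore
      ```

      In this toy data, two rows (2008/8 and 2009/11) satisfy the condition, so Option 1 leaves two non-missing values and Option 2 leaves two observations.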



      • #4
        Yes, of course. Sorry I got that reversed.



        • #5
          Thanks, Clyde and Sergiy. Admittedly I'm still a beginner at Stata, so the mistake wasn't obvious to me. I've used the commands that you suggested and they work fine. Just a point of clarification: does the "!" in the code below indicate that the conditions in the brackets should be reversed?

          Code:
          replace EBreceived = . if !(year >= 2008 & month >= 7)



          • #6
            Claire: Correct. ! is logical not and reverses truth and falsity. See e.g.

            Code:
            help operators
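
            A small sketch of what the negation does, on made-up data (the variable names in_range, negated and de_morgan are illustrative assumptions). Negating the whole bracket gives the same result as reversing each comparison and swapping & for |:

            ```stata
            clear
            input year month
            2007 9
            2008 3
            2008 8
            end

            * The original condition
            gen byte in_range = year >= 2008 & month >= 7

            * The whole bracket negated with !
            gen byte negated = !(year >= 2008 & month >= 7)

            * The same thing written without !, reversing each part
            gen byte de_morgan = year < 2008 | month < 7

            * Both checks pass: negated is the opposite of in_range and equals de_morgan
            assert negated == !in_range
            assert negated == de_morgan
            list
            ```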



            • #7
              On #2: note that

              Code:
              gen mdate = ym(year, month)
              format mdate %tm
              is another way to proceed.
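
              As a quick check that the two constructions agree, here is a sketch on two made-up dates (m1 and m2 are hypothetical names):

              ```stata
              clear
              input year month
              2008 7
              2009 1
              end

              * Construction from #2: daily date for the 1st of the month, converted to monthly
              gen m1 = mofd(mdy(month, 1, year))

              * Construction from #7: monthly date built directly
              gen m2 = ym(year, month)

              * The assertion passes if both give the same monthly date
              assert m1 == m2
              format m1 m2 %tm
              list
              ```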



              • #8
                I am trying to recode blood pressure as follows, and I am getting a mixed-up result. Please help.

                Code:
                gen bp_reading4=""
                replace bp_reading4="Hypotension" if systole4<90 & diastole4<60
                replace bp_reading4="Normal" if inrange (systole4, 90, 119) & inrange(diastole4, 60,79)
                replace bp_reading4="Elevated" if inrange (systole4, 120, 129) & inrange(diastole4, 60,79)
                replace bp_reading4="Stage 1 hypertension" if inrange (systole4, 130,139) | inrange (diastole4, 80,89))
                replace bp_reading4="Stage 2 hypertension" if inrange(systole4, 140,179) & inrange(diastole4,90,120))
                replace bp_reading4="Hypertensive crisis" if systole4>= 180 & diastole4>120



                • #9
                  Hello Collins,

                  Your question is not well posed. "I am trying to recode blood pressure as follows" implies that whatever follows is, by specification, the correct behavior.

                  Applying some common sense, your code appears to follow a table like this.
                  For the last category (Hypertensive crisis), that table clearly specifies "and/or", while you have used only AND (&) for some reason. Hence you are misclassifying people with a very high systolic or diastolic blood pressure measurement.
                  If there is any other confusion, perhaps you could elaborate, for example: "Based on the table (src) I expect a person with s=... and d=... to be in the group "...", but that person is placed into group "..." by my code."

                  Besides that, there are some syntax problems: "inrange (" should be written without a space, and the parentheses are unbalanced in the Stage 1 and Stage 2 lines.

                  Best, Sergiy
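
                  A sketch of how the code in #8 might look with those points addressed. This is an assumption-laden illustration, not a clinical specification: the cut-offs are copied unchanged from #8, only the last category is switched to "or" as suggested above, the six input rows are invented, and the !missing() guard is an addition (Stata treats missing numeric values as larger than any number, so systole4 >= 180 would otherwise be true for a missing reading).

                  ```stata
                  clear
                  input systole4 diastole4
                  85 55
                  110 70
                  125 75
                  135 70
                  150 95
                  190 100
                  end

                  gen bp_reading4 = ""
                  replace bp_reading4 = "Hypotension" if systole4 < 90 & diastole4 < 60
                  replace bp_reading4 = "Normal" if inrange(systole4, 90, 119) & inrange(diastole4, 60, 79)
                  replace bp_reading4 = "Elevated" if inrange(systole4, 120, 129) & inrange(diastole4, 60, 79)
                  replace bp_reading4 = "Stage 1 hypertension" if inrange(systole4, 130, 139) | inrange(diastole4, 80, 89)
                  * Left as "and", as in #8; whether this should be "or" depends on the table being followed
                  replace bp_reading4 = "Stage 2 hypertension" if inrange(systole4, 140, 179) & inrange(diastole4, 90, 120)
                  * "or" per #9, with a guard so missing readings are not counted as very high
                  replace bp_reading4 = "Hypertensive crisis" if (systole4 >= 180 | diastole4 > 120) & !missing(systole4, diastole4)

                  * Empty strings in the table are combinations that no rule covers
                  tabulate bp_reading4, missing
                  ```

                  With "and" in the last line, the 190/100 row would match no rule and stay empty; with "or" it is classified as a hypertensive crisis. Note also that later replace commands override earlier ones, so the order of the lines matters whenever two conditions can both be true.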

