  • Generating New Variables Subject to Multiple Conditions

    Hello All,

    I am learning Stata and have not been able to find the answer to the following question online:

    What syntax do I use in Stata to generate a variable that requires multiple conditions? Here is what I'm trying to use and it's not working:

    bysort id visit_num: generate pre_hepb3_compliant = 1 if age_at_visit >= 176 & if age_at_visit < 570 & pre_hepb_3 ==“Y
    replace pre_hepb3_compliant = 0 if age_at_visit >= 176 & if age_at_visit < 570 & pre_hepb_3 != “Y


    I am getting an error message that says: “Y” invalid name

    I want to create a new variable (pre_hepb3_compliant)
    I want it to equal 1 if
    child's age is between 176 days and 570 days
    AND
    child has received the pre_hepb_3 vaccination

    So there are three conditions to check for each participant before assigning a value, 1 or 0, to the new variable:
    1. age at clinic visit is greater than or equal to 176 days old
    2. age at clinic visit is less than 570 days old
    3. vaccination dose received as indicated by an existing variable (pre_hepb_3 is equal to Y)

    Only then do I want a 1 for the newvar, "pre_hepb3_compliant". Otherwise, I want the newvar "pre_hepb3_compliant" to equal 0.

    Ideas please? I would like to create a whole series of variables that require multiple conditions and so your help will be very much appreciated.

    Dayna Matthew

  • #2
    Code:
    gen byte pre_hepb3_compliant=inrange(age_at_visit,176,570) & pre_hepb_3=="Y"
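    This will give you a 0/1 variable for each observation. Note that inrange() includes both endpoints, so an age of exactly 570 counts as in range; if ages are recorded in whole days and 570 should be excluded, use 569 as the upper limit. Note also that this assumes either that there are no missing data or that you want observations with missing values to fail your conditions (e.g., be equal to 0 if age_at_visit is missing). If that is not what you want, and there are missing values, you can follow the above with, e.g.,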
    Code:
    replace pre_hepb3_compliant = . if age_at_visit==. | pre_hepb_3==""
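
    As a minimal sketch of the two steps above, here they are on a small made-up dataset (the six observations are invented for illustration; only the variable names come from #1). It writes the age range from #1 as explicit comparisons, at least 176 and strictly less than 570, so that an age of exactly 570 gets 0.

    ```stata
    * Toy data: these values are assumptions made up for illustration.
    clear
    input float age_at_visit str1 pre_hepb_3
    180 "Y"
    180 "N"
    570 "Y"
    100 "Y"
    .   "Y"
    400 ""
    end

    * 0/1 indicator: 176 <= age < 570 and dose recorded as "Y".
    generate byte pre_hepb3_compliant = age_at_visit >= 176 & age_at_visit < 570 & pre_hepb_3 == "Y"

    * Optional: set the indicator to missing where either input is missing.
    replace pre_hepb3_compliant = . if missing(age_at_visit, pre_hepb_3)

    list, clean noobs
    ```

    Without the replace step, the last two observations would be coded 0 rather than missing, because a logical expression in Stata always evaluates to 0 or 1.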


    • #3
      The help for if gives examples of compound conditions.

      A detail missed in #1 is that the word if itself appears at most once. It is never repeated.

      Also, the function missing() allows a mix of numeric and string arguments. missing() yields 1 (true) if any of its arguments is missing.

      Code:
       replace pre_hepb3_compliant = . if missing(age_at_visit, pre_hepb_3)
      Last edited by Nick Cox; 14 Jul 2017, 02:16.
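
      To illustrate the point about a single if, here is a hedged sketch of how the two commands in #1 could be written, run on made-up data (the five observations are assumptions for illustration). The conditions are joined with & after one if, and the string literal uses plain double quotes on both sides. The second command is one possible reading of "otherwise 0" that leaves observations with missing inputs as missing.

      ```stata
      * Toy data: these values are assumptions made up for illustration.
      clear
      input float age_at_visit str1 pre_hepb_3
      200 "Y"
      200 "N"
      600 "Y"
      .   "Y"
      200 ""
      end

      * One if qualifier; the three conditions are combined with &.
      generate byte pre_hepb3_compliant = 1 if age_at_visit >= 176 & age_at_visit < 570 & pre_hepb_3 == "Y"

      * Everything else becomes 0, except where an input is missing.
      * missing() accepts the numeric and the string variable together.
      replace pre_hepb3_compliant = 0 if missing(pre_hepb3_compliant) & !missing(age_at_visit, pre_hepb_3)

      list, clean noobs
      ```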


      • #4
        Thank you very much. I will read the help for if now, Nick. Rich, since I am still getting an error message, I wonder whether I should insert a value after "=" and before "inrange"? Also, in the second line of the code, would you recommend that I replace with the value "0" if I want a 0/1 variable for each operation?


        • #5
          Never mind, Rich - I think I get it now. Thank you.


          • #6
            Hi,

            I have a similar question.
            I have a dataset in which I want to count the number of men working in an organization in a specific month. For every man there is a draw, since there are different roles per year.

            The dataset has information about the organization (organization number), the employee (employee number), gender (man=1 or man=0), and the month in which this was measured.
            I want to calculate the total number of men working in a specific organization in a specific month.

            I already tried: egen nman = total(man==1), by (organizationnr)

            But then nman is the same in every month, since Stata totals man==1 over the whole organization, including the months in which no man works there.
            The second problem is that Stata does not distinguish between employee numbers. In this specific example, Stata gives 6 for nman in all months, but it should be 1 in some months, since only 1 man works in the organization; there are simply 6 months of data available for him.

            If I try to use egen nman = total(man==1), by (organizationnr & month) I get an error. I think it is not possible to use two "by" commands?

            Can anyone help me further with this? How do I collapse the data to the organization and month level? Would it help if I created a dummy at the month level?


            • #7
              Have you tried:
              Code:
              bysort organizationnr month : egen nman = total(man==1)
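
              A minimal sketch of this on made-up data follows (the six observations and the variable names employeenr and men are assumptions for illustration). It also shows the by() option with both variables listed, separated by a space rather than &, and one way to reduce the data to one row per organization and month.

              ```stata
              * Toy data: these values are assumptions made up for illustration.
              clear
              input organizationnr employeenr man month
              1 101 1 1
              1 102 0 1
              1 101 1 2
              1 103 1 2
              2 201 0 1
              2 202 0 2
              end

              * Count men within each organization-month group.
              bysort organizationnr month: egen nman = total(man == 1)

              * Same result with the by() option: list both variables, no &.
              egen nman2 = total(man == 1), by(organizationnr month)
              assert nman == nman2

              list, sepby(organizationnr month) noobs

              * One row per organization and month; this replaces the data in memory.
              collapse (sum) men = man, by(organizationnr month)
              list, noobs
              ```

              This assumes one row per employee per month; if an employee can appear more than once in the same month, the total would count him more than once.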
