  • How to properly define a treatment variable

    Hi everybody,

    I am currently working on my master's thesis and need some help.

    To summarize my situation:
    I want to analyse the effect of a divorce on happiness. When defining the treatment variable for a divorce, I got confused about how to handle the following situation:

    In the dataset there are two variables relevant to that situation:
    - "partner" - a binary variable (1 if you have a partner and 0 if you are single)
    - "partnr" - which is either a positive number or takes the negative value -2

    Whenever "partnr" takes a positive value, "partner" takes the value 1. Otherwise the person is considered single.

    This is my command for defining the treatment, "separation from partner in the second year":

    Code:
        gen sep=1 if (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==1 & f2.partnr!=partnr) | ///
                     (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==0 & f2.partnr!=partnr)

        replace sep=0 if (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==1 & f2.partnr==partnr) | ///
                         (partner==1 & (f1.partner==1) & f1.partnr!=partnr & f2.partner==1 & f2.partnr==partnr) | ///
                         (partner==1 & (f1.partner==1) & f1.partnr!=partnr & f2.partner==1 & f2.partnr!=partnr) | ///
                         (partner==1 & (f1.partner==1) & f1.partnr!=partnr & f2.partner==0 & f2.partnr!=partnr) | ///
                         (partner==1 & (f1.partner==0) & f1.partnr!=partnr & f2.partner==1 & f2.partnr==partnr) | ///
                         (partner==1 & (f1.partner==0) & f1.partnr!=partnr & f2.partner==1 & f2.partnr!=partnr) | ///
                         (partner==1 & (f1.partner==0) & f1.partnr!=partnr & f2.partner==0 & f2.partnr!=partnr)

    After using some commands to restrict the sample (eliminating observations with specific ages etc.), I ran count if sep==1

    and I also ran the command

    count if (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==1 & f2.partnr!=partnr) | (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==0 & f2.partnr!=partnr)

    The two commands give different counts, and my question is: shouldn't both give the same number of observations, since the condition is the same? (Especially after restricting the sample.)

    I hope I asked this in an understandable way.

    Thank you very much in advance.

  • #2
    You'll increase your chances of a helpful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    You'll often find it easier to create dummies with one statement that has a logical expression on the right-hand side, instead of a generate followed by a replace. Something like:
    g sep=(partner==1 & f1.partner==1)
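
    One thing to keep in mind with this form: a logical expression evaluates to 1 or 0 and never to missing, so every observation receives a value unless an if qualifier is added. A minimal sketch, assuming the data are already xtset (or tsset) so that f1. works; sep_all and sep_def are hypothetical names used only for illustration:

```stata
* sep_all and sep_def are hypothetical names, not variables from the thread.
* Without an if qualifier, observations with no lead (f1.partner missing) get 0.
gen byte sep_all = (partner==1 & f1.partner==1)

* With an if qualifier, the dummy stays missing where partner or its lead is missing.
gen byte sep_def = (partner==1 & f1.partner==1) if !missing(partner, f1.partner)

* Compare the two versions, showing missing values as their own category.
tabulate sep_all sep_def, missing
```

    Which of the two is appropriate depends on whether observations without a lead should count as untreated or be left out.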

    You've a long list of conditions. I can't see an obvious problem so let me suggest a way to diagnose this. First, it is likely that your replace is changing some of the 1's to 0's so just run the generate without the replace and see if you get the same results. It might be helpful to generate two variables - one for the 1's and one for the 0's and see if there is overlap. If not, then look carefully at the specific observations and try to see what is going on.
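
    The two-variable check could look roughly like this. It is only a sketch: it assumes the data are already xtset (or tsset), it copies the conditions from the original question, and is_one and is_zero are hypothetical names.

```stata
* is_one flags the observations the generate command would set to 1.
gen byte is_one = ///
    (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==1 & f2.partnr!=partnr) | ///
    (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==0 & f2.partnr!=partnr)

* is_zero flags the observations the replace command would set to 0.
gen byte is_zero = ///
    (partner==1 & (f1.partner==1) & f1.partnr==partnr & f2.partner==1 & f2.partnr==partnr) | ///
    (partner==1 & (f1.partner==1) & f1.partnr!=partnr & f2.partner==1 & f2.partnr==partnr) | ///
    (partner==1 & (f1.partner==1) & f1.partnr!=partnr & f2.partner==1 & f2.partnr!=partnr) | ///
    (partner==1 & (f1.partner==1) & f1.partnr!=partnr & f2.partner==0 & f2.partnr!=partnr) | ///
    (partner==1 & (f1.partner==0) & f1.partnr!=partnr & f2.partner==1 & f2.partnr==partnr) | ///
    (partner==1 & (f1.partner==0) & f1.partnr!=partnr & f2.partner==1 & f2.partnr!=partnr) | ///
    (partner==1 & (f1.partner==0) & f1.partnr!=partnr & f2.partner==0 & f2.partnr!=partnr)

* Any observation with is_one==1 and is_zero==1 is set to 1 and then overwritten with 0.
tabulate is_one is_zero

* Look at the overlapping observations, if there are any.
list partner partnr f1.partner f1.partnr f2.partner f2.partnr if is_one==1 & is_zero==1
```

    If the cross-tabulation shows no overlap, the difference between the two counts has to come from somewhere else, for example from observations dropped between the generate and the count.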



    • #3
      I generated two variables, one for the 1's and one for the 0's. There was indeed an overlap, and finding it helped me solve the problem.

      Thank you very much.
