  • Conditional Group Means

    Hi, suppose I have the following dataset:

    ID GroupID X1 X2
    1 1 7 10
    2 1 7 100
    3 1 11 200
    4 1 11 100
    5 1 50 10
    I want to generate three variables as follows:
    • Variable 1: For each individual i, the value of this variable is the average over X2 for all j≠i, given that X2i = X2j. Example: for ID=1, the new variable would take a value of 100.
    • Variable 2: For each individual i, the value of this variable is the average over X2 for all j≠i, given that X2i < X2j. Example: for ID=1, the new variable would take a value of (200+100+10)/3.
    • Variable 3: For each individual i, the value of this variable is the average over X2 for all j≠i, given that X2i < X2j. Example: for ID=1, the new variable would take a value of (10+100+200+100)/4.
    I appreciate your help a lot!

  • #2


    Sorry, but I can only understand this by changing all the definitions! In particular you write X2 sometimes where you mean X1 and I think your second inequality should be >= not >.

    Note that the text for the second two definitions is identical, so there is some error there.

    Variable 1: For each individual i, the value of this variable is the average over X2 for all j≠i, given that X1i = X1j. Example: for ID=1, the new variable would take a value of 100.

    Variable 2: For each individual i, the value of this variable is the average over X2 for all j≠i, given that X1i > X1j. Example: for ID=1, the new variable would take a value of (200+100+10)/3.

    Variable 3: For each individual i, the value of this variable is the average over X2 for all j≠i, given that X1i >= X1j. Example: for ID=1, the new variable would take a value of (10+100+200+100)/4.

    Your data example can be used, but in general we mean what we say when we ask you to use dataex (SSC) to show data.

    With rangestat (SSC) I get these results. Here the offsets 1 1000 and 0 1000 are arbitrary ways to get values in the right intervals. You might need to change 1 (if X1 is not integer-valued) or 1000 (if values of X1 within a group can differ by more than 1000).

    Code:
    clear
    input ID    GroupID    X1    X2
    1    1    7    10
    2    1    7    100
    3    1    11    200
    4    1    11    100
    5    1    50    10
    end
    
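    * others in the group with X1 in [X1[i] + 0, X1[i] + 0], i.e. the same X1
    rangestat mean1 = X2, interval(X1 0 0) excludeself by(GroupID)
    
    * others in the group with X1 in [X1[i] + 1, X1[i] + 1000], i.e. a larger X1
    rangestat mean2 = X2, interval(X1 1 1000) excludeself by(GroupID)
    
    * others in the group with X1 in [X1[i] + 0, X1[i] + 1000], i.e. the same or a larger X1
    rangestat mean3 = X2, interval(X1 0 1000) excludeself by(GroupID)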
    
    list
    
         +-----------------------------------------------------+
         | ID   GroupID   X1    X2   mean1       mean2   mean3 |
         |-----------------------------------------------------|
      1. |  1         1    7    10     100   103.33333   102.5 |
      2. |  2         1    7   100      10   103.33333      80 |
      3. |  3         1   11   200     100          10      55 |
      4. |  4         1   11   100     200          10     105 |
      5. |  5         1   50    10       .           .       . |
         +-----------------------------------------------------+
    For the original announcement of this program see http://www.statalist.org/forums/foru...s-within-range

    Search the forum for other mentions.
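
    If the arbitrary upper offset of 1000 is a worry, here is a sketch of an alternative for the third variable. It assumes that rangestat reads a system missing value (.) as the upper offset to mean "no upper limit"; that is my reading of the help file, so please check help rangestat before relying on it. The name mean3open is made up for this example, and (mean) is spelled out explicitly.

    ```stata
    clear
    input ID    GroupID    X1    X2
    1    1    7    10
    2    1    7    100
    3    1    11    200
    4    1    11    100
    5    1    50    10
    end

    * fixed upper offset, as in the code above
    rangestat (mean) mean3 = X2, interval(X1 0 1000) excludeself by(GroupID)

    * ASSUMPTION: . as the upper offset means an open-ended interval
    rangestat (mean) mean3open = X2, interval(X1 0 .) excludeself by(GroupID)

    * stops with an error if the two versions disagree in any observation
    assert mean3 == mean3open
    ```

    The assert is there so that the assumption is tested rather than trusted. The strict inequality for the second variable still needs a small positive lower offset either way.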



    • #3
      Two of the definitions are still wrong. I am to blame for that too.

      In this version I have made the notation more Stata-like.

      Variable 1: For each individual i, the value of this variable is the average over X2 for all j != i, given that X1[j] = X1[i]. Example: for ID=1, the new variable would take a value of 100.

      Variable 2: For each individual i, the value of this variable is the average over X2 for all j != i, given that X1[j] > X1[i]. Example: for ID=1, the new variable would take a value of (200+100+10)/3.

      Variable 3: For each individual i, the value of this variable is the average over X2 for all j != i, given that X1[j] >= X1[i]. Example: for ID=1, the new variable would take a value of (10+100+200+100)/4.

      The examples are crucially helpful in indicating what is wanted.
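
      As a cross-check on these definitions, here is a brute-force sketch that uses only official Stata commands. It loops over observations and applies each definition literally within GroupID. The names check1 to check3 are made up for this example, and the code assumes no missing values in X2. On the example data it should give the same values as mean1 to mean3 in #2. The loop is quadratic in the number of observations, so it is only sensible for checking small datasets.

      ```stata
      clear
      input ID    GroupID    X1    X2
      1    1    7    10
      2    1    7    100
      3    1    11    200
      4    1    11    100
      5    1    50    10
      end

      * check1-check3 are hypothetical names for the brute-force results
      forvalues k = 1/3 {
          generate double check`k' = .
      }

      * for each observation i, average X2 over the qualifying j != i
      forvalues i = 1/`=_N' {
          * every other observation in the same group with nonmissing X1
          local others "_n != `i' & GroupID == GroupID[`i'] & !missing(X1)"

          summarize X2 if `others' & X1 == X1[`i'], meanonly
          replace check1 = r(mean) in `i'

          summarize X2 if `others' & X1 > X1[`i'], meanonly
          replace check2 = r(mean) in `i'

          summarize X2 if `others' & X1 >= X1[`i'], meanonly
          replace check3 = r(mean) in `i'
      }

      list ID X1 X2 check1-check3
      ```

      When no other observation qualifies, as for ID=5, r(mean) is not defined and the result stays missing, which matches what rangestat returns.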



      • #4
        Sorry for messing up all the definitions and for not using dataex (unfortunately, I don't know how to edit my initial post)! What you provided is exactly what I was looking for: Thank you very much!



        • #5
          Thanks for the closure.

          Editing the original post is only possible within 1 hour of first posting. A good reason for that restriction is that such editing might make nonsense of much of the following discussion.
