  • Replacing egen variable with mean

    Hi all,

    Is it possible to replace a value with the mean of other observations, subject to an if condition?


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float deal_id double patent_tar int class_tar float(cites5yr_tar overlap)
    1 7056957 521  7 .
    1 6491950 424  2 1
    1 6746691 424  0 1
    1 6020305 424  0 1
    1 6676967 424  0 1
    1 6596308 424  1 1
    1 7056494 424  2 1
    1 6155251 128  6 1
    1 6406715 424  1 1
    1 6818229 424  0 1
    1 5855884 424  1 1
    1 6524615 424  0 1
    1 7011848 424  0 1
    2 6165985 514  0 1
    2 6906059 514  1 1
    2 7410962 514  0 1
    2 7262184 514  0 1
    2 7074325 210  0 1
    2 7479378 435  0 1
    2 6300336 514  4 1
    2 7282325 435  0 1
    2 6001833 514  2 1
    2 7081451 514  3 1
    2 6407106 514  2 1
    2 7273860 514  0 1
    2 6069147 514  0 1
    2 5914330 514  1 1
    2 6648212 228  1 .
    2 6040303 514  0 1
    2 6204271 514  0 1
    2 6482820 514  3 1
    2 7041303 424  0 1
    2 7439235 514  0 1
    2 7309706 514  0 1
    2 7074822 514  0 1
    2 7592344 514  0 1
    2 6054461 514  2 1
    2 7427611 514  0 1
    2 6666171 119  1 .
    2 6930207 568  0 1
    2 6602880 514  2 1
    2 7368582 549  0 1
    2 7524833 514  0 1
    2 6854427 119  0 .
    2 6197198 210 21 1
    2 5977127 514  0 1
    2 6770649 514  0 1
    2 6946243 435  2 1
    2 6117879 514  2 1
    2 7534903 549  0 1
    2 7238470 435  0 1
    2 7122357 435  1 1
    2 7452875 514  0 1
    2 6195941  49  1 .
    2 7121069  54  0 .
    2 7220733 514  0 1
    2 5952327 514  0 1
    2 7087620 514  0 1
    2 6900322 546  0 1
    2 7244743 514  1 1
    2 7241770 514  0 1
    2 6566369 514  1 1
    3 7439253 514  0 1
    3 7612087 514  0 1
    3 7232834 514  7 1
    3 7232833 514  4 1
    4 6753158 435  0 1
    4 6753151 435  1 1
    4 6242175 435  1 1
    5 7063943 435  3 1
    5 6492160 435  2 1
    5 6946546 530  1 1
    5 6291650 530 10 1
    5 6342588 530  4 1
    5 7074557 435  0 1
    5 6489123 435  2 1
    5 6140471 530  6 1
    5 6492497 530  3 1
    5 6225447 530  6 1
    5 6180336 435  6 1
    5 6827925 424  0 1
    6 7241863 530  0 1
    6 5874298 435  2 1
    6 6362231 514  3 1
    6 6156539 435  0 1
    6 6534289 435  0 1
    6 5763494 514  3 1
    6 6617358 514  0 1
    6 6383527 424 24 1
    6 6342532 514  2 1
    6 6432656 435  0 1
    6 6796967 604  0 1
    6 6211244 514  0 1
    6 6031003 514  1 1
    6 7262280 530  1 1
    6 7112595 514 10 1
    6 6521667 514  0 1
    6 5688764 424  0 1
    6 5674846 514  0 1
    6 6660753 514 18 1
    end
    Above is an example of my data. I want the average of cites5yr_tar over the observations with overlap == 1. What I've tried is this (for deal_id == 1):

    Code:
    egen overlapqual5 = mean(cites5yr_tar) if deal_id == 1 & overlap == 1
    This does give me what I want. However, instead of creating a separate egen variable for each deal_id (I have over 200 of them in my whole dataset), I wanted to just replace it. I tried:

    Code:
    replace overlapqual5 = mean(cites5yr_tar) if deal_id == 2 & overlap == 1
    But I got the error "unknown function mean()".

    Is there any way to work around this in Stata without creating a separate egen variable for each deal_id?

    Thanks,
    Chris

  • #2
    You need to use the by: prefix (here via bysort) with your egen command:

    Code:
    bysort deal_id: egen overlapqual5 = mean(cites5yr_tar) if overlap == 1
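    A quick way to check what that command produces, sketched under the assumption that the dataex example from the first post is in memory. The capture drop line is only there in case overlapqual5 already exists from an earlier attempt.

    ```stata
    * Assumes the dataex example from the first post is loaded.
    capture drop overlapqual5

    * Mean of cites5yr_tar within each deal, computed over overlap == 1 rows only.
    bysort deal_id: egen overlapqual5 = mean(cites5yr_tar) if overlap == 1

    * Within each deal, min and max should coincide, since the mean is constant.
    tabstat overlapqual5 if overlap == 1, by(deal_id) statistics(min max n)

    * Rows excluded by the if qualifier are left missing in the new variable.
    count if missing(overlapqual5) & overlap != 1
    ```

    Note that the if qualifier restricts both which observations enter the mean and which observations receive a value, so rows with overlap missing stay missing.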
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1



    • #3
      egen functions in Stata can be used only with egen, so not with replace.

      This may help you with technique:

      Code:
      egen mean = mean(whatever), by(deal_id overlap)
      That gives you a new variable, which may be what you want to use with replace afterwards.
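      As a sketch of how that could look with the example data: the variable names mean_overlap and cites_filled are hypothetical, and the final replace is only one guess at what "use with replace" might mean here. The cond() expression is a swapped-in variant, not the command above. It restricts the mean to overlap == 1 rows while still writing the result to every row of the deal.

      ```stata
      * Assumes the dataex example from the first post is loaded.
      * Mean over overlap == 1 rows only, spread to all rows of each deal_id.
      egen mean_overlap = mean(cond(overlap == 1, cites5yr_tar, .)), by(deal_id)

      * Hypothetical follow-up: copy the original variable, then overwrite
      * selected rows with the group mean using replace.
      gen double cites_filled = cites5yr_tar
      replace cites_filled = mean_overlap if missing(overlap)
      ```

      The design point is that egen does the group calculation once for all deals, and replace then only copies values from one variable to another, which is something replace can do.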
