
  • Generating Variable with conditions and sum of other variables excluding observations

    Hello everyone,
    I am currently working on my bachelor thesis and have a question about generating a new variable.
    In my data set I have players who each produce an output.
    I want to generate a variable that gives me an average output. This average should include only observations with the same team and year as the player, but should exclude the player's own observation.
    So I want the average: H/AB
    This is as far as I have come:

    gen teamsBA = H/AB if ((teamID == teamID) & (yearID == yearID))

    Obviously, this does not yet exclude that player's own H and AB.

    Thank you in advance for looking at this and I am happy to hear about any idea you have to solve my problem.

  • #2
    In addition to not excluding that player, your code will not work, because teamID == teamID and yearID == yearID are always true in every observation, so you will just be calculating the value of H/AB in each individual observation.
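
    To illustrate the point, here is a minimal sketch on made-up data (the team codes and numbers are invented for illustration and are not taken from the original data set). Each observation is compared only with itself, so the condition should match every observation:

    ```stata
    clear
    input str3 teamID int yearID int H int AB
    "BOS" 2001 30 100
    "BOS" 2001 50 200
    "NYA" 2001 40 150
    end
    * The condition compares each value with itself, so it filters nothing
    count if (teamID == teamID) & (yearID == yearID)
    * Displays 1 when the number of matches equals the number of observations
    display r(N) == _N
    ```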

    I would approach this as follows:

    Code:
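    * Check the assumption: one observation per player within each team and year
    by teamID yearID playerID, sort: assert _N == 1
    * Totals of H and AB within each team and year
    by teamID yearID: egen totalH = total(H)
    by teamID yearID: egen totalAB = total(AB)
    * Team average excluding the player's own H and AB
    gen teamsBA = (totalH - H)/(totalAB - AB)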
    Note: This assumes that each player appears only once in each batch of observations for a given team and year. This assumption is tested in the first line of the code. If that assumption is incorrect, please post back with an example of your data, using the -dataex- command. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
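
    As a rough check of the arithmetic, the same approach can be tried on a small made-up data set. This is only a sketch: the team codes, player IDs, and numbers below are hypothetical, and it assumes the variables are named teamID, yearID, playerID, H, and AB as in the thread.

    ```stata
    clear
    input str3 teamID int yearID str2 playerID int H int AB
    "BOS" 2001 "p1" 30 100
    "BOS" 2001 "p2" 50 200
    "BOS" 2001 "p3" 20 100
    "NYA" 2001 "p4" 40 150
    "NYA" 2001 "p5" 10  50
    end
    * One observation per player within each team and year
    by teamID yearID playerID, sort: assert _N == 1
    * Team-year totals, then subtract the player's own contribution
    by teamID yearID: egen totalH = total(H)
    by teamID yearID: egen totalAB = total(AB)
    gen teamsBA = (totalH - H)/(totalAB - AB)
    * For p2: (100 - 50)/(400 - 200) = 0.25; for p4: (50 - 40)/(200 - 150) = 0.20
    list teamID yearID playerID H AB teamsBA, sepby(teamID yearID)
    ```

    One thing to keep in mind: if a team and year contains only a single player, the denominator is zero and Stata sets teamsBA to missing for that observation.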

    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    • #3
      Thank you so much!

      It is correct that each player appears only once, so it worked perfectly.
      And thank you for the advice on the -dataex- command. I was wondering how I can include an example that is actually structured and easily understandable.
