  • Determine average of a series from data

    Hello dear Stata users,
    As a newbie in this field, I am looking for help with my research.
    I have a data set for the years 1950-2002, which I have divided into 3 subperiods. The first one is 1950-1964 (fy = 1).
    The data also contain a company identifier (firmid), as well as earnings and dividend payments from the previous year, which I need for an OLS regression later.
    I have already written the commands and run the regression, but for the first two subperiods the values are too low (I am trying to replicate an existing study from a paper, which is how I noticed) and for the third subperiod (1984-2002) my values have shot up completely, which should not happen.
    My assumption is that this happens because I have one observation per company per year and regressed them all together.

    For this reason I am wondering whether the results would be different if I summarize a company's several observations within a subperiod. So I want to calculate dividends and earnings for company 1 in the period 1950-1964 as average values.
    Example sample:

    Firmid | earning | dividend | fy
    1 | 8 | 6 | 1
    1 | 1 | 4 | 1
    1 | 7 | 9 | 1
    1 | 4 | 8 | 1
    1 | 6 | 3 | 1
    1 | 1 | 0 | 1
    1 | 3 | 1 | 1


    How do I do this?
    Do I generate a new variable with the egen command and take the average of all values that have the same firmid and the same subperiod?
    What does that look like, and how can I use the if qualifier so that the firmid must be the same? I had thought of "if firmid == same" but that seemed wrong.

    Thanks for your help in advance.

  • #2
    In reply to the post by Steffen Scheifele:
    Try:
    Code:
    bys Firmid: egen mean_dividend = mean(dividend)
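
    Since the question asks for averages per company and per subperiod, for both earnings and dividends, here is a minimal sketch along the same lines. It assumes the variables are named firmid, earning, dividend and fy in lower case (Stata is case-sensitive, so adjust the names to whatever your data actually uses), and it rebuilds the seven example rows from the question as toy data. The by prefix does the job the "if firmid == same" idea was aiming at: the mean is computed separately within each firmid and fy combination. Whether averaging before the regression matches the study being replicated is something only the paper can tell you.

```stata
* Toy data copied from the example in the question (firm 1, subperiod 1).
* Lower-case variable names are an assumption; match them to your data.
clear
input firmid earning dividend fy
1 8 6 1
1 1 4 1
1 7 9 1
1 4 8 1
1 6 3 1
1 1 0 1
1 3 1 1
end

* Option 1: keep every firm-year row and attach the firm-by-subperiod means.
bysort firmid fy: egen mean_earning  = mean(earning)
bysort firmid fy: egen mean_dividend = mean(dividend)

* Option 2: reduce the data to one row per firm and subperiod.
* preserve/restore keeps the original firm-year data intact afterwards.
preserve
collapse (mean) earning dividend, by(firmid fy)
list
restore
```

    With the example rows, the subperiod means for firm 1 are 30/7 (about 4.29) for earning and 31/7 (about 4.43) for dividend. Option 1 is convenient if you still want the yearly observations; Option 2 gives one observation per company and subperiod, which is the form you would regress on if you want each company to count once per subperiod.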
