calculating an average across observations which have common characteristics

Dan Daugaard

Join Date: Jul 2018

Posts: 61
#1


28 May 2019, 15:53

I am attempting to calculate an average (mean) across observations which have common characteristics. It is an element of an event study looking at fund flows and I am using Stata 15.
Specifically, I am averaging “ff” where the variables with common stub “event” are -2, -1, 0, 1 or 2. The result should be a list of the average “ff” for each of these values.
The data file is as follows:

Code:

* Examples generated by -dataex-. To install: ssc install dataex
clear
input long id float(month ff event_1 event_2)
2708 657      -.005559 -1 -3
2708 658 -.0023612394  0 -2
2708 659  .0009129993  1 -1
2708 660 -.0016764477  2  0
2708 661 -.0025546604  3  1
2709 657      -.005559 -1 -3
2709 658 -.0023612394  0 -2
2709 659  .0009129993  1 -1
2709 660 -.0016764477  2  0
2709 661 -.0025546604  3  1
end
format %tmCCYY_Mon month

The result should look like

Code:

clear
input float (diff ave_ff)
-2 -0.0025
-1 -0.00025
 0 -0.0035
 1  0.0005
 2 -0.0045
end

The “diff” variable being the required range from the variables with event stubs (event*).
My initial plan of attack is to (1) sort by the variables with event stubs (event*),
(2) produce new observations from each of the event* variables,
(3) sort by the variables with event stubs (event*),
(4) remove observations with event* outside the -2 to 2 range,
(5) collapse to averages for each
But please indicate if you think there is a more efficient/precise approach. Thank you, Dan
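
The plan above could be sketched roughly as follows. This is only one reading of the question: it assumes event_1 and event_2 each hold the month of the ff observation relative to an event, and that the averages should pool both events and both ids. The name event_no is hypothetical, and the output is not guaranteed to reproduce the illustrative ave_ff values shown earlier.

```stata
* Sketch only, under the assumptions stated above
clear
input long id float(month ff event_1 event_2)
2708 657      -.005559 -1 -3
2708 658 -.0023612394  0 -2
2708 659  .0009129993  1 -1
2708 660 -.0016764477  2  0
2708 661 -.0025546604  3  1
2709 657      -.005559 -1 -3
2709 658 -.0023612394  0 -2
2709 659  .0009129993  1 -1
2709 660 -.0016764477  2  0
2709 661 -.0025546604  3  1
end
format %tmCCYY_Mon month

* One row per id-month-event; event_no (hypothetical name) indexes the event
reshape long event_, i(id month) j(event_no)
rename event_ diff

* Keep only the -2 to 2 window, then average ff within each relative month
keep if inrange(diff, -2, 2)
collapse (mean) ave_ff = ff, by(diff)
list, clean
```

Reshaping to long form replaces steps (1) to (3) with a single command, and -collapse- with by(diff) then gives one average per relative month.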
Clyde Schechter

Join Date: Apr 2014

Posts: 30060
#2

28 May 2019, 16:27

I have no idea what you mean by "stub "event"" being -2 -1 0 1 or 2. You have two variables, event_1, and event_2. Both of them sometimes take on values in this range. Do you want the average when both are in this range, or if one of them is? Or something else? You refer to a new variable, diff, which you define as "the required range from the variables with event stubs (event*)," which strikes me as about as obscure as one can get.
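
To make the two readings concrete, here is a minimal sketch using the example data from #1. It is only an illustration of the ambiguity, not a proposed answer: the first -summarize- averages ff where both event variables fall in the -2 to 2 range, the second where at least one does.

```stata
* Illustration only: two possible readings of "event is in -2..2"
clear
input long id float(month ff event_1 event_2)
2708 657      -.005559 -1 -3
2708 658 -.0023612394  0 -2
2708 659  .0009129993  1 -1
2708 660 -.0016764477  2  0
2708 661 -.0025546604  3  1
2709 657      -.005559 -1 -3
2709 658 -.0023612394  0 -2
2709 659  .0009129993  1 -1
2709 660 -.0016764477  2  0
2709 661 -.0025546604  3  1
end

* Reading 1: both event variables are in the window
summarize ff if inrange(event_1, -2, 2) & inrange(event_2, -2, 2)

* Reading 2: at least one event variable is in the window
summarize ff if inrange(event_1, -2, 2) | inrange(event_2, -2, 2)
```

The two conditions select different sets of observations in this example, so the resulting means differ.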
Dan Daugaard

Join Date: Jul 2018

Posts: 61
#3

28 May 2019, 18:37

Sorry for the confusion Clyde.
I am attempting to identify (and average) the values of the ff variable (in #1) in observations where an event variable (i.e. event_1 or event_2) equals -2, and then likewise where it equals -1, 0, 1 and 2.
Does that make more sense? Thank you for reading my postings, Dan

Last edited by Dan Daugaard; 28 May 2019, 19:03.
Clyde Schechter

Join Date: Apr 2014

Posts: 30060
#4

28 May 2019, 19:58

I'm afraid I'm still confused. There are two event variables. I don't understand how these two variables relate to the new variable -diff-. I've tried to "reverse engineer" it, but I cannot discern what values of ff went into the values of ave_ff you show. Perhaps if you explained how you calculated the variable diff in your example of desired output, that would help. Please do that in very concrete, specific terms. Say things like "First I identified all observations in which [mention specific logical condition involving the variables event_1 and event_2]. For those observations I calculated diff = [give a specific formula for diff in terms of other variables in the data set]. Then I calculated the average value of ff among all observations where [give a specific logical condition involving diff and/or the other variables]."
Dan Daugaard

Join Date: Jul 2018

Posts: 61
#5

29 May 2019, 20:39

Thank you for considering my problem Clyde, and sorry for the confusion. It was a pretty clumsy setup. As well as following your more logical approach to introducing the relationship between the variables, I should also have used better names for those "event" variables (e.g. something representing the "difference between the month of the ff observation and the date of event 1" etc).

I have received great advice to reorganise the data structure which makes the problem easier to get my mind around. I'll do that and move on to the next steps and hopefully word my next question with greater clarity. Thank you, Dan