forvalues and "no observations" problem

Ale Duccio

Join Date: Nov 2017

Posts: 4
#1


17 Nov 2017, 05:06

Hello everyone,
I need help with the following code. I am having trouble running this forvalues loop in my event study on M&A:

forvalues i=1(1)`r(max)' {
    l id event_id if id==`i' & dif==0
    qui reg ret market_return if id==`i' & estimation_window==1
    predict p if id==`i'
    replace predicted_return = p if id==`i' & event_window==1
    drop p
}
*

Every time I run this code, which is meant to compute an expected market-model return from my data on stock returns and market returns, I get an error message that says I have "no observations" after id=43 (the total number of ids is roughly 1800).
At id 43 I have a column of "NA" data, so I guess that's the problem.
I have tried putting "capture" in front of "reg", and in front of "predict" and "replace" too, but it still doesn't work.
Do you have any ideas about how to override or avoid this issue?
Thanks in advance to anyone who will reply.

Alessandro Duccio
Dave Airey

Join Date: Apr 2014

Posts: 398
#2

17 Nov 2017, 07:19

Is the second line of your code "isid ..." and not "l id ..."? If it's isid, wouldn't you get an error at that line if all your IDs are NAs?
Ale Duccio

Join Date: Nov 2017

Posts: 4
#3

17 Nov 2017, 10:26

Hi Dave, thanks for your reply. The second line is "list", not "isid". Sorry for the imprecision. What do you think about the code? Feel free to ask if you need further details.

Alessandro
Dave Airey

Join Date: Apr 2014

Posts: 398
#4

17 Nov 2017, 10:48

You should search through the archives directly or via Google. I did and found that "capture noisily" solves your problem in a forvalues loop:

Code:

clear
sysuse auto
recode rep78 (1 2=1) (3=2) (4=3) (5=4), generate(repair)
replace repair = . if repair==1
forvalues i = 1(1)4 {
    capture noisily regress weight length if repair==`i'
}
Ale Duccio

Join Date: Nov 2017

Posts: 4
#5

17 Nov 2017, 11:01

Again, thank you Dave. I put that command in the code as you suggested, but it did not solve the problem. It displayed the results of all the regressions it ran, but in the end the loop still did not get past id 43 with the calculations.

Robert Picard

Join Date: Mar 2014
Posts: 1536

#6

17 Nov 2017, 12:54

You need to capture the error for the regress command AND only proceed if there was no error. To give readers a bit of context, you appear to be using code from the Princeton University Library on event studies using Stata. The code you use in #1 comes from that site, which also has a separate data preparation page. I think that the instructions are a bit dated, so I have reworked them to use modern Stata syntax for the data preparation. This requires two datasets that can be downloaded to Stata's current directory using:

Code:

copy http://dss.princeton.edu/sampleData/eventdates.dta eventdates.dta
copy http://dss.princeton.edu/sampleData/stockdata.dta stockdata.dta

The code below follows the instructions for using trading days:

Code:

use "eventdates.dta", clear
by company_id: gen eventcount=_N
by company_id: keep if _n==1
sort company_id
keep company_id eventcount 
save "eventcount.dta", replace

use stockdata, clear
merge m:1 company_id using "eventcount.dta", keep(match) nogen

expand eventcount
drop eventcount
bysort company_id date: gen set = _n
isid company_id set date, sort
save "stockdata2.dta", replace

use "eventdates.dta", clear
by company_id: gen set = _n
sort company_id set 
save "eventdates2.dta", replace

use "stockdata2.dta", clear
merge m:1 company_id set using "eventdates2.dta", keep(match) nogen

egen group_id = group(company_id set)
save "raw_data2use", replace

The next step is to clean up the data and calculate the event and estimation windows. As per the instructions at the end of the data preparation page, group_id is used in lieu of company_id:

Code:

use "raw_data2use.dta", clear
rename company_id company_id0
isid group_id date, sort

* the number of days from the observation to the event date using trading days
by group_id: gen tday = _n
by group_id: egen etday = total(tday / (date == event_date))
gen dif = tday - etday if etday != 0

* define event window
by group_id: gen event_window = inrange(dif, -2, 2)
by group_id: egen count_event_obs = total(event_window)

* define estimation window
by group_id: gen estimation_window = inrange(dif, -60, -31)
by group_id: egen count_est_obs   = total(estimation_window)

drop if count_event_obs < 5
drop if count_est_obs < 30

save "data2use.dta", replace
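The etday line above works because dividing by a false condition (0) yields a missing value, which egen's total() skips, so the sum keeps only the trading-day counter on the event date. Here is a minimal sketch on made-up data; the numeric dates and group ids are hypothetical and chosen only to show the mechanics:

```stata
* hypothetical toy data: group 1 has an observation on its event date,
* group 2 does not (its event_date, 99, never appears in date)
clear
input group_id date event_date
1 10 12
1 11 12
1 12 12
1 13 12
2 10 99
2 11 99
end

bysort group_id (date): gen tday = _n

* tday / 0 is missing and total() ignores missing values, so only the
* tday of the event date survives; with no match the total is 0
by group_id: egen etday = total(tday / (date == event_date))
gen dif = tday - etday if etday != 0

list, sepby(group_id)
```

In group 2 no observation falls on the event date, so etday is 0 and dif stays missing, which is why dif is only generated if etday != 0.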

Here's how to estimate Normal Performance using runby (from SSC). To replicate the error that you describe in #1, I insert missing data for id 43. With runby, if the user's program terminates with an error, no results are saved for that by-group, so the my_NP program wraps its commands in an overall capture block. This way, the program stops at the first error but the by-group's observations remain in the data.

Code:

* Estimating Normal Performance using runby (from SSC)
clear all
use "data2use.dta"

* create a problem for id 43
egen id = group(group_id)
replace market_return = . if id == 43

program my_NP
    capture {
        reg ret market_return if estimation_window == 1 
        predict pwanted if event_window
    }
end
runby my_NP, by(group_id)

With the results still in memory, here's how you would replicate the results using Princeton's approach:

Code:

gen predicted_return=.
sum id, meanonly
local N = r(max)

forvalues i=1(1)`N' { 
    l id company_id if id==`i' & dif==0
    cap noi reg ret market_return if id==`i' & estimation_window==1 
    if _rc == 0 {
        predict p if id==`i'
        replace predicted_return = p if id==`i' & event_window==1 
        drop p
    }
}  

assert pwanted == predicted_return

Needless to say, I think the runby approach is much simpler and will be more efficient, particularly if the number of by-groups is large.
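For readers who want to try the capture-then-check-_rc pattern without downloading the Princeton data, here is a minimal sketch that reuses the auto dataset setup from #4. The variable name fitted and the skip message are illustrative assumptions, not part of the Princeton code:

```stata
* Sketch: skip by-groups where regress fails instead of stopping the loop
clear
sysuse auto
recode rep78 (1 2=1) (3=2) (4=3) (5=4), generate(repair)
replace repair = . if repair == 1      // group 1 now has no observations

gen fitted = .
forvalues i = 1/4 {
    * capture traps the error; noisily still shows output and messages
    capture noisily regress weight length if repair == `i'
    if _rc == 0 {
        * only predict when the regression actually ran
        predict double p if repair == `i'
        replace fitted = p if repair == `i'
        drop p
    }
    else {
        display as text "group `i' skipped: regress returned r(" _rc ")"
    }
}
```

Prefixing capture alone only hides the error message; the commands after the failed regress would still run. Testing _rc right after the captured command is what lets the loop move on to the next group.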


Dave Airey

Join Date: Apr 2014

Posts: 398
#7

17 Nov 2017, 13:03

I downloaded runby. Neat!

Robert Picard

Join Date: Mar 2014
Posts: 1536

#8

17 Nov 2017, 13:16

Upon further reflection, if execution time is a concern, you will get there much faster using rangestat (also from SSC):

Code:

clear all
use "data2use.dta"

* create a problem for id 43
egen id = group(group_id)
replace market_return = . if id == 43

* the rangestat solution
gen one = 1
rangestat (reg) ret market_return, interval(estimation_window one one) by(group_id)
gen pwanted2 = market_return * b_market_return + b_cons if event_window

* the runby solution
program my_NP 
    capture {
        reg ret market_return if estimation_window == 1 
        predict pwanted if event_window
    }
end
runby my_NP, by(group_id) status

assert pwanted2 == pwanted


Ale Duccio

Join Date: Nov 2017

Posts: 4
#9

17 Nov 2017, 15:22

Thanks for your reply, Robert! The code I was using is indeed from the Princeton University Library, as you said. I have to say that I am a newbie in Stata, so it will take me a bit of time to adapt the new version you provided to my specific case, but it seems perfect. I will keep you posted about this issue over the next few days.
Again, thanks to all of you; it really helps to receive advice from more "senior" users.

Alessandro

Robert Picard

Join Date: Mar 2014
Posts: 1536

#10

18 Nov 2017, 10:12

With even more time to let this simmer in my head, I had a second look at the Princeton pages and it turns out that you can simplify the whole procedure even more. The data preparation code (for cases where there can be multiple events per company) is far more complicated than needed and can be reduced to:

Code:

use "eventdates.dta", clear
bysort company_id (event_date): gen event_id = _n
joinby company_id using "stockdata.dta"
egen group_id = group(company_id event_id)
isid group_id date, sort
save "raw_data2use.dta", replace

In other words, joinby is used to form all pairwise combinations of daily stock observations with event observations within each company_id.
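To see what joinby does here, a minimal sketch with made-up data may help. The toy values below are hypothetical, dates are plain integers for brevity, and the code should be run as a do-file because it uses a tempfile:

```stata
* hypothetical stock data: 3 trading days for company 1, 2 for company 2
clear
input company_id date
1 1
1 2
1 3
2 1
2 2
end
tempfile stock
save `stock'

* hypothetical events: two for company 1, one for company 2
clear
input company_id event_date
1 2
1 3
2 1
end
bysort company_id (event_date): gen event_id = _n

* each event row is paired with every stock row of the same company
joinby company_id using `stock'
egen group_id = group(company_id event_id)
isid group_id date, sort
list, sepby(group_id)
```

Company 1 contributes 2 events x 3 days and company 2 contributes 1 event x 2 days, so the result has 8 observations in 3 groups, each group holding a full copy of its company's price history.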

There is also no need to further clean the data in order to perform the analysis. You can estimate normal performance based on trading days using:

Code:

use "raw_data2use.dta", clear

* define trading days, would be better to use Stata's business calendar
by group_id: gen  tday = _n
by group_id: egen etday = max(tday * (date == event_date))
label var etday "Event trading day"

* estimation window bounds
gen low  = etday - 60
gen high = etday - 31

* regression using observations within the estimation window bounds
rangestat (reg) ret market_return, interval(tday low high) by(group_id)

* calculate predicted returns for the event window
gen pwanted = market_return * b_market_return + b_cons if inrange(tday, etday-2, etday+2)

You can spot check the results for group_id == 2 using:

Code:

regress ret market_return if group_id == 2 & inrange(tday, low, high)
predict pcheck if group_id == 2
list group_id event_date date tday etday b_market_return b_cons pwanted pcheck if ///
    group_id == 2 & inrange(tday, etday-2, etday+2)

and the results:

Code:

. * spot check for group_id == 2
. regress ret market_return if group_id == 2 & inrange(tday, low, high)

      Source |       SS           df       MS      Number of obs   =        30
-------------+----------------------------------   F(1, 28)        =     28.28
       Model |  .001001188         1  .001001188   Prob > F        =    0.0000
    Residual |    .0009913        28  .000035404   R-squared       =    0.5025
-------------+----------------------------------   Adj R-squared   =    0.4847
       Total |  .001992488        29  .000068706   Root MSE        =    .00595

-------------------------------------------------------------------------------
          ret |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
market_return |   .7696168   .1447239     5.32   0.000     .4731633     1.06607
        _cons |   .0011913   .0010937     1.09   0.285    -.0010491    .0034318
-------------------------------------------------------------------------------

. predict pcheck if group_id == 2
(option xb assumed; fitted values)
(57,792 missing values generated)

. list group_id event_date date tday etday b_market_return b_cons pwanted pcheck if ///
>         group_id == 2 & inrange(tday, etday-2, etday+2)

       +-------------------------------------------------------------------------------------------------+
       | group_id   event_d~e        date   tday   etday   b_marke~n      b_cons     pwanted      pcheck |
       |-------------------------------------------------------------------------------------------------|
  354. |        2   05sep2007   31aug2007    168     170   .76961676   .00119135    .0106553    .0106553 |
  355. |        2   05sep2007   04sep2007    169     170   .76961676   .00119135    .0097479    .0097479 |
  356. |        2   05sep2007   05sep2007    170     170   .76961676   .00119135   -.0065687   -.0065687 |
  357. |        2   05sep2007   06sep2007    171     170   .76961676   .00119135    .0046023    .0046023 |
  358. |        2   05sep2007   07sep2007    172     170   .76961676   .00119135   -.0113157   -.0113157 |
       +-------------------------------------------------------------------------------------------------+
