  • forvalues and "no observations" problem

    Hello everyone,
    I need help with the following code. I am having trouble running this forvalues loop in my event study of M&A deals:

    forvalues i=1(1)`r(max)' {
    l id event_id if id==`i' & dif==0
    qui reg ret market_return if id==`i' & estimation_window==1
    predict p if id==`i'
    replace predicted_return = p if id==`i' & event_window==1
    drop p
    }
    *

    The code is meant to compute expected (market model) returns from my data on stock returns and market returns. Every time I run it, I get a "no observations" error and the loop stops at id 43 (there are roughly 1,800 ids in total).
    At id 43 I have a column of "NA" data, so I guess that's the problem.
    I have tried to apply the "capture" command in front of "reg" and in front of "predict" and "replace" too, but things still don't work.
    Do you have any ideas about how to override or avoid this issue?
    Thanks in advance to anyone who will reply.

    Alessandro Duccio

  • #2
    Is the second line of your code "isid ..." and not "l id ..."? If it's isid, would you not get an error at that line if all your IDs are NAs?

    • #3
      Hi Dave, thanks for your reply. The second line is "list", not "isid". Sorry for the imprecision. What do you think about the code? Feel free to ask if you need further details.

      Alessandro

      • #4
        You should search through the archives directly or via Google. I did and found that "capture noisily" solves your problem in a forvalues loop:

        Code:
        clear
        sysuse auto
        recode rep78 (1 2=1) (3=2) (4=3) (5=4), generate(repair)
        replace repair = . if repair==1
        forvalues i = 1(1)4 {
            capture noisily regress weight length if repair==`i' 
        }

        • #5
          Again, thank you, Dave. I put that command in the code as you suggested, but it did not solve the problem. It displayed the results of every regression it ran, but the loop still did not get past id 43.

          • #6
            You need to capture the error for the regress command AND only proceed if there was no error. To give readers a bit of context, you appear to be using code from the Princeton University Library on event studies using Stata. You can find the data preparation page here and the code you use in #1 here. I think that the instructions are a bit dated so I have reworked these to use modern Stata syntax for the data preparation. This requires two datasets that can be downloaded to Stata's current directory using:
            Code:
            copy http://dss.princeton.edu/sampleData/eventdates.dta eventdates.dta
            copy http://dss.princeton.edu/sampleData/stockdata.dta stockdata.dta
            The code below follows the instructions for using training days:
            Code:
            use "eventdates.dta", clear
            by company_id: gen eventcount=_N
            by company_id: keep if _n==1
            sort company_id
            keep company_id eventcount 
            save "eventcount.dta", replace
            
            use stockdata, clear
            merge m:1 company_id using "eventcount.dta", keep(match) nogen
            
            expand eventcount
            drop eventcount
            bysort company_id date: gen set = _n
            isid company_id set date, sort
            save "stockdata2.dta", replace
            
            use "eventdates.dta", clear
            by company_id: gen set = _n
            sort company_id set 
            save "eventdates2.dta", replace
            
            use "stockdata2.dta", clear
            merge m:1 company_id set using "eventdates2.dta", keep(match) nogen
            
            egen group_id = group(company_id set)
            save "raw_data2use", replace
            The next step is to clean up the data and define the event and estimation windows. As per the instructions at the end of the data preparation page, group_id is used in lieu of company_id:
            Code:
            use "raw_data2use.dta", clear
            rename company_id company_id0
            isid group_id date, sort
            
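            * the number of days from the observation to the event date using trading days
            by group_id: gen tday = _n
            * (date == event_date) is 1 on the event day and 0 otherwise, so the ratio is
            * tday on the event day and missing elsewhere; egen's total() ignores missing
            by group_id: egen etday = total(tday / (date == event_date))
            * etday is 0 when the event date matches no trading day in the group
            gen dif = tday - etday if etday != 0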
            
            * define event window
            by group_id: gen event_window = inrange(dif, -2, 2)
            by group_id: egen count_event_obs = total(event_window)
            
            * define estimation window
            by group_id: gen estimation_window = inrange(dif, -60, -31)
            by group_id: egen count_est_obs   = total(estimation_window)
            
            drop if count_event_obs < 5
            drop if count_est_obs < 30
            
            save "data2use.dta", replace
            Here's how to estimate Normal Performance using runby (from SSC). To replicate the error that you describe in #1, I insert missing data for id 43. With runby, if the user's program terminates with an error, no result is saved for that by-group, so the my_NP program wraps its commands in an overall capture block. This way, the program stops at the first error but the by-group's observations remain in the results.
            Code:
            * Estimating Normal Performance using runby (from SSC)
            clear all
            use "data2use.dta"
            
            * create a problem for id 43
            egen id = group(group_id)
            replace market_return = . if id == 43
            
            program my_NP
                capture {
                    reg ret market_return if estimation_window == 1 
                    predict pwanted if event_window
                }
            end
            runby my_NP, by(group_id)
            With the results still in memory, here's how you would replicate them using Princeton's approach:
            Code:
            gen predicted_return=.
            sum id, meanonly
            local N = r(max)
            
            forvalues i=1(1)`N' { 
                l id company_id if id==`i' & dif==0
                cap noi reg ret market_return if id==`i' & estimation_window==1 
                if _rc == 0 {
                    predict p if id==`i'
                    replace predicted_return = p if id==`i' & event_window==1 
                    drop p
                }
            }  
            
            assert pwanted == predicted_return
            Needless to say, I think the runby approach is much simpler and will be more efficient, particularly if the number of by-groups is large.
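
            For readers who want to try the capture-and-check pattern without the event-study data, here is a minimal sketch that extends the auto-data example from #4. The variable names fitted and p are made up for illustration; the point is only that predict and replace sit inside if _rc == 0, so a group with no observations is skipped instead of stopping the loop.

            ```stata
            * minimal sketch on the auto data; fitted and p are illustrative names
            clear
            sysuse auto
            recode rep78 (1 2=1) (3=2) (4=3) (5=4), generate(repair)
            * empty out group 1 so that its regression has no observations
            replace repair = . if repair == 1
            gen fitted = .
            forvalues i = 1/4 {
                * show any error message but do not stop the loop
                capture noisily regress weight length if repair == `i'
                * _rc holds the return code of the captured command; 0 means success
                if _rc == 0 {
                    predict p if repair == `i'
                    replace fitted = p if repair == `i'
                    drop p
                }
            }
            ```

            For group 1, regress fails with a nonzero return code, the block is skipped, and the loop moves on to groups 2 to 4.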

            • #7
              I downloaded runby. Neat!

              • #8
                Upon further reflection, if execution time is a concern, you will get there much faster using rangestat (also from SSC):

                Code:
                clear all
                use "data2use.dta"
                
                * create a problem for id 43
                egen id = group(group_id)
                replace market_return = . if id == 43
                
                * the rangestat solution
                gen one = 1
                rangestat (reg) ret market_return, interval(estimation_window one one) by(group_id)
                gen pwanted2 = market_return * b_market_return + b_cons if event_window
                
                * the runby solution
                program my_NP 
                    capture {
                        reg ret market_return if estimation_window == 1 
                        predict pwanted if event_window
                    }
                end
                runby my_NP, by(group_id) status
                
                assert pwanted2 == pwanted
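
                Both runby and rangestat are community-contributed commands, so the code above assumes they are already installed. If they are not, a one-time install from SSC should be enough:

                ```stata
                * one-time installation of the two community-contributed commands
                ssc install runby
                ssc install rangestat
                ```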

                • #9
                  Thanks for your reply, Robert! The code I was using is indeed from the Princeton University Library, as you said. I am a newbie in Stata, so it will take me a bit of time to adapt the new version you provided to my specific case, but it seems perfect. I will keep you posted over the next few days.
                  Again, thanks to all of you; it really helps to receive advice from more "senior" users.

                  Alessandro

                  • #10
                    With even more time to let this simmer in my head, I had a second look at the Princeton pages and it turns out that you can simplify the whole procedure even more. The data preparation code (for cases where there can be multiple events per company) is far more complicated than needed and can be reduced to:
                    Code:
                    use "eventdates.dta", clear
                    bysort company_id (event_date): gen event_id = _n
                    joinby company_id using "stockdata.dta"
                    egen group_id = group(company_id event_id)
                    isid group_id date, sort
                    save "raw_data2use.dta", replace
                    In other words, joinby is used to form all pairwise combinations of daily stock observations with event observations within each company_id.
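
                    To see what joinby does here, a toy example may help. This is only a sketch with made-up data: the variables day, ret, and event_day and the tempfile stock are hypothetical stand-ins for the real stock and event files.

                    ```stata
                    * toy stock data: company 1 has two days, company 2 has one
                    clear
                    input company_id day ret
                    1 1 .01
                    1 2 .02
                    2 1 .03
                    end
                    tempfile stock
                    save `stock'

                    * toy event data: company 1 has two events, company 2 has one
                    clear
                    input company_id event_day
                    1 1
                    1 2
                    2 1
                    end
                    bysort company_id (event_day): gen event_id = _n

                    * pair every event with every stock observation of the same company
                    joinby company_id using `stock'
                    egen group_id = group(company_id event_id)
                    isid group_id day, sort
                    list, sepby(group_id)
                    ```

                    Company 1 contributes 2 events x 2 days = 4 observations and company 2 contributes 1, so each company-event pair ends up with its own full copy of that company's stock history.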

                    There is also no need to further clean the data in order to perform the analysis. You can estimate normal performance based on trading days using:
                    Code:
                    use "raw_data2use.dta", clear
                    
                    * define trading days, would be better to use Stata's business calendar
                    by group_id: gen  tday = _n
                    by group_id: egen etday = max(tday * (date == event_date))
                    label var etday "Event trading day"
                    
                    * estimation window bounds
                    gen low  = etday - 60
                    gen high = etday - 31
                    
                    * regression using observations within the estimation window bounds
                    rangestat (reg) ret market_return, interval(tday low high) by(group_id)
                    
                    * calculate predicted returns for the event window
                    gen pwanted = market_return * b_market_return + b_cons if inrange(tday, etday-2, etday+2)
                    You can spot check the results for group_id == 2 using:
                    Code:
                    regress ret market_return if group_id == 2 & inrange(tday, low, high)
                    predict pcheck if group_id == 2
                    list group_id event_date date tday etday b_market_return b_cons pwanted pcheck if ///
                        group_id == 2 & inrange(tday, etday-2, etday+2)
                    and the results:
                    Code:
                    . * spot check for group_id == 2
                    . regress ret market_return if group_id == 2 & inrange(tday, low, high)
                    
                          Source |       SS           df       MS      Number of obs   =        30
                    -------------+----------------------------------   F(1, 28)        =     28.28
                           Model |  .001001188         1  .001001188   Prob > F        =    0.0000
                        Residual |    .0009913        28  .000035404   R-squared       =    0.5025
                    -------------+----------------------------------   Adj R-squared   =    0.4847
                           Total |  .001992488        29  .000068706   Root MSE        =    .00595
                    
                    -------------------------------------------------------------------------------
                              ret |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    --------------+----------------------------------------------------------------
                    market_return |   .7696168   .1447239     5.32   0.000     .4731633     1.06607
                            _cons |   .0011913   .0010937     1.09   0.285    -.0010491    .0034318
                    -------------------------------------------------------------------------------
                    
                    . predict pcheck if group_id == 2
                    (option xb assumed; fitted values)
                    (57,792 missing values generated)
                    
                    . list group_id event_date date tday etday b_market_return b_cons pwanted pcheck if ///
                    >         group_id == 2 & inrange(tday, etday-2, etday+2)
                    
                           +-------------------------------------------------------------------------------------------------+
                           | group_id   event_d~e        date   tday   etday   b_marke~n      b_cons     pwanted      pcheck |
                           |-------------------------------------------------------------------------------------------------|
                      354. |        2   05sep2007   31aug2007    168     170   .76961676   .00119135    .0106553    .0106553 |
                      355. |        2   05sep2007   04sep2007    169     170   .76961676   .00119135    .0097479    .0097479 |
                      356. |        2   05sep2007   05sep2007    170     170   .76961676   .00119135   -.0065687   -.0065687 |
                      357. |        2   05sep2007   06sep2007    171     170   .76961676   .00119135    .0046023    .0046023 |
                      358. |        2   05sep2007   07sep2007    172     170   .76961676   .00119135   -.0113157   -.0113157 |
                           +-------------------------------------------------------------------------------------------------+
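
                    On the comment in the code above about business calendars, one possible way to do it is sketched below. It assumes that date and event_date are regular %td daily dates and that the dates present in the data are a fair description of the trading calendar; the calendar name tradecal and the variables bdate and dif_b are made up for illustration.

                    ```stata
                    * sketch: derive a business calendar from the dates observed in the data
                    use "raw_data2use.dta", clear
                    * writes tradecal.stbcal and creates bdate, the business-date version of date
                    bcal create tradecal, from(date) generate(bdate) replace
                    * distance to the event in trading days
                    * bofd() returns missing if event_date is not a business day in tradecal
                    gen dif_b = bdate - bofd("tradecal", event_date)
                    ```

                    With such a calendar the trading-day distance no longer depends on the sort order within group_id, but events that fall on a non-trading day would need separate handling.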
