Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    Clyde Schechter All have to go back to the main proposition:

    If analyst i follows firms j, k, and l in quarter t, the fe_other on analyst i following firm j in t this observation is calculated as other analysts’ average forecast error on firms k and l (not j the focal firm) in quarter t‒1.

    For example if I am the analyst who follows Apple (j), Netflix (k) and Facebook (l) in 2022q3 (t), the fe_other on - me the analyst, Apple (j), 2022q3 (t) - this observation would be calculated by using the average forecast error of other analysts on Netflix (k) and Facebook (l) in 2022q2 (t-1).

    It doesn't matter whether I followed Netflix (k) and Facebook (l) in preceding quarter, 2022q2, as I don't need my own forecast error on Netflix (k) and Facebook (l) in preceding quarter to calculate the fe_other on Apple (j) in current quarter, what I need is other analysts' forecast errors on Netflix (k) and Facebook (l) in preceding quarter.

    That's why it is important that a given analyst follows more than one company in quarter t. If this is not the case, we could not even proceed to consider anything about other analysts and preceding quarter.

    For the first row - analys #70 who follows company #1174 in 2021q2. Apparently, this analyst does not follow other company in the same quarter. So there is no peer average forecast error (in t-1) on other firm that this analyst follows in current quarter (as he only follows one firm in current quarter!). fe_other for this observation will therefore be a missing value (i.e. '' . '')

    When you say this analyst does not follow other companies in the same quarter, which quarter are you referring to? 2021q2? 2021q1 (which would be t-1)?
    I'm referring to 2021q2.

    For the second row - analys #70 who follows company #512 in 2021q3. Again, this analyst only follows one firm this quarter. fe_other will be a missing value.

    Yes, but in t-1, q2021q2, he/she does follow another firm, namely 1174. So why isn't the result the value of fe_own in 2021q2, namely -.0004757?
    The thing is, this analyst does not follow any other firm in 2021q3 (t). Say if this analyst is following Apple in current quarter (t), but he/she does not follow Netflix or Facebook in current quarter - no other firms at all. In this case, we even don't need to proceed further to consider anything about other analysts and preceding quarter. fe_other for this analyst following Apple in t has to be a missing value.

    Finally when we move to row 23 - analys #53035 who follows company #427 in 2021q3. We could see that this analyst also follows another company #512 in the same quarter. Great, that's what we love to see. Now, we have to figure out peer analysts' (aka analysts other than #53035) average forecast error on company #512 in 2021q2, i.e. previous quarter. With my own manual calculation, the fe_other for this would be 0.003795.

    Apparently it is important that this analyst follows more than one company in 2021q3. But I don't understand why that is relevant for that observation: what I thought should matter is which companies the analyst followed in t-1, i.e., in 2021q2.
    I hope that up to this point this question could be solved. E.g. This analyst #53035 follows both Apple and Netflix in 2021q3. fe_other on this observation - analyst #53035, Apple, 2021q3 - would be the average forecast error of other analysts (not #53035) on Netflix in 2021q2. Same logic holds if now our focus is on - analyst #53035, Netflix, 2021q3 - The fe_other for this observation would be the average forecast error of other analysts on Apple in 2021q2.

    Comment


    • #17
      Let me paraphrase my understanding of this, which I'm pretty sure is wrong, and maybe you can try to explain in the kind of terms that I'm thinking about.

      For wanted1, for any given analyst A, time T and cusip Q, we want the mean value of fe_own for all observations of analyst A in time T-1 for all cusips other than Q, but for which there is also an observation for analyst A in time T.

      For wanted2, for any given analyst A, time T and cusip Q, we want the mean value of fe_own for all observations of analysts other than analyst A in time T-1 for all cusips other than Q and for which there is also an observation for analyst A in time T.

      I have, I think, implemented my understanding in the following code. However, it does not produce results that look like yours.
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear*
      input float a_cusip double analys float(year_q fe_own)
      1174     70 245  -.0004757255
       512     70 246             0
       427     70 249  -.0015798463
      1174   2739 245 -.00050076365
       512   9700 241             0
      1174   9700 241   -.006176837
      1174   9700 242  -.0005642372
       512   9700 246   .0012755102
       427   9700 251 -.00058193103
       427  24192 245  -.0016429804
       427  24192 246    -.00275303
       427  24192 249   -.002903501
       427  24192 251  -.0005334368
       427  30139 241  -.0030612755
      1174  30139 245   .0002729162
       512  30139 246 -.00009920635
       512  30139 249    -.00338763
       427  49483 241    -.00523508
      1174  49483 241   -.013449565
       427  53035 241  -.0031310765
      1174  53035 242  -.0005642372
       512  53035 245  -.0001686625
       427  53035 246  -.0023209574
       512  53035 246  -.0014880953
       427  53035 249  -.0029590095
       512  53035 249  .00011803588
      1174  53035 249   -.001895612
       427  53035 251 -.00024247127
      1174  57524 242  -.0007359615
       512  57524 245    .005059875
      1174  57524 245   .0018778638
       427  57524 246  -.0022177189
       512  57524 246   .0009920635
      1174  57524 247 -.00011457116
      1174  57524 249  -.0022618098
      1174  71787 245  -.0001752673
       427  71787 249   -.003415884
       427  71787 251 -.00029096552
      1174  74192 247   .0007103412
       512  75101 245   -.001686625
       512  75101 246  .00056689343
       427  75101 247   .0004582951
       427  75101 249  -.0039282665
       427  78782 241   -.004636785
       512  78782 241             0
      1174  78782 241   -.013150685
      1174  78782 242   -.000932218
       427  78782 245  -.0012733098
       427  78782 246   -.001797117
      1174  78782 247 -.00009165693
       512  78782 249             0
       427  78782 251   -.000387954
       427  82682 241   -.003041332
       512  82682 241   .0021708792
       427  82682 245 -.00008214902
       512  82682 245    .006409175
       512  82682 246  -.0004251701
      1174  82682 249  -.0021971867
      1174  89984 241   -.010510585
       427  89984 245  -.0011090117
       427  89984 246   -.001797117
      1174 106368 241   -.005753424
      1174 106368 242  -.0007359615
       427 106368 249   -.005038429
       512 106368 249   -.003186969
      1174 108054 241   -.005952678
      1174 108054 242  -.0006868974
       512 108054 245   .0045538875
       427 108054 246  -.0017588804
       512 112207 241   -.013025275
      1174 112207 242  -.0003679808
       427 112207 249   -.001622545
       512 112207 249 -.00023607176
      1174 112207 249  -.0015294143
      1174 112415 241    -.00622665
       512 112415 245   .0045538875
       427 112415 249   -.003885568
      1174 122937 247  -.0002291423
       427 122937 249   -.003330487
       427 130592 241    .001545595
      1174 130592 242  -.0007114295
       427 130592 245  -.0006982667
       512 130592 245    .003879238
      1174 157521 249  -.0021541046
      1174 163969 242  -.0006868974
       427 164367 246  -.0022177189
       427 186081 247    .002635197
       512 186081 249   -.008498584
      1174 186081 249  -.0016586606
       512 193740 249   -.003068933
       427 196999 249   -.002860803
      1174 196999 249  -.0019817762
      end
      format %tq year_q
      
      tempfile copy
      save `copy'
      
      //  VERIFY ANALYST, QUARTER, AND CUSIP UNIQUELY IDENTIFY OBSERVATIONS
      isid analys year_q a_cusip
      
      //  ASSOCIATE TO EACH OBSERVATION ALL OBSERVATIONS BY THE SAME ANALYST IN
      //  THE PRECEDING QUARTER
      rangejoin year_q -1 -1 using `copy', by(analys)
      
      //  IDENTIFY THOSE FOR WHICH THE ANALYST ALSO OBSERVED THEM IN THE CURRENT YEAR
      rangestat (count) fe_own, by(analys) interval(year_q_U year_q year_q)
      
      //  CALCULATE ANALYS' MEAN FE OVER ALL CUSIPS HE/SHE COVERED IN BOTH YEAR_Q
      //  AND THE PRECEDING QUARTER, EXCLUDING SELF
      by analys year_q a_cusip, sort: egen wanted1 = ///
          mean(cond(!missing(fe_own_count) & a_cusip_U != a_cusip, fe_own, .))
      drop if missing(wanted1)
          
      //  NOW ASSOCIATE TO EACH OF THESE, ALL OBSERVATIONS OF THE SAME CUSIP_U
      //  BY OTHER ANALYSTS IN PRECEDING QUARTER
      rangejoin year_q year_q_U year_q_U using `copy', suffix("_V")
      drop if (analys == analys_V) | (a_cusip_U != a_cusip_V)
      drop a_cusip_V
      
      //  CALCULATE MEAN OF THESE OTHER ANALYSTS FE'S OVER THE OTHER CUSIPS
      by analys year_q a_cusip, sort: egen wanted2 = ///
          mean(cond(a_cusip_U != a_cusip, fe_own_V, .))
          
      //  REDUCE TO ONE OBSERVATION PER ANALYST, QUARTER, CUSIP
      by analys year_q a_cusip: keep if _n == 1
      keep analys year_q a_cusip wanted*
      
      //  MERGE BACK TO THE ORIGINAL DATA SET
      merge 1:1 analys year_q a_cusip using `copy', assert(match using) nogenerate
      order wanted*, last
      isid analys year_q a_cusip, sort
      Note: I noticed you have 1,000,000 observations in your data set. That requires doing things in ways that conserve memory. If I just keep joining the data sets without dropping irrelevant observations along the way you will end up trying to hold a trillion observations in memory, which will not be possible. So I'm now dropping things liberally as we go, with the thought of joining the wanted1 and wanted2 results back to the original data at the end

      If you can explain in a way I understand where I'm going wrong, I can try one more time to fix it. If not, I think I will have to give up on it here and hope that you can figure out if there is a way to get from this to what you want.

      Comment


      • #18
        Clyde Schechter Thank you for your help. Especially the merge part that makes me realize I could do it this way. So here is my final code that I used to arrive at the fe_other. may not be the fastest or finest way but at least it seems to be working.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float a_cusip double analys float(year_q fe_own)
        1174     70 245  -.0004757255
         512     70 246             0
         427     70 249  -.0015798463
        1174   2739 245 -.00050076365
         512   9700 241             0
        1174   9700 241   -.006176837
        1174   9700 242  -.0005642372
         512   9700 246   .0012755102
         427   9700 251 -.00058193103
         427  24192 245  -.0016429804
         427  24192 246    -.00275303
         427  24192 249   -.002903501
         427  24192 251  -.0005334368
         427  30139 241  -.0030612755
        1174  30139 245   .0002729162
         512  30139 246 -.00009920635
         512  30139 249    -.00338763
         427  49483 241    -.00523508
        1174  49483 241   -.013449565
         427  53035 241  -.0031310765
        1174  53035 242  -.0005642372
         512  53035 245  -.0001686625
         427  53035 246  -.0023209574
         512  53035 246  -.0014880953
         427  53035 249  -.0029590095
         512  53035 249  .00011803588
        1174  53035 249   -.001895612
         427  53035 251 -.00024247127
        1174  57524 242  -.0007359615
         512  57524 245    .005059875
        1174  57524 245   .0018778638
         427  57524 246  -.0022177189
         512  57524 246   .0009920635
        1174  57524 247 -.00011457116
        1174  57524 249  -.0022618098
        1174  71787 245  -.0001752673
         427  71787 249   -.003415884
         427  71787 251 -.00029096552
        1174  74192 247   .0007103412
         512  75101 245   -.001686625
         512  75101 246  .00056689343
         427  75101 247   .0004582951
         427  75101 249  -.0039282665
         427  78782 241   -.004636785
         512  78782 241             0
        1174  78782 241   -.013150685
        1174  78782 242   -.000932218
         427  78782 245  -.0012733098
         427  78782 246   -.001797117
        1174  78782 247 -.00009165693
         512  78782 249             0
         427  78782 251   -.000387954
         427  82682 241   -.003041332
         512  82682 241   .0021708792
         427  82682 245 -.00008214902
         512  82682 245    .006409175
         512  82682 246  -.0004251701
        1174  82682 249  -.0021971867
        1174  89984 241   -.010510585
         427  89984 245  -.0011090117
         427  89984 246   -.001797117
        1174 106368 241   -.005753424
        1174 106368 242  -.0007359615
         427 106368 249   -.005038429
         512 106368 249   -.003186969
        1174 108054 241   -.005952678
        1174 108054 242  -.0006868974
         512 108054 245   .0045538875
         427 108054 246  -.0017588804
         512 112207 241   -.013025275
        1174 112207 242  -.0003679808
         427 112207 249   -.001622545
         512 112207 249 -.00023607176
        1174 112207 249  -.0015294143
        1174 112415 241    -.00622665
         512 112415 245   .0045538875
         427 112415 249   -.003885568
        1174 122937 247  -.0002291423
         427 122937 249   -.003330487
         427 130592 241    .001545595
        1174 130592 242  -.0007114295
         427 130592 245  -.0006982667
         512 130592 245    .003879238
        1174 157521 249  -.0021541046
        1174 163969 242  -.0006868974
         427 164367 246  -.0022177189
         427 186081 247    .002635197
         512 186081 249   -.008498584
        1174 186081 249  -.0016586606
         512 193740 249   -.003068933
         427 196999 249   -.002860803
        1174 196999 249  -.0019817762
        end
        format %tq year_q
        
        tempfile copy
        save `copy'
        
        rangejoin year_q 0 0 using `copy', by(analys)
        drop if a_cusip == a_cusip_U
        save "copy1.dta", replace
        
        use `copy', clear
        merge 1:m a_cusip analys year_q fe_own using "copy1.dta"
        drop _merge
        
        preserve
        keep a_cusip analys year_q fe_own
        rename (a_cusip) =_U
        duplicates drop
        save `copy', replace
        restore
        
        rangejoin year_q -1 -1 using `copy', by(a_cusip_U) suffix("_V")
        drop if analys_V == analys
        by a_cusip analys year_q, sort: egen fe_other = mean(fe_own_V)
        keep a_cusip analys year_q fe_own fe_other
        duplicates drop
        sort analys year_q
        Thanks for your patience and help. Much appreciated!

        Comment

        Working...
        X