  • #31
    I think there is a typographical problem that has crept into your code somehow. When I create a suitable data set and directly type -gen x2= (d.revt - d.rect)/L.at- at the command window, it runs without error messages. When I then -drop x2-, and copy/paste your code (which looks the same to the eye) into the command window and run it, I get the same error message you do.

    Dissecting your code a bit by creating a data set in which that one command is the actual data in an observation, the -charlist- command shows me that it contains 3 non-printing characters in addition to the ones we can see with our eyes. So I think all you need to do is delete that line of code from your do-file and then type it in directly from the keyboard and you'll be fine. If that's not the case, please post back and include an example of your data using the -dataex- command.

    If the code was at any point passed through a word processing program or some other program (or maybe even the Statalist Forum editor), it can get "contaminated" with "control characters" used by that program. These characters are not recognized by Stata and can completely confuse the parser, leading to error messages, and sometimes error messages that are misleading.
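One way to check a suspect line without installing anything is to compare its length in bytes with its length in characters. The sketch below is only an illustration: it assumes Stata 14 or later (for the Unicode string functions), and the variable names cmd, bytes, chars, and suspicious are made up for this example. A typographic dash such as U+2013 takes three bytes in UTF-8, so the two lengths differ on the contaminated line.

```stata
* Store a clean and a "contaminated" copy of the command as string data
clear
set obs 2
gen str80 cmd = ""
replace cmd = "gen x2= (d.revt - d.rect)/L.at" in 1
* uchar(8211) is U+2013 (en dash), used here to mimic the pasted character
replace cmd = "gen x2= (d.revt " + uchar(8211) + " d.rect)/L.at" in 2

gen bytes = strlen(cmd)                // length in bytes
gen chars = ustrlen(cmd)               // length in Unicode characters
gen byte suspicious = bytes != chars   // 1 if a multi-byte character is present
list cmd bytes chars suspicious
```

A line flagged this way is a candidate for retyping by hand, as suggested above.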


    • #32
      Originally posted by Mahmud Hossain
      Hi all:
      I'm new to Stata. I was trying to estimate discretionary accruals using a user-posted command and got an error message.
      The full message is posted below!
      Any help would be simply great!!
      Regards,

      Mahmud

      gen sic_2= substr(sic,1,2)

      . destring sic_2, replace
      sic_2: all characters numeric; replaced as byte

      .
      . egen combo= group(sic_2 fyear)
      (575 missing values generated)

      gen uhat=.
      (254,697 missing values generated)

      .
      end of do-file

      . do "C:\Users\mhossain\AppData\Local\Temp\STD784_00000 0.tmp"

      . xtset gvkey fyear
      panel variable: gvkey (unbalanced)
      time variable: fyear, 1995 to 2018, but with gaps
      delta: 1 unit

      .
      . gen obs= [_n]

      . summ obs

      Variable | Obs Mean Std. Dev. Min Max
      -------------+---------------------------------------------------------
      obs | 254,697 127349 73524.84 1 254697

      . scalar e= r(min)

      . scalar f= r(max)

      .
      . gen ta= (ib-oancf)/L.at
      (81,677 missing values generated)

      . gen x1= 1/L.at
      (65,104 missing values generated)

      . gen x2= (d.revt – d.rect)/L.at
      d: operator invalid
      r(198);

      end of do-file

      r(198);
      I think Clyde's suspicion is correct. I assume you copied the code from the following page: https://robsonglasscock.wordpress.co...cruals-update/

      The author mentioned right under -gen x2= (d.revt – d.rect)/L.at- in the original post above that there was some issue when the code is copied from the webpage into a do-file. The fix recommended by the author is below.

      gen x2= (d.revt – d.rect)/L.at
      /* The minus sign above pasted in an odd manner in the original blog post
      triggering one error. This was simply a cut and paste issue. To fix, manually
      change the cut-and-pasted minus sign to a new, manually entered minus sign. */
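For reference, here is a minimal sketch of the same line retyped with a plain keyboard hyphen. The toy panel is invented purely so the snippet runs on its own; only the variable names (gvkey, fyear, revt, rect, at) follow the thread.

```stata
* Toy panel: 2 firms, 3 fiscal years each (made-up numbers)
clear
set obs 6
gen gvkey = ceil(_n/3)
gen fyear = 2000 + mod(_n - 1, 3)
gen revt  = 100 + 10*_n
gen rect  = 20 + _n
gen at    = 500 + 5*_n
xtset gvkey fyear

* Same command as above, with an ASCII minus sign typed from the keyboard
gen x2 = (D.revt - D.rect)/L.at
list gvkey fyear x2
```

The first year of each firm is missing because the difference and lag operators have no prior year to draw on.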


      • #33
        Dear all,

        First, thank you very much for your code to estimate discretionary accruals. However, my supervisor said that the regression model for each industry needs to have a significant F statistic, or else that industry needs to be discarded. Could you please help me write the code for this condition?

        Thank you very much in advance.

        Celine.


        • #34
          Hi Celine,

          I'm not sure, but it could mean that he/she wants you to include the sic2 industry indicators, and then remove those that aren't statistically significant.
          I had a Stata dataset of Compustat firms in IT industries (2-digit SIC of 35, 36, 38, 48, 73) for 1996-2005 handy.

          If you actually have to do an F-test (calc an F-statistic) testing whether all of the industry indicators are jointly==0, then see:
          Code:
          gen sic2 = int(dnum / 100)
          
          . tabulate year sic2
          
              fiscal |
                year |             2-Digit SIC (int(dnum / 100))
            (=yeara) |        35         36         38         48         73 |     Total
          -----------+-------------------------------------------------------+----------
                1996 |       219        378        177        203        640 |     1,617 
                1997 |       227        394        188        212        688 |     1,709 
                1998 |       211        366        172        208        705 |     1,662 
                1999 |       208        374        167        230        873 |     1,852 
                2000 |       199        419        168        241        930 |     1,957 
                2001 |       189        397        160        225        820 |     1,791 
                2002 |       184        378        154        193        753 |     1,662 
                2003 |       171        358        155        180        663 |     1,527 
                2004 |       154        364        148        165        590 |     1,421 
                2005 |       146        365        148        160        547 |     1,366 
          -----------+-------------------------------------------------------+----------
               Total |     1,908      3,793      1,637      2,017      7,209 |    16,564
          Code:
          * Some bogus regressions using firm's end-of-year market capitalization (in $1996 constant dollars) on some variables I had handy
          
          . reg marketcap_end96 data12 begin_asset96 acq_cash_assets i.yeara i.sic2
          * omitted year==1996;  omitted sic2 = 35 (Computer Hardware)
          * data12 was sales under old Compustat variable names
          
          
                Source |       SS           df       MS      Number of obs   =    13,610
          -------------+----------------------------------   F(16, 13593)    =    678.18
                 Model |  1.4590e+12        16  9.1188e+10   Prob > F        =    0.0000
              Residual |  1.8277e+12    13,593   134459720   R-squared       =    0.4439
          -------------+----------------------------------   Adj R-squared   =    0.4433
                 Total |  3.2867e+12    13,609   241510406   Root MSE        =     11596
          
          ---------------------------------------------------------------------------------
          marketcap_end96 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ----------------+----------------------------------------------------------------
                   data12 |    1.38661   .0315757    43.91   0.000     1.324717    1.448503
            begin_asset96 |   .1605242   .0193756     8.28   0.000     .1225454    .1985031
          acq_cash_assets |   91.53494    49.6281     1.84   0.065    -5.743008    188.8129
                          |
                    yeara |
                    1997  |   343.8581   447.2731     0.77   0.442    -532.8592    1220.575
                    1998  |   1102.299   446.9197     2.47   0.014     226.2741    1978.323
                    1999  |   3523.423     453.76     7.76   0.000     2633.991    4412.856
                    2000  |   1890.619   443.6988     4.26   0.000     1020.908     2760.33
                    2001  |   436.2058   437.8795     1.00   0.319    -422.0987     1294.51
                    2002  |  -454.7354   446.9616    -1.02   0.309    -1330.842    421.3712
                    2003  |   95.90589   456.6593     0.21   0.834    -799.2096    991.0213
                    2004  |     3.8964   460.7752     0.01   0.993    -899.2869    907.0797
                    2005  |   289.9874   483.7544     0.60   0.549    -658.2382    1238.213
                          |
                     sic2 |
                      36  |   1188.402   350.7109     3.39   0.001     500.9604    1875.844
                      38  |   37.11065   416.2306     0.09   0.929    -778.7589    852.9802
                      48  |   1540.933   426.3282     3.61   0.000     705.2712    2376.596
                      73  |   1036.231   326.0885     3.18   0.001     397.0522    1675.409
                          |
                    _cons |  -835.9087   417.3979    -2.00   0.045    -1654.066   -17.75097
          ---------------------------------------------------------------------------------
          If the above regression were your result, presumably your advisor would want you to omit the control for sic2==38.
          (NOTE: All of the industry indicator variables lose their significance when I cluster by firm.)


          Code:
          . reg marketcap_end96 data12 begin_asset96 acq_cash_assets i.yeara i.sic2, vce(cluster gvkey)
          
          Linear regression                               Number of obs     =     13,610
                                                          F(16, 2591)       =      29.29
                                                          Prob > F          =     0.0000
                                                          R-squared         =     0.4439
                                                          Root MSE          =      11596
          
                                           (Std. Err. adjusted for 2,592 clusters in gvkey)
          ---------------------------------------------------------------------------------
                          |               Robust
          marketcap_end96 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          ----------------+----------------------------------------------------------------
                   data12 |    1.38661   .2775466     5.00   0.000     .8423744    1.930846
            begin_asset96 |   .1605242   .1334535     1.20   0.229    -.1011621    .4222106
          acq_cash_assets |   91.53494    51.6728     1.77   0.077    -9.789227    192.8591
                          |
                    yeara |
                    1997  |   343.8581   71.83448     4.79   0.000     202.9993    484.7169
                    1998  |   1102.299   193.6093     5.69   0.000     722.6541    1481.943
                    1999  |   3523.423   486.7877     7.24   0.000     2568.891    4477.955
                    2000  |   1890.619   376.4225     5.02   0.000       1152.5    2628.739
                    2001  |   436.2058   180.2204     2.42   0.016     82.81518    789.5965
                    2002  |  -454.7354   133.4354    -3.41   0.001    -716.3862   -193.0847
                    2003  |   95.90589   130.7137     0.73   0.463    -160.4079    352.2197
                    2004  |     3.8964   165.8468     0.02   0.981    -321.3093     329.102
                    2005  |   289.9874   251.0689     1.16   0.248    -202.3287    782.3034
                          |
                     sic2 |
                      36  |   1188.402   944.6753     1.26   0.209    -663.9926    3040.797
                      38  |   37.11065   793.7182     0.05   0.963    -1519.275    1593.497
                      48  |   1540.933   1212.884     1.27   0.204    -837.3858    3919.253
                      73  |   1036.231   952.5825     1.09   0.277    -831.6692    2904.131
                          |
                    _cons |  -835.9087   738.2989    -1.13   0.258    -2283.624    611.8069
          ---------------------------------------------------------------------------------
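For the joint test mentioned earlier in this post (whether all industry indicators are jointly zero), one option is testparm after the regression. The snippet below is a sketch on simulated data, not the Compustat sample; y, x, and the three sic2 values are placeholders.

```stata
* Simulated data with three hypothetical 2-digit SIC groups
clear
set seed 34
set obs 500
gen sic2 = 35 + mod(_n, 3)
gen x = rnormal()
gen y = 2*x + rnormal()

regress y x i.sic2
* Joint F-test that all sic2 indicator coefficients equal zero
testparm i.sic2
display "F = " r(F) "   p = " r(p)
```

The same testparm call can follow a regression estimated with vce(cluster gvkey), in which case the test uses the clustered variance estimate.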


          • #35
            David Benson, you are proposing dummy variables for each industry. However, the Modified Jones model uses cross-sectional industry regressions, so the F-test here refers to the F-test of each industry-specific regression model, not to the industry-specific coefficients in a pooled regression.
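A rough sketch of how a per-group F-test could be collected is shown here. It runs on simulated data with three hypothetical industries, and the names p_F and keep_industry as well as the 0.05 cutoff are assumptions for illustration. In the thread's setting the group would be the industry-year rather than the industry alone.

```stata
* Simulated data: industries 1 and 2 have real slopes, industry 3 is noise
clear
set seed 12345
set obs 300
gen industry = ceil(_n/100)
gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()
gen ta = cond(industry < 3, 0.5*x1 - 0.3*x2 + 0.2*x3, 0) + rnormal()

* Run one regression per industry and store the p-value of its model F-test
gen double p_F = .
levelsof industry, local(groups)
foreach g of local groups {
    quietly regress ta x1 x2 x3 if industry == `g', nocons
    replace p_F = Ftail(e(df_m), e(df_r), e(F)) if industry == `g'
}

* Flag industries whose regression is significant at the assumed 5% level
gen byte keep_industry = p_F < 0.05 if !missing(p_F)
tabulate industry keep_industry
```

Observations with keep_industry equal to 0 could then be dropped or set aside, depending on what the supervisor intends.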
            Regards
            --------------------------------------------------
            Attaullah Shah, PhD.
            Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
            FinTechProfessor.com
            https://asdocx.com
            Check out my asdoc program, which sends outputs to MS Word.
            For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.


            • #36
              David Benson Attaullah Shah Robson Glasscock

              Thank you very much for your support. However, my question may not have been clear enough, so your answers do not quite address it.

              After using Robson Glasscock's code to estimate discretionary accruals, I already have results. I also ran a check, and the output is as follows:

              Code:
              preserve
              
              . reg ta x1-x4 if Indus=="Basic Materials" & year==2008 & obs!=627, nocons
              
                    Source |       SS       df       MS              Number of obs =       7
              -------------+------------------------------           F(  4,     3) =    4.58
                     Model |  .068752185     4  .017188046           Prob > F      =  0.1208
                  Residual |  .011253915     3  .003751305           R-squared     =  0.8593
              -------------+------------------------------           Adj R-squared =  0.6718
                     Total |  .080006101     7  .011429443           Root MSE      =  .06125
              
              ------------------------------------------------------------------------------
                        ta |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                        x1 |   -1814563    1595033    -1.14   0.338     -6890670     3261543
                        x2 |  -.2785775   .0786377    -3.54   0.038    -.5288377   -.0283172
                        x3 |   .0966852   .0601268     1.61   0.206    -.0946651    .2880354
                        x4 |   5.61e+07   1.93e+07     2.90   0.063     -5486172    1.18e+08
              ------------------------------------------------------------------------------
              
              . predict u_hat,resid
              (100 missing values generated)
              
              . list u_hat if obs==627
              
                   +-----------+
                   |     u_hat |
                   |-----------|
              627. | -.2343639 |
                   +-----------+
              
              .
              end of do-file

              Here is my supervisor's comment : "The models for each industry need to have a significant F statistic or else you need to discard that industry. DA need to be worked out on all companies except those in your firm list".

              Based on her comment, should I keep the result above, given that the F-test is not significant?
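As a quick check on the numbers, the F statistic and its p-value can be recomputed by hand from the sums of squares in the table above. The scalar names here are arbitrary.

```stata
* Values copied from the regression table above
scalar ms_model = .068752185/4        // model SS / model df
scalar ms_resid = .011253915/3        // residual SS / residual df
scalar Fstat    = ms_model/ms_resid
scalar pval     = Ftail(4, 3, Fstat)  // upper-tail probability of F(4, 3)
display "F(4, 3) = " %5.2f Fstat "   Prob > F = " %6.4f pval
```

This should reproduce the F(4, 3) of 4.58 and Prob > F of 0.1208 reported in the output, which is above the conventional 0.05 threshold.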

              Thank you very much in advance.

              Regards,

              Celine.
              Last edited by Celine Tran; 06 Dec 2018, 17:40.


              • #37
                Hi everyone,

                I followed all your recommendations and was testing the code from Robson Glasscock's website and from Clyde and Nick's updates against Roychowdhury's (2006) table 4 and my coefficient for the Suspect_NI variable (defined as 1 if Net Income/ Assets is greater than 0 but less than 0.005) seems to have the wrong sign. I downloaded all the data from Compustat as in the paper (annual data between 1987 and 2001) but since the code still took forever I only used data from 1991 to 2001. Would anyone be able to let me know what I could have possibly done wrong in my attempt at combining all the different resources, which resulted in me getting opposite sign results? I will attach my code here and the tables I obtained.

                Code:
                clear all
                cls
                set maxvar 32767 
                 
                use "C:\Users\ejsequeira\Desktop\Test_Roychow\roych_data.dta"
                 
                
                 gen sic_2 = substr(sic, 1,2)
                 destring sic_2, replace
                  
                 *gen date = year(datadate)
                 
                 *format date %ty
                 
                 destring gvkey, replace
                 sort gvkey fyear
                 
                 gen dup =1 if gvkey==gvkey[_n-1] & fyear==fyear[_n-1]
                 sum dup
                 
                 drop if dup==1
                 drop dup
                 
                 drop if fyear <1991
                
                 egen combo = group (sic_2 fyear)
                 levelsof combo, local (a)
                
                 
                 
                 xtset gvkey fyear 
                 gen u_hat_ram1=.
                 gen obs = [_n]
                 summ obs
                 
                 *gen runn=1
                 
                 
                 gen oancf_ram=oancf/L1.at
                 gen l_at=L1.at
                 gen ram1=1/L1.at
                 gen ram2=sale/L1.at
                 gen ram3=S1.sale/L1.at
                 
                 
                 *compress 
                 forvalues j = 1/`=_N'{
                    capture noisily{
                        reg oancf_ram ram1-ram3 if sic_2==sic_2[`j'] & fyear == fyear[`j'] & _n!= `j', nocons
                        if e(N)>=10{
                            predict uhat_2 in `j', resid
                            replace u_hat_ram1= uhat_2 in `j'
                            drop uhat_2
                            }
                        }
                    }
                 
                 
                 summ u_hat_ram1
                 
                 rename u_hat_ram1 ab_cfo
                 summ ab_cfo
                 
                 gen prod = (cogs+S1.invt)/L1.at
                 *gen l_at=L1.at
                 gen ram4=1/L1.at
                 gen ram5=sale/L1.at
                 gen ram6=S1.sale/L1.at
                 gen ram7=S1.L1.sale/L1.at
                 gen uhat_prod=.
                 
                  forvalues j = 1/`=_N'{
                    capture noisily{
                        reg prod ram4-ram7 if sic_2==sic_2[`j'] & fyear == fyear[`j'] & _n!= `j', nocons
                        if e(N)>=10{
                            predict uhat_2 in `j', resid
                            replace uhat_prod= uhat_2 in `j'
                            drop uhat_2
                            }
                        }
                    }
                 rename uhat_prod ab_prod
                 summ ab_prod
                 
                 drop ram4 ram5 ram6 ram7 l_at
                 
                 gen disexp = (xrd+xsga)/L1.at
                 count if obs!=. & disexp==.
                 replace disexp = xsga/L1.at if xrd==.
                 
                 gen l_at=L1.at
                 gen ram8=1/l_at
                 gen ram9 = L1.sale/L1.at
                 gen uhat_disexp=.
                 
                 
                  forvalues j = 1/`=_N'{
                    capture noisily{
                        reg disexp ram8 ram9 if sic_2==sic_2[`j'] & fyear == fyear[`j'] & _n!= `j', nocons
                        if e(N)>=10{
                            predict uhat_2 in `j', resid
                            replace uhat_disexp= uhat_2 in `j'
                            drop uhat_2
                            }
                        }
                    }
                 
                 rename uhat_disexp ab_dis_exp
                 summ ab_dis_exp
                 
                 gen ram = ab_prod - ab_cfo - ab_dis_exp
                 summ ram
                 
                 
                 gen size=log(csho*prcc_f)
                 gen m2b= (csho*prcc_f)/seq
                 *drop suspect_ni
                 gen suspect_ni = 0
                 replace suspect_ni=1 if (ni/at>0 & ni/at<0.005)
                 
                 gen ni_reg = ni/at
                 
                 reg ab_cfo size m2b ni_reg suspect_ni
                 
                 reg ab_dis_exp size m2b ni_reg suspect_ni
                 
                 reg ab_prod size m2b ni_reg suspect_ni
                The output I obtained is as follows:

                [Image attachment: AB_CFO.png]
                [Image attachment: AB_PROD.png]
                [Image attachment: DIS_EXP.png]
                Thank you in advance for any help you may provide me with!



                • #38
                  Emmanuel,
                  It looks like you have many more observations (firm-years) than Roychowdhury. From his Table 1 on page 347, it looks like he has 21,758 firm-years compared to your minimum number of firm-years of 62,476 in the regressions above. You also say that you eliminated four years of the data (1987-1990), so I'm not sure why you have so many more observations than he does.

                  The second paragraph of page 344 talks about his process for eliminating firms from his sample. Did you follow these same procedures? He eliminates certain two-digit SIC codes, and it looks like he also requires 15 firm-years where your code looks like it requires 10.

                  Before I estimate models when replicating a paper, I try to see if the number of observations I have and the summary statistics for the variables line up with what the authors published. If I were you, I would see if I could get closer to Roychowdhury's number of observations in his sample and then compare the descriptive statistics you have to his Table 1.
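A sketch of the two sample filters discussed above (excluding certain 2-digit SIC codes and requiring a minimum cell size) might look like the following. The data are fabricated so the snippet runs on its own, the excluded codes 40 and 60 are placeholders rather than the paper's actual list, and the cell is assumed to be the industry-year (sic_2 by fyear) as in the code posted earlier.

```stata
* Fabricated sample: 5 hypothetical industries, 2 fiscal years
clear
set obs 400
gen sic_2 = 10 + 10*ceil(_n/80)     // industries 20, 30, 40, 50, 60
gen fyear = 1991 + mod(_n, 2)       // 40 rows per industry-year
drop if sic_2 == 30 & fyear == 1991 & _n > 90   // leave one cell with 5 rows

* Placeholder exclusion list: take the actual SIC codes from the paper
drop if inlist(sic_2, 40, 60)

* Count observations per industry-year and drop cells below the threshold
bysort sic_2 fyear: gen n_cell = _N
drop if n_cell < 15

count
tabulate sic_2 fyear
```

Comparing the resulting count and descriptive statistics with the published table is then a matter of summarize and tabstat on the retained sample.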


                  • #39
                    Robson,

                    Thank you for your reply! After eliminating as many observations as I could, I got as close to 21,758 as possible. The signs and the magnitudes of the three important variables did appear to be close to those in Roychowdhury (2006), so at least it appears I am on the right track. I really appreciate your help.


                    Thanks once again!
                    Last edited by Emmanuel Sequeira; 11 Apr 2019, 23:54.


                    • #40
                      Happy to help, and glad it worked. Good luck with the rest of your project.


                      • #41
                        Originally posted by Ali Ahmed
                        Now I am able to solve it. Thanks a lot!
                        Hi all,

                        I am trying to calculate discretionary accruals just like Ali and got an invalid syntax error, r(198). I wonder how to solve this. The following is the syntax that I use.


                        egen CompanyName_n = group( CompanyName)
                        . egen GICS_n = group( GICS)
                        . vallist GICS_n
                        . local a =r(list)
                        . vallist Year
                        . local b =r(list)
                        . gen uhat=.
                        . xtset CompanyName_n Year
                        . gen obs= [_n]
                        . summ obs
                        . scalar e= r(min)
                        . scalar f= r(max)
                        . gen ta=(NIBE-CFO)/LTA
                        . gen x1=1/LTA
                        . gen x2=(Δrevenue-Δreceivable)/LTA
                        . gen x3=PPE/LTA
                        . foreach i in `a’ {
                        2. foreach x in `b’ {
                        3. forvalues j= `=scalar(e)’/`=scalar(f)’ {
                        4. capture noisily reg ta x1 x2 x3 if GICS_n ==`i’ & Year ==`x’ & obs != `j’,nocons
                        6. capture noisily predict GICS_n, resid
                        7. capture noisily replace GICS_n =. if e(N) < 8
                        8. capture noisily replace uhat= uhat_2 if GICS_n ==`i' & Year ==`x' & obs== `j'
                        9. capture noisily drop GICS_n
                        10. di `i'
                        11. di `x'
                        12. di `j'
                        13. }
                        14. }
                        15. }
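Judging from the listing alone, and without having run it, a few things look likely to trigger errors. Several macro references close with a curly quote (’) rather than the straight apostrophe (') that Stata expects. predict is asked to create GICS_n, which already exists. uhat_2 is used but never created. Whether names such as Δrevenue are accepted may depend on the Stata version, so plain ASCII names are the safer choice. The sketch below shows one way the loop could be written with straight quotes. It runs on simulated data, uses the official levelsof in place of vallist, and skips observations outside the current cell, which is a shortcut not in the original code.

```stata
* Simulated data: 2 hypothetical sectors x 2 years, 10 rows per cell
clear
set seed 41
set obs 40
gen GICS_n = ceil(_n/20)
gen Year   = 2015 + mod(_n, 2)
gen ta = rnormal()
gen x1 = runiform()
gen x2 = rnormal()
gen x3 = runiform()
gen uhat = .
gen obs = _n

levelsof GICS_n, local(a)
levelsof Year, local(b)
foreach i of local a {
    foreach x of local b {
        forvalues j = 1/`=_N' {
            * Only estimate for observations that belong to this sector-year
            if GICS_n[`j'] == `i' & Year[`j'] == `x' {
                * Leave observation j out of its own sector-year regression
                quietly regress ta x1 x2 x3 if GICS_n == `i' & Year == `x' & obs != `j', nocons
                if e(N) >= 8 {
                    * Residual for observation j goes into a temporary variable
                    quietly predict double uhat_2 in `j', resid
                    quietly replace uhat = uhat_2 in `j'
                    drop uhat_2
                }
            }
        }
    }
}
summarize uhat
```

Typing the quotes directly in the Do-file Editor, rather than pasting from a word processor or web page, avoids the curly-quote problem.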


                        • #42
                          Originally posted by Emmanuel Sequeira
                          Robson,

                          Thank you for your reply! After eliminating as many observations as I could, I got as close to 21,758 as possible. The signs and the magnitudes of the three important variables did appear to be close to those in Roychowdhury (2006), so at least it appears I am on the right track. I really appreciate your help.


                          Thanks once again!
                          Which observations did you eliminate? I now have the same problem.
