  • Expected and Unexpected Results using out of sample predictions

    I am trying to replicate a study. The study estimates a simultaneous expectation model on a smaller sample that has all the variables needed for the computation. The coefficients from that model are then applied to the main sample to compute the expected components of audit and non-audit fees. The study states that, although the unexpected fee numbers are zero by construction in the smaller sample used to estimate the coefficients, they are not zero in the sample used for the analysis, which explains the non-zero values. I have obtained means for audit and non-audit fees in my replication that are similar to those in the study. However, I am unsure about a few things.

    First, I am getting very negative values after I run the two ivregress commands used to estimate the coefficients of the simultaneous expectation model. However, the descriptive statistics in the study contain no negative values, and there are no negative audit fees or non-audit fees in the sample.

    Second, I am not sure how valid this out-of-sample use of the coefficients is: if a variable happens to be missing for an observation, its coefficient will not be used in the calculation of the expected audit and non-audit fees. I assume this would produce invalid results, since values needed to determine the fees would be missing.
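
A minimal sketch of how this can be checked, using simulated data and hypothetical variable names (not those of the study). Here regress stands in for ivregress to keep the example short; the assumption is that predict treats missing covariates the same way after either command. The point it illustrates: predict does not silently drop a coefficient when a covariate is missing, it returns a missing prediction for that observation.

```stata
* Simulated data; variable names are hypothetical, not those of the study.
clear
set seed 2024
set obs 200
generate double size = rnormal(6, 1)
generate double roa  = rnormal(0.05, 0.1)
generate double lnaf = 3 + 0.5*size - 1.0*roa + rnormal(0, 0.3)

* Estimate on the first 100 observations only (the "smaller sample").
regress lnaf size roa in 1/100

* Make one covariate missing for a few observations outside that sample.
replace roa = . in 191/200

* predict fills every observation whose covariates are all non-missing,
* including those outside e(sample); the others get a missing value (.).
predict double exp_lnaf, xb
count if missing(exp_lnaf)
```

So observations with a missing covariate drop out of the expected-fee calculation rather than receiving a partial prediction, which is worth checking with a count like the one above before comparing sample sizes with the study.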

    Third, the model in the study uses the logged values of auditfees and nonauditfees as the dependent variables. However, to replicate the descriptive statistics, I used the unlogged values. I am not sure whether this is acceptable.
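
A small sketch of why the choice between levels and logs matters for the sign of the predictions, again on simulated data with hypothetical variable names. With the fee in levels as the dependent variable, nothing prevents the linear prediction from falling below zero; with the log as the dependent variable, exponentiating the prediction is positive by construction. The back-transform shown is the naive one and ignores retransformation bias, so its mean need not match the mean of the raw fees.

```stata
* Simulated data; variable names are hypothetical.
clear
set seed 2024
set obs 500
generate double size      = rnormal(6, 1.5)
generate double lnaf      = 1 + 0.9*size + rnormal(0, 0.5)
generate double auditfees = exp(lnaf)          // fee in levels, always > 0

* Levels as the dependent variable: xb is free to go below zero.
regress auditfees size
predict double exp_fee_lvl, xb
count if exp_fee_lvl < 0

* Logs as the dependent variable: exp(xb) cannot be negative.
regress lnaf size
predict double exp_lnaf, xb
generate double exp_fee_log = exp(exp_lnaf)    // naive back-transform
count if exp_fee_log <= 0
```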

    I consulted the ivregress manual to determine how to fit the simultaneous expectation model. That led me to use, as the instrument in each equation, only the one variable (lag or finacq) that appears in the other equation but not in the equation being estimated. However, I saw another example online that used all the unique variables from both equations as instruments in both ivregress commands. The commands I used for the simultaneous equation model are shown below. The variables from aero to whlsl are industry dummies.

    ivregress 2sls auditfees pwc deloitte kpmg ey size qr invrec roa auditor_change yrend debtta yrdum cfo mb spidum qual change_in_revenues bankrupt lag aero agric autos banks beer bldmt books boxes bussv chems chips clths cnstr coal comps drugs elecq fabpr fin food fun gold guns hlth hshld insur labeq mach meals medeq mines oil other paper persv rlest rtail rubbr ships smoke soda steel telcm toys trans txtls util whlsl (lnaf = finacq pwc deloitte kpmg ey size qr invrec roa auditor_change yrend debtta yrdum cfo mb spidum qual change_in_revenues bankrupt lag aero agric autos banks beer bldmt books boxes bussv chems chips clths cnstr coal comps drugs elecq fabpr fin food fun gold guns hlth hshld insur labeq mach meals medeq mines oil other paper persv rlest rtail rubbr ships smoke soda steel telcm toys trans txtls util whlsl)

    ivregress 2sls nonauditfees pwc deloitte kpmg ey size qr invrec roa auditor_change yrend debtta yrdum cfo mb spidum qual change_in_revenues bankrupt finacq aero agric autos banks beer bldmt books boxes bussv chems chips clths cnstr coal comps drugs elecq fabpr fin food fun gold guns hlth hshld insur labeq mach meals medeq mines oil other paper persv rlest rtail rubbr ships smoke soda steel telcm toys trans txtls util whlsl (laf = lag pwc deloitte kpmg ey size qr invrec roa auditor_change yrend debtta yrdum cfo mb spidum qual change_in_revenues bankrupt finacq aero agric autos banks beer bldmt books boxes bussv chems chips clths cnstr coal comps drugs elecq fabpr fin food fun gold guns hlth hshld insur labeq mach meals medeq mines oil other paper persv rlest rtail rubbr ships smoke soda steel telcm toys trans txtls util whlsl)

  • #2
    Michael:
    Your query, as set up, is unlikely to generate helpful replies (too many words).
    Please read the FAQ on how to post more effectively, and FAQ #12 on how to post what you typed and what Stata gave you back. Thanks.
    Kind regards,
    Carlo
    (Stata 18.0 SE)

    • #3
      I did post what I typed, though: I gave the commands. The results are unlikely to be of any further help, since I can't post the study here for comparison. Basically, I am trying to get the predicted values of the dependent variables auditfees and nonauditfees. After that, I need to get the unexpected values by comparing the realized values with the predicted values of the two variables. But first, when trying to determine the expected values using the predict command, I am obtaining strange minimum values, which the study does not report.
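
The workflow described here can be sketched roughly as follows. This is only an illustration on simulated data: the est_sample flag, the data-generating equations, and the short variable list are all assumptions made for the example, not the study's specification. It estimates one equation on the smaller sample, applies the coefficients to every observation with predict, and takes the unexpected component as realized minus predicted.

```stata
* Simulated data; est_sample and all variable names are hypothetical.
clear
set seed 12345
set obs 2000
generate byte   est_sample = _n <= 500          // smaller estimation sample
generate double size   = rnormal(6, 1.5)
generate double lag    = rnormal(60, 15)
generate byte   finacq = runiform() < 0.3
generate double e1     = rnormal()
* lnnaf is endogenous because its error shares e1 with the lnaf equation.
generate double lnnaf  = 2 + 0.6*size + 0.5*finacq + 0.5*e1 + rnormal()
generate double lnaf   = 3 + 0.5*size + 0.3*lnnaf + 0.01*lag + e1

* Coefficients come from the estimation sample only.
ivregress 2sls lnaf size lag (lnnaf = finacq) if est_sample

* Without "if e(sample)", predict applies them to every observation.
predict double exp_lnaf, xb
generate double unexp_lnaf = lnaf - exp_lnaf

summarize unexp_lnaf if est_sample      // mean is zero up to rounding
summarize unexp_lnaf if !est_sample     // mean is generally not zero
```

Comparing the two summarize outputs shows the pattern the study describes: the unexpected component averages zero in the estimation sample because a constant is included, but not necessarily in the other sample.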

      • #4
        I have to agree with Carlo's advice.

        In addition, even if you have a massive sample size, that's an extraordinarily complicated model to fit, allowing for the fact that it's just one linear regression with some bells and whistles. I would advise looking very carefully at diagnostic plots as well as the usual coefficient estimates, standard errors, etc.
