  • Generating a new variable as a result of comparing the mean of a variable (t-test) in two different times

    Hi,

    I have three variables in my example, as described below.
    1) EXP: a time dummy variable equal to 1 if the year is from 2013 to 2015, and 0 otherwise (2016 to 2018).
    2) Abs_DACC: a variable measuring a company's earnings management.
    3) firmid: the company's identification number.

    I want to generate a new dummy variable equal to 1 if the average value of Abs_DACC is significantly higher (for example, according to a t-test) in the period where EXP == 1 than in the period where EXP == 0, and 0 otherwise.



    Below is the example from my sample.

    I would be grateful if someone could guide me in generating this new dummy variable.

    Best regards,
    Mahmoud

    example

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str14 firmid float(Abs_DACC EXP)
    "SE0000101362"  .0043683364 1
    "SE0000101362"    .04054667 1
    "SE0000101362"   .016035259 1
    "SE0000101362"  .0021590118 0
    "SE0000101362"  .0004436661 0
    "SE0000101362"   .025840644 0
    "SE0000103699"   .007250534 1
    "SE0000103699" .00024572795 1
    "SE0000103699"   .011364168 1
    "SE0000103699"   .015171934 0
    "SE0000103699"   .005706637 0
    "SE0000103699"   .015331745 0
    "SE0000103814"   .007773208 1
    "SE0000103814"   .020582644 1
    "SE0000103814"  .0010111552 1
    "SE0000103814"    .05279258 0
    "SE0000103814"   .022170475 0
    "SE0000103814"   .006775594 0
    "SE0000105199"   .010294355 1
    "SE0000105199"   .020721633 0
    "SE0000105199"    .03075206 0
    "SE0000105199"     .0771719 0
    "SE0000105264"    .05795591 1
    "SE0000105264"    .19086185 1
    "SE0000107724"     .0830186 1
    "SE0000107724"    .01778689 0
    "SE0000107724"   .003614869 0
    "SE0000107724"   .070517756 0
    "SE0000108227"    .03772058 1
    "SE0000108227"    .02138011 1
    "SE0000108227"   .013683626 1
    "SE0000108227"   .005922483 0
    "SE0000108227"    .05475985 0
    "SE0000108227"    .00685365 0
    "SE0000108656"   .032462087 1
    "SE0000108656"    .03074585 1
    end
    ------------------ copy up to and including the previous line ------------------

    Listed 36 out of 781 observations


  • #2
    Code:
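    capture program drop one_firm
    program define one_firm
        // Runs once per firm: two-sample t-test of Abs_DACC across the EXP groups
        capture ttest Abs_DACC, by(EXP)
        if c(rc) == 420 {   // ONLY ONE GROUP
            gen byte sig_diff = .
        }
        else if c(rc) == 0 {
            // r(p) is the two-sided p-value. For a directional comparison,
            // r(p_l) is the one-sided p-value for mean(EXP==0) - mean(EXP==1) < 0
            gen byte sig_diff = (r(p) < 0.05)
        }
        exit
    end

    runby one_firm, by(firmid)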
    will do this. It requires the -runby- program, written by Robert Picard and me, available from SSC (ssc install runby).

    The code recognizes that, at least in your example data, there are some firms that have data on Abs_DACC only for EXP = 0 or only for EXP = 1 and it allows those situations, returning a missing value for the result, without throwing an error.
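
    To see in advance which firms fall into that one-period case, a quick check along the following lines may help. This is only a sketch: one_period and firm_tag are hypothetical variable names, and it assumes EXP has no missing values.

    ```stata
    * Sketch only: flag firms observed in just one of the two periods.
    * one_period and firm_tag are hypothetical names; assumes EXP is never missing.
    * After sorting by EXP within firm, EXP[1] is the minimum and EXP[_N] the maximum.
    bysort firmid (EXP): gen byte one_period = (EXP[1] == EXP[_N])

    * Tag one observation per firm so the table counts firms, not observations.
    egen byte firm_tag = tag(firmid)
    tabulate one_period if firm_tag
    ```

    Firms with one_period == 1 are the ones for which sig_diff comes back missing.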

    All of that said, you shouldn't do this at all. The American Statistical Association has recommended discarding the concept of statistical significance. See https://www.tandfonline.com/doi/full...5.2019.1583913. You should instead identify some criterion for the difference in mean Abs_DACC that is more meaningful and useful.
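
    As one possible illustration of such a criterion, the difference in mean Abs_DACC between the two periods can be computed directly for each firm and compared with a substantively chosen cutoff. This is a minimal sketch, not a recommendation: the variable names mean_exp1, mean_exp0, diff_mean, and big_increase are hypothetical, and the 0.01 cutoff is a placeholder that would have to be justified on subject-matter grounds.

    ```stata
    * Sketch only: per-firm difference in mean Abs_DACC between the two periods.
    * mean_exp1, mean_exp0, diff_mean, and big_increase are hypothetical names.
    bysort firmid: egen mean_exp1 = mean(cond(EXP == 1, Abs_DACC, .))
    bysort firmid: egen mean_exp0 = mean(cond(EXP == 0, Abs_DACC, .))
    gen diff_mean = mean_exp1 - mean_exp0

    * 0.01 is a placeholder cutoff, not a recommended threshold.
    * Firms observed in only one period get a missing value, as with sig_diff.
    gen byte big_increase = (diff_mean > 0.01) if !missing(diff_mean)
    ```

    Keeping diff_mean itself, rather than only the 0/1 indicator, also allows it to enter a regression as a continuous measure.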


    • #3
      Clyde:
      Thank you so much for the code and the reference!
      In fact, I am not drawing conclusions from the t-test. I am just trying to identify companies that are suspected and not suspected of earnings management, and I also have alternative measures to control for it. Both the suspect and non-suspect groups are included in the sample and tested in a regression model. I understand your point that it is not accurate to say that a 1 in the new variable -sig_diff- shows a significant difference, because the p-value can differ for each firm. Indeed, as you rightly pointed out, a 1 or 0 is not a clear or good representation of the p-value here.

      Thanks again for the comment.

      Kind regards,
      Mahmoud
