Hello all,
I am writing my accounting thesis and have to run some regressions in Stata, and I could use some help.
I am examining whether the gap between revenue growth and growth in non-financial measures (such as the number of employees) differs between fraud and non-fraud firms, comparing the year prior to the fraud with the fraud year.
I have compiled a sample of firms that overstated their revenue in certain years and matched each firm with a non-fraudulent competitor. Each firm has a unique firm identifier (a gvkey).
I require:
- Descriptive statistics for my sample, differentiating between fraud and non-fraud firms.
- A univariate analysis that tests the differences in means between the two groups.
- A correlation matrix comparing all control variables with a measure called 'capacity diff', defined as: revenue growth - non-financial measure growth.
- A multivariate regression which looks as follows:
P(fraud) = B0 + B1 Capacity Diff + Bi Control variables
Where P(fraud) denotes a dummy variable coded 1 for fraud firms and 0 for non-fraud firms.
I don't know much about Stata or how it works. I am willing to read and google a lot, but I thought it was worth opening a separate topic for this.
What I am wondering is how to run such a regression on matched pairs (it is some sort of matched-pair sample regression). How do I tell Stata which fraud firm belongs to which non-fraud competitor? If necessary I can edit the data by hand, since I only have 30 pairs. And how do I tell Stata which control variable values belong to the fraud firms and which to the non-fraud firms (variables such as leverage ratio, Altman Z-score, etc.)?
I know that because my dependent variable is either 1 or 0 (fraud or non-fraud) I need a probit or logit regression, but how do I choose between the two?
And finally, can I just run something like:
[Code]
probit/logit fraud capacity_diff leverage altman_z etc.? Or isn't it that simple?
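To make the question concrete, here is a rough sketch of what I imagine the do-file might look like. It assumes the data are in long format with one row per firm, and all variable names (pair_id, fraud, capacity_diff, leverage, altman_z) are hypothetical placeholders rather than names from my actual dataset. I am not sure these are the right commands for a matched sample, which is exactly what I am asking about.

```stata
* Sketch only: assumes one row per firm (long format) and hypothetical names:
*   gvkey (firm id), pair_id (same value for a fraud firm and its match),
*   fraud (1/0), capacity_diff, leverage, altman_z

* Descriptive statistics split by fraud status
tabstat capacity_diff leverage altman_z, by(fraud) statistics(n mean sd median)

* Univariate test of the difference in means between the two groups
ttest capacity_diff, by(fraud)

* Pairwise correlations with significance levels
pwcorr capacity_diff leverage altman_z, sig

* Plain logit that ignores the pairing
logit fraud capacity_diff leverage altman_z

* Conditional logit grouped by pair, in case the matching has to be modelled
clogit fraud capacity_diff leverage altman_z, group(pair_id)
```

My guess is that a shared pair_id value is how Stata would know which competitor belongs to which fraud firm, but I would appreciate confirmation on whether the plain or the grouped version is appropriate here.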
Any help is much appreciated!
Thomas