Relevant test (t-test/z-test etc.)

Anders Svejgaard

Join Date: Nov 2018

Posts: 9
#1


02 May 2019, 09:52

Dear All,

I would like to ask which test is most appropriate for testing whether a difference in means is statistically significant.

I look at selected key measures for firms over the period 2013-2017 in one sector in a specific country. The firms are sorted into two groups: small cap and large cap.

The small cap group consists of 10 firms. The large cap consists of 17 firms.

When looking, for instance, at the ratio return on equity, I get a mean of 5.2 percent for the small cap group and 10.1 percent for the large cap group.

What is the appropriate test for the significance of the difference in means (10.1 - 5.2)?

Number of observations for the small cap group is 50 (5 years times return on equity for 10 firms). The number of observations for the large cap group is 85 (5 years times return on equity for 17 firms).

Is this still a sample, given that it covers 5 years only? Also, what test should I use? The sample sizes of the two groups also differ.

If anybody has an idea, or possibly a reference to a textbook, it would be of great help.

Thank you in advance.

Best,
Anders
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#2

02 May 2019, 11:02

Anders:
try:

Code:

ttest ROE, by(firm_group) unequal

However, if you have panel data, you may be interested in -xtreg-.
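A minimal sketch of the -xtreg- route, assuming a long-format dataset with one row per firm-year. The data below are simulated purely for illustration (10 small cap and 17 large cap firms over 5 years, mirroring the group sizes in the question); the variable names firm_id, year and large, and all numeric parameters, are assumptions and not the poster's data.

```stata
* Simulated panel: 27 firms x 5 years (NOT the poster's data)
clear
set seed 12345
set obs 27
generate firm_id = _n
generate large = firm_id > 10          // firms 11-27 form the large cap group
generate u = rnormal(0, 5)             // firm-specific effect, constant over time
expand 5                               // five yearly rows per firm
bysort firm_id: generate year = 2012 + _n
generate ROE = 5 + 5*large + u + rnormal(0, 15)

* Declare the panel structure, then fit a random-effects model
xtset firm_id year
xtreg ROE i.large, re vce(cluster firm_id)
```

The coefficient on 1.large estimates the difference in mean ROE between the groups, while the clustered standard errors allow the five observations on each firm to be correlated. A fixed-effects model would not identify this coefficient here, because group membership does not vary within a firm.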

Kind regards,
Carlo
(Stata 19.0)
Anders Svejgaard

Join Date: Nov 2018

Posts: 9
#3

02 May 2019, 11:45

Thanks for your answer, Carlo.

Would that be a t-test assuming unequal variances?

And how do I determine if it is panel data?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#4

02 May 2019, 12:01

Anders:
1) yes.
2) If you have the same units repeatedly measured across years at equally spaced intervals (or approximately so) on the same variables, then you have a panel dataset.

Kind regards,
Carlo
(Stata 19.0)
Anders Svejgaard

Join Date: Nov 2018

Posts: 9
#5

02 May 2019, 12:52

Dear Carlo,

Thank you very much for your answer.

My data looks like this:

Small cap group (ROE)

Company 1 22,1 (2013) 5,2 (2014) 3,2 (2015) 16,2 (2016) -7,1 (2017)
Company 2 3,5 (2013) -12,5 (2014) 5,2 (2015) 5,1 (2016) 29,2 (2017)
Company 3 -5,1 (2013) -2,5 (2014) 7,8 (2015) 10,2 (2016) 39,1 (2017)
Company 4 -77,5 (2013) 9,8 (2014) 10,0 (2015) 0,1 (2016)
etc...
Company 10 3,2 (2013) 22,3 (2014) 10,1 (2015) 12,0 (2016) -2,5 (2017)

Large cap group (ROE)

Company 1 1,2 (2013) -25,2 (2014) 0,4 (2015) 6,2 (2016) -3,1 (2017)
Company 2 3,9 (2013) 2,5 (2014) 3,0 (2015) 1,8 (2016) 59,2 (2017)
Company 3 -1,2 (2013) 0,5 (2014) 74,8 (2015) 30,2 (2016) 29,1 (2017)
Company 4 1,2 (2013)
etc...
Company 17 32,1 (2013) 2,3 (2014) 0,1 (2015) 12,1 (2016) -0,5 (2017)

Can this be classified as panel data? The same firm does not necessarily exist in all years, so it is not the case that exactly the same units are measured in every year.
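One way to inspect such a layout in Stata is to enter it in wide form, reshape it to long form, and let -xtset- and -xtdescribe- report the panel structure. The sketch below uses only the four small cap rows shown above, with decimal commas written as decimal points; the variable names firm_id and roe2013 to roe2017 are assumptions chosen for illustration.

```stata
* Wide layout: one row per firm, one ROE column per year
clear
input firm_id roe2013 roe2014 roe2015 roe2016 roe2017
1  22.1   5.2  3.2 16.2 -7.1
2   3.5 -12.5  5.2  5.1 29.2
3  -5.1  -2.5  7.8 10.2 39.1
4 -77.5   9.8 10.0  0.1    .
end

* Long layout: one row per firm-year
reshape long roe, i(firm_id) j(year)
drop if missing(roe)                   // firm 4 has no 2017 value

* Declare and describe the panel
xtset firm_id year
xtdescribe
```

With a firm missing in some years, -xtset- should report the panel as unbalanced rather than refuse the data.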

Best,
Anders

Last edited by Anders Svejgaard; 02 May 2019, 12:58.
Anders Svejgaard

Join Date: Nov 2018

Posts: 9
#6

02 May 2019, 13:19

As I understand it, one should run a t-test (two means, independent samples, unknown population variances not assumed to be equal) and use the two-sided alternative. That is, H0 is that the difference in means is equal to zero, and H1 is that the difference is different from zero.
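For reference, the unequal-variances t statistic and its Satterthwaite approximation to the degrees of freedom can be written as below. The subscripts S and L (small cap, large cap) and the symbols for sample means, sample variances and sample sizes are notation introduced here for illustration; in this thread n_S = 50 and n_L = 85.

```latex
t = \frac{\bar{x}_L - \bar{x}_S}{\sqrt{\dfrac{s_L^2}{n_L} + \dfrac{s_S^2}{n_S}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_L^2}{n_L} + \dfrac{s_S^2}{n_S}\right)^{2}}
{\dfrac{\left(s_L^2/n_L\right)^{2}}{n_L - 1} + \dfrac{\left(s_S^2/n_S\right)^{2}}{n_S - 1}}
```

Under H0 the statistic is compared with a t distribution with \nu degrees of freedom, and the two-sided p-value is the probability of a value at least as large as |t| in absolute terms.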

Would that be the same as running an unequal-variances t-test in Excel on the data arranged in two columns (85 rows and 50 rows) and looking at the p-value for the two-sided alternative? Then, if it is below 0.05, we reject H0, and the difference is thus significantly different from zero?
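In Stata the same test can be sketched from summary statistics alone with the immediate command -ttesti-. The group sizes and means below are the ones quoted in this thread; the two standard deviations (15 and 18) are not reported anywhere above and are placeholder assumptions, so the resulting p-value is illustrative only.

```stata
* Syntax: ttesti #obs1 #mean1 #sd1 #obs2 #mean2 #sd2
* Group 1 = small cap, group 2 = large cap; sd values are ASSUMED
ttesti 50 5.2 15 85 10.1 18, unequal

* Two-sided p-value and Satterthwaite degrees of freedom
display "two-sided p = " r(p)
display "df = " r(df_t)
```

With the raw data in long format, the equivalent call is the one suggested earlier in the thread, -ttest ROE, by(firm_group) unequal-. Either way the test treats all firm-years as independent observations, which repeated measurements on the same firm may not satisfy.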
Anders Svejgaard

Join Date: Nov 2018

Posts: 9
#7

02 May 2019, 13:54

But would it be best to first do an F-test for equality of variances?
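For what it is worth, a variance-ratio F-test can be sketched in Stata with -sdtesti- (or -sdtest- on raw data). As before, the standard deviations 15 and 18 are placeholder assumptions, not values from this thread.

```stata
* Syntax: sdtesti #obs1 {#mean1 | .} #sd1 #obs2 {#mean2 | .} #sd2
* Means are not needed for the variance-ratio test, hence the dots
sdtesti 50 . 15 85 . 18

* F statistic and two-sided p-value
display "F = " r(F)
display "two-sided p = " r(p)
```

The F-test is sensitive to non-normality, so with raw data -robvar ROE, by(firm_group)- (Levene-type tests) may be a more robust check. Since the unequal-variances t-test does not require equal variances in the first place, a preliminary variance test is arguably optional.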
Anders Svejgaard

Join Date: Nov 2018

Posts: 9
#8

02 May 2019, 15:57

Would a Mann-Whitney U test be better if I wanted to test for a difference in medians instead?
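A sketch of the Mann-Whitney (Wilcoxon rank-sum) test in Stata, run on simulated data because the full dataset is not shown in this thread. The variable names and all numeric parameters are assumptions for illustration.

```stata
* Simulated firm-years: 50 small cap and 85 large cap (NOT the poster's data)
clear
set seed 2019
set obs 135
generate large = _n > 50
generate ROE = 5 + 5*large + rnormal(0, 15)

* Rank-sum test; two-sided p-value from the normal approximation
ranksum ROE, by(large)
display "two-sided p = " 2*normal(-abs(r(z)))

* Nonparametric test of equal medians
median ROE, by(large)
```

Strictly speaking, -ranksum- tests whether the two samples come from the same distribution, which amounts to a comparison of medians only under additional assumptions about the shapes of the two distributions; -median- tests equality of medians directly.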
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#9

02 May 2019, 23:58

Anders:
1) please note that Stata jargon differs from the one used in spreadsheets: in Stata, rows and columns are observations and variables, respectively.
2) regardless of the fact that not all the companies are reported across all years, you still have a panel dataset (see Technical note after Example 2, -xtset- entry, Stata .pdf manual).
3) your intuition about two-sided -ttest- is correct;
4) for testing medians in Stata, see -help ranksum-.

Kind regards,
Carlo
(Stata 19.0)
