How to apply partially overlapping samples t-test

Allen Smith

Join Date: May 2020

Posts: 10
#1

03 May 2020, 08:43

Hi all,

Suppose I have a sample of 10 stocks (ID 1 to ID 10) and two types of traders (Foreign = 0 and Domestic = 1), as shown below.

ID  Type  Closing price
 1     1              4
 1     0              4
 2     1              1
 2     0              1
 3     1              6
 4     1              7
 5     1              7
 6     0              8
 6     1              8
 7     0              6
 7     1              6
 8     1              8
 9     0              5
10     0              4

If I want to test whether the mean closing price for foreign traders is statistically different from the mean closing price for domestic traders, is the following approach correct? It comes from https://www.statalist.org/forums/for...lapping-groups . If this approach is not appropriate for my sample, could anyone give me some suggestions?

Code:

. svyset _n
. svy: regress close_price if type==0
. estimates store eq1
. svy: regress close_price if type==1
. estimates store eq2
. suest eq1 eq2
. lincom [eq1]_cons - [eq2]_cons, noci
Nick Cox

Join Date: Mar 2014

Posts: 35724
#2

03 May 2020, 08:47

I don't understand the question. Your groups don't overlap at all: they are honest-to-goodness disjoint as type can't be both 0 and 1. So, there is no call for separate models at all. Whether a regression for your price data is the best analysis I can't advise on.
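
A minimal sketch of what a single-model comparison could look like on the toy data from #1, if the two trader types are treated as disjoint groups. The variable names id, type and close_price are assumed from the post, and this only shows the mechanics; it is not a recommendation that this analysis suits the real data.

```stata
* Toy data from #1 (variable names are assumptions)
clear
input id type close_price
1 1 4
1 0 4
2 1 1
2 0 1
3 1 6
4 1 7
5 1 7
6 0 8
6 1 8
7 0 6
7 1 6
8 1 8
9 0 5
10 0 4
end

* Two-sample t test with equal variances, groups defined by type
ttest close_price, by(type)

* Same comparison as one regression: the coefficient on 1.type is
* mean(type 1) - mean(type 0), with the same |t| as the t test above
regress close_price i.type
```

Both commands ignore id, so a stock that appears under both types contributes two observations that are treated as independent.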
Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#3

03 May 2020, 11:41

I don't understand how one ID (there are several, but ID 1 is an example) can be both type 0 and type 1 (at the same time????); please clarify.
Nick Cox

Join Date: Mar 2014

Posts: 35724
#4

03 May 2020, 12:33

Rich Goldstein has a good question. I note that your code in #1 ignores ID which I guess was my unconscious reason for ignoring it too.
Allen Smith

Join Date: May 2020

Posts: 10
#5

03 May 2020, 13:11

Originally posted by Nick Cox

Rich Goldstein has a good question. I note that your code in #1 ignores ID which I guess was my unconscious reason for ignoring it too.

Hi, ID refers to the stock ID. For example, Stock 1 is traded by both domestic and foreign traders (Type = 0 marks a stock as traded by foreign traders and Type = 1 marks it as traded by domestic traders).

With a t-test, I want to find out whether stocks traded by domestic traders have a higher closing price than stocks traded by foreign traders.

So am I overthinking it? Is there no overlap, so that I can use a t-test directly?
Nick Cox

Join Date: Mar 2014

Posts: 35724
#6

03 May 2020, 13:21

I guess you need to exclude stocks traded only by one kind of trader. And I don't know what complexities you aren't showing us, such as dates and amounts.

Last edited by Nick Cox; 03 May 2020, 13:26.
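
The guess in #6 can be sketched on the toy data from #1 as follows: keep only stocks that appear under both trader types, put the two prices side by side, and run a paired t test. The variable names id, type and close_price are assumed from the post. In the toy data every retained stock (1, 2, 6 and 7) has the same price under both types, so all differences are zero and the test statistic is not informative; the block only shows the mechanics.

```stata
* Toy data from #1 (variable names are assumptions)
clear
input id type close_price
1 1 4
1 0 4
2 1 1
2 0 1
3 1 6
4 1 7
5 1 7
6 0 8
6 1 8
7 0 6
7 1 6
8 1 8
9 0 5
10 0 4
end

* Flag stocks observed under both trader types (0 and 1)
bysort id (type): gen byte both = type[1] == 0 & type[_N] == 1
keep if both
drop both

* One row per stock: close_price0 (foreign) and close_price1 (domestic)
reshape wide close_price, i(id) j(type)

* Paired t test on the within-stock difference
ttest close_price0 == close_price1
```

With real data that has several dates per stock, the reshape would also need the date in i(), which is one of the complexities #6 alludes to.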
Allen Smith

Join Date: May 2020

Posts: 10
#7

03 May 2020, 14:44

Originally posted by Nick Cox

I guess you need to exclude stocks traded only by one kind of trader. And I don't know what complexities you aren't showing us, such as dates and amounts.

Many thanks for your reply. But I think my explanation may have been unclear. Let me reword my question.

Suppose I have a dataset that includes the daily price of 100 stocks over 10 years. This dataset also includes information on traders that allows me to divide them into domestic and foreign traders.

I then want to conduct a t-test to find out whether the mean price of stocks traded by domestic traders is statistically different from that of stocks traded by foreign traders.

Since some stocks are traded by both types of traders, these stocks are included when calculating the mean for each of the two types.

For example:
The group of domestic traders includes Stocks 1, 2, 4, 5, 6, 8, 10.
The group of foreign traders includes Stocks 1, 2, 3, 6, 7, 9, 10.
Stocks 1, 2, 6, 10 are included in both groups.

In this context, can I use a two-sample t-test directly? Or should I treat the two groups as overlapping and use a different t-test?

I think the two-sample t-test may not be suitable. Here are the results of ttest.

Code:

. ttest price, by(type)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |  56,367    773676.6    9668.922     2295570    754725.5    792627.8
       1 |  73,372      893691     7984.08     2162669    878042.2    909339.8
---------+--------------------------------------------------------------------
combined | 129,739      841549    6169.399     2222175    829457.1    853640.9
---------+--------------------------------------------------------------------
    diff |           -120014.4    12441.76                 -144400   -95628.75
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  -9.6461
Ho: diff = 0                                     degrees of freedom =   129737

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
Nick Cox

Join Date: Mar 2014

Posts: 35724
#8

03 May 2020, 15:55

#6 is already flagged as my guess. Otherwise, I doubt that this is of much use in practice, as serial dependence of prices is a real complication.
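
One common way to relax the independence assumption behind the plain two-sample t test, not proposed by anyone in this thread and offered here only as a hedged sketch, is to compare the group means in a regression with standard errors clustered on stock. The example reuses the toy data from #1 with assumed variable names id, type and close_price. It does not model the serial dependence itself, and with only 10 clusters the standard errors should not be trusted.

```stata
* Toy data from #1 (variable names are assumptions)
clear
input id type close_price
1 1 4
1 0 4
2 1 1
2 0 1
3 1 6
4 1 7
5 1 7
6 0 8
6 1 8
7 0 6
7 1 6
8 1 8
9 0 5
10 0 4
end

* Difference in mean price by trader type; observations on the same
* stock are allowed to be correlated via cluster-robust standard errors
regress close_price i.type, vce(cluster id)
```

The point estimate on 1.type is the same difference in means as in an unclustered comparison; only the standard error and the inference change.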