Testing for selection bias

Elvire Landstra

Join Date: Jul 2019

Posts: 4
#1

Testing for selection bias

22 Jul 2019, 03:11

Hello everyone,

I was wondering if anyone knows how to test for differences in characteristics between people who dropped out of the study and people who remained? I know that you should use a chi-squared test for categorical/binary variables and a t-test for continuous ones, but when I try to compare the group that did not drop out with the group that dropped out using tab ...., chi, Stata says there are no observations, even though there should be 7000. Hopefully someone can help me with this.
Thank you.

Elvire Landstra

Last edited by Elvire Landstra; 22 Jul 2019, 03:18.
Tags: categorical, chi square test
Mike Lacy

Join Date: Apr 2014

Posts: 2421
#2

22 Jul 2019, 07:26

Without knowing exactly what you typed and something of the nature of your data set, giving a good answer to your question is not possible. (I'd encourage you to take a look at items 12.1 and 12.2 in the FAQ.) My guess would be that you did not include the -missing- option on your tab command, though there are other possible problems. More information would enable us to give a more helpful answer.

That being said: I don't think that hypothesis tests are likely to give you a helpful answer here, even though many people think they are a good approach in your situation. Selection bias can cause problems whether or not "significant" differences exist. In particular, with a large sample such as you apparently have, hypothesis tests will likely flag differences in distribution with correspondingly small p-values even when the magnitudes of those differences are too small to matter.
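
To make the large-sample point concrete, here is a small sketch using Stata's immediate commands, which need no data in memory. The numbers are invented purely for illustration (two groups of 4,000, a common standard deviation of 10, means half a year apart) and are not taken from the data discussed in this thread.

```stata
* Hypothetical summary statistics, NOT from the poster's data:
*   group 1: n = 4000, mean = 50.5, sd = 10
*   group 2: n = 4000, mean = 50.0, sd = 10

* Two-sample t test computed from summary statistics
ttesti 4000 50.5 10 4000 50 10

* Standardized effect sizes (Cohen's d, Hedges's g) for the same inputs
esizei 4000 50.5 10 4000 50 10
```

With these inputs the t statistic is about 2.24, so the difference counts as "significant" at the 5% level, yet Cohen's d is only about 0.05. With real data in memory, -esize twogroup- with a by() option reports the same kind of effect size for a continuous variable across two groups.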
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17730
#3

22 Jul 2019, 07:27

Elvire:
by default, Stata omits observations with missing values in any of the variables involved.
You may want to take a look at the community-contributed command -mcartest- (just type -search mcartest- from within Stata to spot it).
PS: crossed in the cyberspace with Mike's helpful reply.

Kind regards,
Carlo
(Stata 19.0)
Elvire Landstra

Join Date: Jul 2019

Posts: 4
#4

23 Jul 2019, 03:52

Dear Carlo and Mike,

Thank you for your responses!
A bit more information on my dataset: it has a total of 8344 people. I already excluded all people with missing data in any of my covariates, outcomes or exposures (leaving me with 8003 people). I would like to test whether people who stayed for the whole study period (12 years) differed in terms of exposures and covariates from those who dropped out. I created a variable samp, which indicates whether people stayed or not (0 = dropped out, 1 = stayed the whole 12 years).
When performing a t-test for the continuous covariates, I can simply type ttest age, by(samp), and it gives me the means and the p-value (giving some indication of selection bias).
After adding missing to the tab ...., chi command, I did indeed get a chi-squared test! Thank you for your help in solving this issue!
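
For anyone who lands here with the same "no observations" message, the following is a minimal sketch of how one might check where the missing values sit before relying on the missing option. The names samp and age come from this post; smoker is a hypothetical categorical covariate standing in for whichever variable was actually tabulated.

```stata
* Sketch only: smoker is a hypothetical stand-in for the tabulated covariate.

* How many observations are missing on each variable?
count if missing(samp)
count if missing(smoker)
misstable summarize samp smoker

* Default behaviour: observations with a missing value in either variable
* are dropped. -capture noisily- lets a do-file continue if this errors out.
capture noisily tabulate samp smoker, chi2

* With -missing-, missing values are tabulated as a category of their own,
* so they also enter the chi-squared test.
tabulate samp smoker, chi2 missing
```

Because the missing option treats missing values as an extra category, the resulting chi-squared statistic is not the same test as one run on complete cases only, so it is worth looking at the table itself to see which cells are driving the result.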

I will also check out mcartest, thank you for that suggestion! I was wondering, however, how mcartest differs from the chi option. Is it more reliable or precise? I have never heard of the command before.

Again, thank you for responding, it has already been of great help!

Elvire