Hello,
I do a lot of health research with large data sets (e.g., Canadian Community Health Survey, General Social Survey). One of the outcomes I am interested in is depression, specifically persons' scores on a short-form depression scale. However, this DV is extremely skewed because most people do not suffer from depression.
I was planning to analyze the data with -regress-, but given the extreme skew of the outcome variable I am wondering whether this is actually appropriate. I have read through a few postings on this topic, but they focus on the assumption of homoscedasticity, which is less relevant here because -regress- with probability weights (pweights) reports robust (HC1) standard errors.
Code:
Depr. scale -  |
    short form |
   score - (D) |      Freq.     Percent        Cum.
---------------+-----------------------------------
             0 |     21,432       91.47       91.47
             1 |         33        0.14       91.61
             2 |        100        0.43       92.04
             3 |        200        0.85       92.89
             4 |        307        1.31       94.20
             5 |        381        1.63       95.83
             6 |        510        2.18       98.01
             7 |        347        1.48       99.49
             8 |        120        0.51      100.00
---------------+-----------------------------------
         Total |     23,430      100.00
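To put a number on the skew, the summary moments can be computed directly from the frequencies tabulated above. This is only a rough sketch in Python rather than Stata. It assumes the counts are unweighted (survey weights are ignored), and the variable names are illustrative.

```python
# Frequencies copied from the tabulation above (score -> count).
# Assumption: these are unweighted counts; survey weights are ignored here.
freq = {0: 21432, 1: 33, 2: 100, 3: 200, 4: 307, 5: 381, 6: 510, 7: 347, 8: 120}

n = sum(freq.values())
share_zero = freq[0] / n
mean = sum(score * count for score, count in freq.items()) / n


def central_moment(order):
    """Population central moment of the given order, from the frequency table."""
    return sum(count * (score - mean) ** order for score, count in freq.items()) / n


variance = central_moment(2)
# Moment-based (population) skewness: m3 / m2^(3/2).
skewness = central_moment(3) / variance ** 1.5

print(f"n = {n}")
print(f"share scoring 0 = {share_zero:.4f}")
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
print(f"skewness = {skewness:.2f}")
```

The share of zeros is probably the more telling figure here, since the distribution looks less like a smooth skewed curve and more like a large spike at 0 followed by a small hump at scores 4 to 7.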
Any insight would be helpful!
Cheers,
David.