Why do analyses give different results depending on whether you use i.variable or omit the i. prefix?
Example:
Code:
regress score i.city i.sex i.smoking
1.city, p=0.628
2.city, p=0.013
3.city, p<0.0001
sex, p=0.015
smoking, p=0.003
Code:
regress score city sex smoking
city, p<0.0001
sex, p=0.072
smoking, p=0.007
My understanding is that without the i. prefix you get an "overall" p-value. If that were the case, though, I don't understand why the p-values differ as much as they do: sex is statistically significant (p<0.05) only with the i. prefix, and two of the three city levels are significant with the prefix, whereas the single city p-value without the prefix is very low (p<0.0001).
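One way to check the "overall" idea is to request a joint test of the city levels after the i. model and compare it with the single p-value from the model without the prefix. Below is a minimal sketch, assuming Stata's shipped auto dataset as a stand-in for the data above: rep78 plays the role of city and foreign the role of sex, and neither is a variable from my analysis.

```stata
* Stand-in data shipped with Stata (an assumption, not the data from this post)
sysuse auto, clear

* Factor-variable notation: one coefficient per non-base level of rep78
regress price i.rep78 i.foreign

* Joint Wald test that all rep78 level coefficients are zero,
* which gives a single "overall" p-value for the categorical variable
testparm i.rep78

* Without i., rep78 enters as one continuous term with a single slope
regress price rep78 foreign
```

If the testparm p-value and the p-value on the un-prefixed variable differ, that would suggest the two specifications are not testing the same hypothesis.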
Follow-up question: should you always specify i. before categorical variables and, if not, when should you?