Hello,
My question concerns a logistic regression in which the outcome is the purchase of a food group (0=no, 1=yes). There are binary, categorical, and continuous independent variables (IVs). Two of the continuous IVs are measures of age (AGE_REF) and income (FWAGEXM). To improve the interpretability of the odds ratios for these measures, I decided to log transform them, using log base 2 for age (the variable is now log2AGE_REF) and log base 10 for income (now log10FWAGEXM). When I created log10FWAGEXM, 912 missing values were generated.
I have read that log transforming continuous variables in a logistic regression model should not drastically change the p-values for those IVs. This seems to be the case for the age variable (log2AGE_REF); however, the p-value for log10FWAGEXM changed from .001 to .526. Additionally, the odds ratio for the age IV is now more interpretable, but the odds ratio for the income variable has barely changed.
I'm attaching a log file that shows two outputs. The first includes the original age and income variables, and the second includes the transformed variables. I'm also including the commands I used for the transformations.
Am I missing something here? Or is it normal for p-values to change so dramatically? Is this happening because of the 912 missing values that were generated?
Commands:
generate log2AGE_REF = log(AGE_REF) / log(2)
generate log10FWAGEXM = log10(FWAGEXM)
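A minimal sketch of one way to check whether the missing values are driving the change, assuming the 912 missing values come from zero or negative values of FWAGEXM (where the log is undefined). The outcome name purchase and the flag variable insample are placeholders, and the other covariates are omitted for brevity:

```stata
* Count observations where the log is undefined (zero or negative income)
count if FWAGEXM <= 0

* Count observations where income was already missing before transforming
count if missing(FWAGEXM)

* Flag the observations that survive the log transform
* (insample is a hypothetical helper variable)
generate byte insample = !missing(log10FWAGEXM)

* Refit the ORIGINAL model on the reduced sample only.
* purchase is a hypothetical outcome name; add the remaining covariates.
logistic purchase AGE_REF FWAGEXM if insample
```

If the p-value for FWAGEXM in this restricted model moves toward the value seen with log10FWAGEXM, the dropped observations would be a likely explanation; if it stays near .001, the transformation itself would be the more likely cause.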
Log Transformations.smcl
Thanks,
Ryan