I have a variable, age, that I want to categorize, but I'm not sure which of two categorizations to use. I chose these two versions because they were used in past research.

Age is continuous on the interval [15, 90].

I categorize this as:

Age_cat_1:

Less than 20 = young

20 to 50 = old

More than 50 = really old

Age_cat_2:

Less than 40 = young

More than 40 = old
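
For concreteness, the two schemes above could be built in Stata roughly as follows. This is only a sketch: it assumes the continuous variable is named `age` (a hypothetical name, not given above), and it assumes that someone aged exactly 40 counts as "old" in the second scheme, since the cut-points as written leave that value unassigned. The label names `agecat1` and `agecat2` are also made up for illustration.

```stata
* Assumption: the continuous variable is called age (hypothetical name).

* Scheme 1: <20 = young, 20 to 50 = old, >50 = really old
generate byte Age_cat_1 = 1 if age < 20
replace Age_cat_1 = 2 if inrange(age, 20, 50)
replace Age_cat_1 = 3 if age > 50 & !missing(age)
label define agecat1 1 "young" 2 "old" 3 "really old"
label values Age_cat_1 agecat1

* Scheme 2: <40 = young, otherwise old
* Assumption: exactly 40 is treated as old.
generate byte Age_cat_2 = 1 if age < 40
replace Age_cat_2 = 2 if age >= 40 & !missing(age)
label define agecat2 1 "young" 2 "old"
label values Age_cat_2 agecat2

* Cross-tabulate to check that every observation landed in a category
tabulate Age_cat_1 Age_cat_2, missing
```

The `!missing(age)` guards matter because Stata treats missing values as larger than any number, so `age > 50` alone would also pick up missing ages.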

My main analysis then asks whether age is associated with math scores (a continuous variable) after adjusting for sex, high school, and English scores.

Under model 1:

ologit Age_cat_1 Mathscores i.sex i.high_school English

Under model 2:

ologit Age_cat_2 Mathscores i.sex i.high_school English

Let's say both models are statistically significant, i.e. math score is a statistically significant predictor of the age category in each model.

Now how do I determine which model to use for the age categorization?

## Comment