  • Regression analysis with categorical independent variables

    Hey,

    With your help I was able to convert a variable with numeric values (e.g. age or income) into a categorical variable. The question now is whether I can use these categorical variables directly in a regression analysis. Since my dependent variable is continuous, I can still use a linear regression model. But do I have to create k-1 dummy variables from the categorical variable, or does it work without this step?


  • #2
    Stata will create the k-1 dummies on the fly if you use factor variables by specifying i.varname in your regression. Type help factor variables to learn more.
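
    A minimal sketch of what this looks like in practice. The auto dataset shipped with Stata and its variables price and rep78 are assumptions used only for illustration; they are not the original poster's data. The second half builds the dummies by hand to show that both routes should give the same fit.

    ```stata
    * Illustrative data only: the auto dataset that ships with Stata
    sysuse auto, clear

    * Factor-variable notation: Stata builds the k-1 indicators on the fly
    regress price i.rep78

    * Manual equivalent: create one indicator per category ...
    tabulate rep78, generate(rep_)

    * ... and leave out the first one so that it serves as the base category
    regress price rep_2 rep_3 rep_4 rep_5
    ```

    The factor-variable version has the practical advantage that no extra variables are left behind in the dataset.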



    • #3
      I agree with German: provided your categorical variable is properly coded, i.varname should work for you. One caveat is that you should decide which category you want as your base category and give it the lowest code (0 or the minimum value); otherwise Stata will choose one for you, and it may not be the one you would like. By default Stata uses the lowest coded category as the base. So if one of your categories is coded 0, Stata uses that one as the base category; if no category is coded 0, Stata uses the next lowest code (1, 2, etc.).
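
      To see which category Stata actually used as the base, something along these lines may help. It is only a sketch, again assuming the auto dataset and its variable rep78 purely for illustration.

      ```stata
      * Illustrative data only: the auto dataset that ships with Stata
      sysuse auto, clear

      * Inspect the numeric codes behind the categories
      tabulate rep78, nolabel

      * The baselevels option adds a row for the base category to the output
      regress price i.rep78, baselevels
      ```

      In the output table the base category appears with the note (base) in place of a coefficient.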



      • #4
        It is not necessary to recode the categorical variable if you want to use a category other than the first (i.e. the lowest coded category) as the base category. For example, instead of specifying i.stanine, which uses 1 as the base category, you can specify b5.stanine to use stanine 5 as the base category. The b5. prefix tells Stata to use the category coded 5 as the base.
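
        A short sketch of the base-category operators, assuming the auto dataset and its variable rep78 for illustration (the stanine variable above belongs to a dataset that is not shown here).

        ```stata
        * Illustrative data only: the auto dataset that ships with Stata
        sysuse auto, clear

        * Use the category coded 3 as the base instead of the lowest code
        regress price ib3.rep78

        * Use the highest coded category as the base
        regress price ib(last).rep78

        * Use the most frequent category as the base
        regress price ib(freq).rep78

        * Set the base once for all later commands that use i.rep78
        fvset base 3 rep78
        regress price i.rep78
        ```

        Changing the base only changes what each coefficient is compared against; the overall fit of the model stays the same.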
        http://publicationslist.org/eric.melse



        • #5
          Simon:
          as an aside to the previous helpful comments, please note the risks of dichotomizing continuous predictors; see https://www.ncbi.nlm.nih.gov/pubmed/16217841.
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Thank you all a lot, I appreciate your help!
            I know that there are disadvantages of dichotomizing continuous predictors, but I actually don't have another possibility. Unfortunately we have a relatively small dataset and because of that I think I get more information out of categorizing for example income in 3 or 3 categories.
