  • Which variable to take as the reference category?

    Hi all,

    I'm researching the relationship between risk tolerance and demographic factors such as age, generation, and survey year, with control variables (including education level, ...).
    1. I have already categorized respondents into 5 education groups (loweduc, training, university, college, othereduc), but I'm not sure which one to take as the reference category.
    2. The data were collected from 1993 to 2019, and I want to test whether the survey year affects risk tolerance by adding event years and time periods to the regression. For example, I want to see whether the 2001 and 2008 crises affected responses, and how respondents reacted in the following 4-year periods. So I included dummies for 01, 08, 02-05, and 09-12 in my model, but I have no idea what the reference category is when interpreting those dummies.

  • #2
    It does not matter, they are all mathematically equivalent models. So just pick whatever category you find most convenient. I tend to avoid categories with very few observations in them. Also look at help contrast for alternative ways of presenting the results. Just to repeat, none of this really matters: all these are just different ways of saying the exact same thing. So feel free to look at all of them, reassure yourself that the results mean the exact same thing, and pick one for presentation that you find easiest to explain.
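
    To make the equivalence concrete, here is a minimal sketch in Stata. It uses the nlsw88 example dataset that ships with Stata as a stand-in (wage and race are not the original poster's variables), fits the same model with two different reference categories, and checks that the predictions coincide.

    ```stata
    * Stand-in data: nlsw88 ships with Stata; race has levels 1, 2, 3
    sysuse nlsw88, clear

    * Same model, reference category = level 1
    regress wage ib1.race
    predict double yhat_ref1

    * Same model, reference category = level 2
    regress wage ib2.race
    predict double yhat_ref2

    * The coefficients differ, but the fitted values do not
    assert reldif(yhat_ref1, yhat_ref2) < 1e-8

    * An alternative presentation: each level compared with the grand mean
    contrast g.race
    ```

    The coefficients change because each one is a difference from whichever level is the base, but the model itself is the same.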
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      But sir, regarding my 2nd question: I didn't include every survey year in the model, so there's no baseline for interpreting the dummies. Or should I just use the first survey year as the reference category?


      • #4
        You do not have a unique reference year because your reference year is a kitchen sink: it's a collection of all years other than "01, 08, 02-05, 09-12", that is, 1993, 1994, 1995, 1996, and so on until you enumerate every year that is not captured by your dummies.
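
        One way to see what ends up in that residual category is to build the periods as a single categorical variable and list the years coded as "other". This is only a sketch on a toy dataset with one row per survey year; the variable names year and period are assumptions, not the original poster's.

        ```stata
        * Toy data: one observation per survey year, 1993-2019
        clear
        set obs 27
        generate int year = 1992 + _n

        * One categorical variable instead of four separate dummies
        generate byte period = 0
        replace period = 1 if year == 2001
        replace period = 2 if inrange(year, 2002, 2005)
        replace period = 3 if year == 2008
        replace period = 4 if inrange(year, 2009, 2012)
        label define period_lbl 0 "other years" 1 "2001" 2 "2002-2005" 3 "2008" 4 "2009-2012"
        label values period period_lbl

        * Years that make up the implicit reference category
        tabulate year if period == 0
        ```

        Entering i.period in a regression would then compare each event year or period with the pooled "other years" group, which mixes years before, between, and after the two crises.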


        • #5
          Hong Il Yoo, do you have any suggestions for me? I just want to see the effects of those years and periods, so I don't want to include every survey year in the model because the output would be too long.


          • #6
            I'm afraid that I don't know enough about your research to say anything useful. As a very general suggestion, I'd like to say that what you call "the effects of those years" are always defined relative to some reference category. You should choose a reference category that you find useful or appropriate in some sense. The reference category doesn't have to be a single year.


            • #7
              Originally posted by Lim Duong
              I didn't include every survey year in the model, so there's no baseline for interpreting the dummies. Or should I just use the first survey year as the reference category?
              All the years that aren't represented by an indicator (dummy) variable are grouped together in one category, and this is your reference category. This is probably not what you want. So, you'll probably have no choice but to include all the years (minus the reference, of course).

              Also, I hope you are using Stata's factor variables for this. If not, see help fvvarlist.
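
              As a rough illustration of the factor-variable approach, the sketch below uses simulated data, so the numbers are meaningless. The names risktol and year are assumptions, and 1993 is an arbitrary choice of base year.

              ```stata
              * Simulated stand-in data: 100 respondents per survey year, 1993-2019
              clear
              set seed 12345
              set obs 2700
              generate int year = 1993 + mod(_n, 27)
              generate double risktol = rnormal()

              * All years enter as indicators; ib1993. makes 1993 the base
              regress risktol ib1993.year

              * A specific comparison: 2001 relative to the year before
              lincom 2001.year - 2000.year

              * Joint test that the year indicators are all zero
              testparm i.year
              ```

              With every year in the model, the comparison of interest can be chosen afterwards with lincom or contrast rather than being fixed by which dummies happened to be included.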

