  • Need Help with Negative Binomial Regression Model please

    I am working on my thesis, which examines the connection between poor mental well-being and economic progress through the lens of absenteeism. My dependent variable is absenteeism: the total number of days the respondent was absent from work because of health-related issues in the last 12 months.

    This is the regression that I built:

    ABSENT = β0 + β1Depressed + β2FEMALE + β3AGE + β4Pre-Tertiary + β5Tertiary + β6MARRIED + β7PN0 + β8HHNBPERS13CS17 + β9HINCOME + β10FT_PTCS19 + ԑ

    ABSENT (AW2) is the number of absenteeism days in the past year.
    β0 is the intercept.
    Depressed is a poor mental health indicator.
    FEMALE is the respondent's gender.
    AGE is the age group that the respondent falls into.
    Pre-Tertiary and Tertiary are the levels of education.
    MARRIED is the marital status.
    PN0 is the physical pain suffered by the individual.
    HHNBPERS13CS17 is the number of children aged 13 years or less residing in the household.
    HINCOME is the household's total net income per month.
    FT_PTCS19 is a dummy variable (1 = full time; 0 = otherwise).
    ԑ is the error term.
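
    For reference, a negative binomial regression models the log of the expected count rather than the count plus an additive error term. The following is a sketch in the notation of the equation above, assuming the mean-dispersion (NB2) form that Stata's nbreg uses by default; the symbols x_i, μ_i and α are introduced here for illustration and are not from the original post.

    ```latex
    \begin{align}
    \log \mathrm{E}[\mathrm{ABSENT}_i \mid \mathbf{x}_i]
      &= \beta_0 + \beta_1 \mathrm{Depressed}_i + \dots + \beta_{10}\,\mathrm{FT\_PTCS19}_i
       = \mathbf{x}_i'\boldsymbol{\beta}, \\
    \mathrm{Var}[\mathrm{ABSENT}_i \mid \mathbf{x}_i]
      &= \mu_i\,(1 + \alpha \mu_i), \qquad \mu_i = \exp(\mathbf{x}_i'\boldsymbol{\beta}).
    \end{align}
    ```

    Here α is the overdispersion parameter (α = 0 reduces to Poisson), and exp(β_j) can be read as an incidence-rate ratio: the multiplicative change in expected absent days for a one-unit increase in the j-th regressor.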

    My data are cross-sectional and case-based, and because of a large proportion of zero entries (i.e., zero absent days) the outcome is overdispersed. I am using Stata 13.
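
    One way to check the overdispersion claim is sketched below. It assumes the outcome and the mental health indicator are stored under the names ABSENT and Depressed, which may differ from the names in the actual dataset.

    ```stata
    * Sketch only: ABSENT and Depressed are assumed variable names.

    * Compare mean and variance of the count outcome; a variance well above
    * the mean is the usual sign of overdispersion.
    summarize ABSENT, detail

    * Share of respondents reporting zero absent days (non-missing cases only).
    quietly count if !missing(ABSENT)
    local n_obs = r(N)
    quietly count if ABSENT == 0
    display "Share of zero counts: " %5.3f r(N) / `n_obs'

    * With default (non-robust) standard errors, nbreg prints a likelihood-ratio
    * test of alpha = 0 below the coefficient table; rejecting it favours the
    * negative binomial over Poisson.
    nbreg ABSENT i.Depressed
    ```

    If the share of zeros is very large, a zero-inflated model (zinb) may also be worth discussing with an advisor; that is a suggestion, not something the thread establishes.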

    I am new to this, and this is my first time using a negative binomial regression model. Is this a good model? Can you provide me with your feedback and suggestions? I am also not sure how to interpret the results correctly. Below you can find the model. Thank you so much in advance for your help.
    [Attached images: Pic1.png, Pic2.png]

  • #2
    Your reason for choosing a negative binomial model (a lot of 0s in the count outcome) is sound. margins works best when you declare the independent variables as continuous or categorical. You can do so by placing a c. prefix in front of continuous variables and an i. prefix in front of categorical variables. Another consideration is whether you want to transform the income variable. Often, when there is great spread in income such that a small number of respondents have very large incomes while most have "normal" income, researchers will take the logarithm of income. This is especially true when income is the dependent variable but can also be the case when it is an independent variable. You should consult with your advisor about conventions in your field.
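
    The advice above might look like the following in practice. This is a minimal sketch: the variable names are guesses based on the original post (PRETERTIARY and TERTIARY are hypothetical names for the education dummies), the c./i. prefixes should be adjusted to how each variable is actually coded, and the log transform only applies if income is recorded as a positive amount.

    ```stata
    * Sketch only: all variable names are assumptions based on the post.
    * Run from a do-file so that the /// line continuation works.

    * Log of income, only meaningful if HINCOME holds a positive amount.
    generate double ln_hincome = ln(HINCOME) if HINCOME > 0

    * i. marks categorical (factor) variables, c. marks continuous ones.
    * AGE is treated as categorical because the post describes age groups.
    nbreg ABSENT i.Depressed i.FEMALE i.AGE i.PRETERTIARY i.TERTIARY ///
        i.MARRIED i.PN0 c.HHNBPERS13CS17 c.ln_hincome i.FT_PTCS19, irr

    * Average predicted number of absent days at Depressed = 0 and Depressed = 1.
    margins Depressed

    * Average marginal effect of Depressed on the expected number of absent days.
    margins, dydx(Depressed)
    ```

    The irr option reports exponentiated coefficients (incidence-rate ratios) instead of coefficients on the log scale.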

    • #3
      Erik Ruzek Thank you for your feedback. Does margins work when most of the variables are dummies? Can you explain how I can transform the income variable, please? The data that I have for the income variable were collected with a questionnaire in which respondents chose one of the following income brackets: ‘€0-€599’, ‘€600-€800’, ‘€801-€1,099’, ‘€1,100-€1,500’, ‘€1,501+’. So I do not have a specific amount for each person, and I turned this into a dummy variable. Or should I treat it as categorical, and if so, why? Thank you so much.
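
      With only brackets there is no amount to take the logarithm of, so one option is to enter income as a factor variable. The sketch below assumes a hypothetical variable HINCOME_CAT coded 1 to 5 for the five brackets; the name and the coding are illustrative and not taken from the dataset.

      ```stata
      * Sketch only: HINCOME_CAT is a hypothetical variable coded 1-5.
      label define inc_lbl 1 "0-599" 2 "600-800" 3 "801-1,099" ///
          4 "1,100-1,500" 5 "1,501+"
      label values HINCOME_CAT inc_lbl

      * i. creates one indicator per bracket, with the lowest as the base.
      nbreg ABSENT i.Depressed i.HINCOME_CAT, irr

      * Joint Wald test that all income-bracket coefficients are zero.
      testparm i.HINCOME_CAT

      * margins accepts factor (dummy) variables: predicted absent days by bracket.
      margins HINCOME_CAT
      ```

      Entering all brackets through i. keeps the full information in the question, whereas a single dummy collapses five brackets into two groups.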

      • #4
        Erik Ruzek Also, can you explain how I can identify the best model when using negative binomial regression? And how should one interpret the Wald chi2(10), please?
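
        A common way to compare candidate specifications is through information criteria and, for nested models, a likelihood-ratio test. The sketch below uses assumed variable names (HINCOME_CAT is hypothetical) and two illustrative specifications; it is not the model from the attached output.

        ```stata
        * Sketch only: variable names and both specifications are assumptions.
        nbreg ABSENT i.Depressed i.FEMALE i.MARRIED
        estimates store m_small

        nbreg ABSENT i.Depressed i.FEMALE i.MARRIED i.HINCOME_CAT i.FT_PTCS19
        estimates store m_full

        * AIC and BIC side by side; lower values indicate the preferred model.
        estimates stats m_small m_full

        * Likelihood-ratio test of the nested model against the fuller one.
        * Requires both models to be fit on the same observations and
        * without robust standard errors.
        lrtest m_small m_full
        ```

        As for the header statistic, Wald chi2(10) is a joint test that all 10 slope coefficients are zero, with 10 degrees of freedom; a small Prob > chi2 means the regressors jointly improve on an intercept-only model. Stata reports a Wald rather than an LR chi2 when robust standard errors are requested.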
