  • Brant test error: parallel regression assumption has been violated

    Hi all,

    I need to complete an assignment for my degree. My research question is: what influences household income? My two explanatory variables are education and alcohol consumption. Control variables will be added later as a check.

    The variables are as follows:
    - dependent variable: income_pct_w5 (changed to 'inkomen')
    - independent variables: br010_mod (changed to 'alcoholconsumptie') and isced1997_r (changed to 'educatie')
    - control variables: age, mar_stat, female, hhsize and ep005_

    What can I do or change to avoid the Brant error? Do I need to switch from ologit to mlogit and work from there? My dependent variable, household income (in 3 groups), does seem to be ordinal.

    My data come from the http://share-project.org/ questionnaires, wave 5, for country==14. My code is as follows:

    use "easySHARE_rel8-0-0.dta", clear

    drop if country!=14
    drop if wave!=5

    recode income_pct_w5 (1/2=1) (3/8=2) (9/10=3), gen(inkomen)
    label define inkomen1 1 "Laag" 2 "Gemiddeld" 3 "Hoog"
    label values inkomen inkomen1

    recode isced1997_r (0/3=1) (4/6=2) , gen(educatie)
    label define educatie1 1 "Geen tot secundaire edu" 2 "Hogere edu"
    label values educatie educatie1


    recode br010_mod (1=1) (2/4=2) (5/7=3), gen(alcoholconsumptie)
    label define alcoholconsumptie1 1 "Geen" 2 "Minder dan twee maal per week" 3 "Meer dan twee maal per week"
    label values alcoholconsumptie alcoholconsumptie1

    recode age (30/59.99=50) (60/69.99=60) (70/79.99=70) (80/99.99=90), gen(leeftijd)
    label define leeftijd1 50 "30-50'ers" 60 "60'ers" 70 "70'ers" 90 "80-90'ers"
    label values leeftijd leeftijd1

    mvdecode educatie alcoholconsumptie, mv(-15/-12)
    mvdecode educatie, mv(97)
    drop if educatie == .
    drop if alcoholconsumptie ==.
    ologit inkomen ib(first).educatie ib(first).alcoholconsumptie ib(last).leeftijd, baselevel
    brant, detail

    Thanks in advance,
    Max
    Last edited by Max Janssens; 20 Dec 2022, 09:28.

  • #2
    There are so many red flags that, if I were your educator, I would pull the emergency brake. I would forbid you to open Stata. I would force you to write a paper that fixes all the red flags. Only after I am satisfied will I allow you to open Stata again.

    Here are the red flags that need to be fixed first:
    1. Your research question is way too general. Think about what the answer to that question would be: an exhaustive list of all things that influence income. I don't need to do research to tell you what that is: it is everything. Done. But also: useless. So you need to put a lot more work into your research question.
    2. Where did alcohol consumption come from? If it is that important to your project, then it needs to be in your research question.
    3. Why use SHARE? This is a panel focussing on the elderly (defined as 50+, which irritates/confronts some of my colleagues...). Whatever you find there is not generalizable to the general population. (There are a few younger people in there, as it is a household panel and some people have young partners, but these are in no way representative of all young people.)
    4. Why use the 5th wave? Panels are great, but they suffer from attrition: over the waves more and more people will no longer participate. As a consequence, later waves will be less and less representative of the target population. So if I were to use a panel for cross-sectional analysis (a questionable decision, but sometimes unavoidable), then I would use the first wave, not the fifth.
    After that we can talk about the Brant test. The fact that it gives significant results is not that informative. It is not necessarily an error. Remember that models are by definition simplifications of reality. Simplification is just another word for "wrong in some useful way". Rounding a number is a model for that number: it is wrong (we ignore some of the digits) but also useful (it lets us focus on the digits that matter). The assumptions of a model are there to simplify reality, so finding evidence against those assumptions is usually utterly unspectacular: it is a model, of course it is wrong, models are by definition wrong, what else could it be... It is your job to determine whether it is so wrong that it is no longer useful. That is a subjective decision that a test cannot help you with: the p-value does not measure usefulness. The detail option in brant gives you all you need to make that assessment.
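
    For reference, here is a sketch in standard textbook notation of the assumption the Brant test examines. The symbols $\kappa_j$, $\alpha_j$, $\beta_j$ and $\Lambda$ are introduced here for illustration and do not appear in the thread. The ordered logit model uses one coefficient vector for every cut point, while the Brant test compares it with separate binary logits that each have their own coefficients:

    ```latex
    % Ordered logit for an outcome y in {1, ..., J}: one common beta
    \Pr(y > j \mid x) = \Lambda(x\beta - \kappa_j),
    \qquad j = 1, \dots, J-1,
    \qquad \Lambda(z) = \frac{e^{z}}{1 + e^{z}}

    % Separate binary logits on the same splits: beta may differ by cut point
    \Pr(y > j \mid x) = \Lambda(\alpha_j + x\beta_j),
    \qquad j = 1, \dots, J-1

    % Parallel regression (proportional odds) assumption under test
    H_0 : \beta_1 = \beta_2 = \dots = \beta_{J-1}
    ```

    With three income groups there are $J-1 = 2$ binary logits: low versus (middle or high), and (low or middle) versus high. Comparing how far the two sets of coefficients lie apart, and not only whether the test is significant, is one way to judge whether the simplification is still useful.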
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Hi Maarten,

      Thank you for your comment. I really appreciate it and it did help me in some way (the last part, that is). To start off with your red flags, let me elaborate a bit further:
      1. My research question, based on the independent variables, is the following: 'Do alcoholism and education influence household income?'
      2/4: The database was given by the professor. It is an exercise to learn how to use Stata (with a view to writing the thesis), so using SHARE is obligatory. Alcoholism does influence household income in some way, according to the academic research I looked up. I did not describe the research question properly in my first post, as alcohol consumption was indeed included in it. I chose the fifth wave purely because of the sample: it had the most participants available to me. I could, however, try to do this research with wave 1, as this does not change anything major in the paper itself.

      For the Brant test I do have a question. How can one make an assessment based on the detail output of the Brant test? I do not have extensive knowledge of its interpretation, nor of how and when the model could still be useful. At the moment, I am looking at changing the setup of the analysis from an ologit (OLR) to an mlogit (MLR). Is there a way to know that the Brant test outcome isn't too bad, so that I can still work with ologit?

      Thanks in advance!
      Kind regards,
      Max
