Should a Logit Regression Model be used here?

Ervin Boyes

Join Date: Apr 2023

Posts: 6
#1

21 Apr 2023, 10:11

Hi all,

I am deciding which regression model to use for my project. I am working with a panel dataset that contains deaths by suicide for 41 counties over 5 decades, as well as the occupational structure of those counties and the percentage of the population that is urban in the given years. I am aiming to observe the effect of different occupations on the suicide rate, as well as the effect of the level of urban population.

However, my dependent variable is expressed as a percentage: either the percentage of all deaths that are from suicide, or the percentage of the total population that died by suicide in a given year. From my own research, it looks like I should use a logit regression model because my dependent variable is bounded, but I am confused by this because the outcome is not either 0 or 1. If I use the percentage of all deaths that are from suicide as the dependent variable, its maximum value is 3.571429 and its minimum is 0.

If this is incorrect, could I fit a linear regression model where my dependent variable is the total number of suicides in a county, and control for the total number of deaths or the total population, as I have data on both?

Any advice here? Thanks in advance!

Last edited by Ervin Boyes; 21 Apr 2023, 10:13.
Maarten Buis

Join Date: Mar 2014

Posts: 3467
#2

21 Apr 2023, 10:39

Just to add to your problems: I hope you are aware of the ecological fallacy ( https://en.wikipedia.org/wiki/Ecological_fallacy ). You cannot derive statements at the individual level from associations at the aggregate level. Applied to your situation: just because you may find that more agricultural societies have more suicides does not mean that working in agriculture makes you more likely to commit suicide. So a statement like "I am aiming to observe the effect of different occupations on the suicide rate" makes me really worried.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Ervin Boyes

Join Date: Apr 2023

Posts: 6
#3

21 Apr 2023, 12:11

Thanks for bringing my attention to that, Maarten. I wasn't aware of that. I will make sure to incorporate this into my conclusions. I think I have some other independent variables that I could use to overcome this.

Regardless, would you have any recommendations on the model to use?

Either way I appreciate the help!
Maarten Buis

Join Date: Mar 2014

Posts: 3467
#4

21 Apr 2023, 13:38

Originally posted by Ervin Boyes

Thanks for bringing my attention to that, Maarten. I wasn't aware of that. I will make sure to incorporate this into my conclusions. I think I have some other independent variables that I could use to overcome this.

You cannot overcome the ecological fallacy by adding independent variables. Sometimes the data you have just does not contain the information you need to answer your question. I am afraid you are in such a situation.

Ervin Boyes

Join Date: Apr 2023

Posts: 6
#5

21 Apr 2023, 14:17

Originally posted by Maarten Buis

You cannot overcome the ecological fallacy by adding independent variables. Sometimes the data you have just does not contain the information you need to answer your question. I am afraid you are in such a situation.

Understood. I meant that I have independent variables that I can replace it with, rather than just add. Anyway, the broader question I am looking at is whether or not industrialisation as a whole can be linked with higher suicide rates, and the occupational composition of these counties represents the level of industrialisation. I am not necessarily trying to say that if you are in a specific occupation you will be more likely to commit suicide. That's why I'm also using the percentage of the population that is urbanised as an independent variable. Hope that makes sense.
Maarten Buis

Join Date: Mar 2014

Posts: 3467
#6

22 Apr 2023, 00:41

Originally posted by Ervin Boyes

the broader question I am looking at is whether or not industrialisation as a whole can be linked with higher suicide rates, and the occupational composition of these counties represents the level of industrialisation.

OK, that sounds a lot better. As long as your text remains on that macro level, there is no problem.

Dirk Enzmann

Join Date: Apr 2014

Posts: 586
#7

22 Apr 2023, 02:26

Turning back to your original question ("... I am confused by this because the outcome is not either 0 or 1"), although the concerns raised by Maarten Buis are valid and not solved by my reply:

You can use -fracreg logit- here, see -help fracreg-.
Ervin Boyes

Join Date: Apr 2023

Posts: 6
#8

22 Apr 2023, 04:12

Originally posted by Dirk Enzmann

Turning back to your original question ("... I am confused by this because the outcome is not either 0 or 1"), although the concerns raised by Maarten Buis are valid and not solved by my reply:

You can use -fracreg logit- here, see -help fracreg-.

Thank you for the help. However, when I use this I get the error message:

"invalid outcome variable
Your outcome variable should contain values inside the interval [0,1]"

I think this is because my dependent variable is bounded between 0 and 100, not 0 and 1. Therefore, is there a way in which I should transform the dependent variable? Can I just divide it by 100?

Also, is there a way to incorporate fixed effects into the fracreg model? Before I was just using the standard logit model and I can use fixed effects as an option for that, but not the fracreg model. Thanks!
Dirk Enzmann

Join Date: Apr 2014

Posts: 586
#9

22 Apr 2023, 04:25

Simply divide the percentages by 100.

Even if this solves the technical problem, don't ignore the ecological fallacy issue.

As to your fixed-effects question, someone else has to answer this.
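
A minimal sketch of the rescaling step, assuming hypothetical variable names that do not appear in this thread (suicide_pct for the 0-100 outcome, agri_pct, industry_pct and urban_pct for the regressors, and county for the panel identifier). It is only an illustration of dividing by 100 before calling -fracreg logit-, not a recommended specification, and it does not address the fixed-effects question or the ecological fallacy:

```stata
* All variable names below are hypothetical placeholders.
* Convert the percentage (0-100) into a proportion (0-1), as fracreg requires.
generate double suicide_prop = suicide_pct / 100

* Stop with an error if any non-missing value falls outside [0,1].
assert inrange(suicide_prop, 0, 1) if !missing(suicide_prop)

* Fractional logit; standard errors clustered on county because the same
* counties are observed repeatedly over the decades.
fracreg logit suicide_prop agri_pct industry_pct urban_pct, vce(cluster county)

* Average marginal effects on the proportion scale.
margins, dydx(*)
```

The marginal effects are then on the 0-1 scale, so multiplying them by 100 expresses them in percentage points again.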
