  • Should a Logit Regression Model be used here?

    Hi all,


    I am deciding what regression model to use for my project. I am working with a panel dataset that contains deaths by suicide for 41 counties over 5 decades, as well as the occupational structure of those counties and the percentage of population that is urban in the given years. I am aiming to observe the effect of different occupations on the suicide rate as well as the level of urban population. However, my dependent variable is expressed as a percentage, so either percentage of all deaths that are from suicide, or percentage of the total population that committed suicide in a given year. From my own research, it looks like I should use a Logit regression model because my dependent variable is bounded but I am confused by this because the outcome is not either 0 or 1. If using the percentage of all deaths by suicide as the dependent variable, the maximum value of my dependent variable is 3.571429 and the minimum is 0.

    If this is incorrect, could I instead fit a linear regression model in which the dependent variable is the total number of suicides in a county, controlling for either total deaths or total population? I have data on both.

    Any advice here? Thanks in advance!
    Last edited by Ervin Boyes; 21 Apr 2023, 10:13.

  • #2
    Just to add to your problems: I hope you are aware of the ecological fallacy ( https://en.wikipedia.org/wiki/Ecological_fallacy ). You cannot derive statements on the individual level based on association on an aggregate level. Applied to your situation: just because you may find that more agricultural societies have more suicides does not mean that working in agriculture makes you more likely to commit suicide. So a statement like "I am aiming to observe the effect of different occupations on the suicide rate" makes me really worried.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      Thanks for bringing my attention to that Maarten, I wasn't aware of that. I will make sure to incorporate this into my conclusions. I think I have some other independent variables that I could use to overcome this.

      Regardless, would you have any recommendations on the model to use?

      Either way I appreciate the help!



      • #4
        Originally posted by Ervin Boyes:
        Thanks for bringing my attention to that Maarten, I wasn't aware of that. I will make sure to incorporate this into my conclusions. I think I have some other independent variables that I could use to overcome this.

        You cannot overcome the ecological fallacy by adding independent variables. Sometimes the data you have just do not contain the information you need to answer your question. I am afraid you are in such a situation.



        • #5
          Originally posted by Maarten Buis:
          You cannot overcome the ecological fallacy by adding independent variables. Sometimes the data you have just do not contain the information you need to answer your question. I am afraid you are in such a situation.

          Understood. I meant that I have independent variables that I can replace it with, rather than just add. In any case, the broader question I am looking at is whether industrialisation as a whole can be linked with higher suicide rates, and the occupational composition of these counties represents the level of industrialisation. I am not necessarily trying to say that being in a specific occupation makes you more likely to commit suicide. That's why I'm also using the percentage of the population that is urbanised as an independent variable. Hope that makes sense.


          • #6
            Originally posted by Ervin Boyes:
            the broader question I am looking at is whether industrialisation as a whole can be linked with higher suicide rates, and the occupational composition of these counties represents the level of industrialisation.

            OK, that sounds a lot better. As long as your text remains on that macro level, there is no problem.


            • #7
              Turning back to your original question ("... I am confused by this because the outcome is not either 0 or 1"), although the concerns raised by Maarten Buis are valid and are not solved by my reply:

              You can use -fracreg logit- here, see -help fracreg-.
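
              A minimal sketch of such a call, not a definitive specification. All variable names are hypothetical and do not come from this thread: suicide_prop is the outcome expressed as a proportion in [0,1], agri_share, industry_share and urban_share are the regressors, and county identifies the panel unit. Clustering the standard errors by county is likewise an assumption about what suits repeated observations on the same counties.

              ```stata
              * Hypothetical variable names; suicide_prop must lie in [0,1]
              fracreg logit suicide_prop agri_share industry_share urban_share, vce(cluster county)

              * Average marginal effects on the expected proportion
              margins, dydx(*)
              ```

              The coefficients are on the logit scale, so the marginal effects from -margins- are usually easier to interpret than the raw coefficients.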


              • #8
                Originally posted by Dirk Enzmann:
                Turning back to your original question ("... I am confused by this because the outcome is not either 0 or 1"), although the concerns raised by Maarten Buis are valid and are not solved by my reply:

                You can use -fracreg logit- here, see -help fracreg-.

                Thank you for the help. However, when I use this I get the error message:

                "invalid outcome variable
                Your outcome variable should contain values inside the interval [0,1]"

                I think this is because my dependent variable is bounded between 0 and 100, not 0 and 1. Therefore, is there a way in which I should transform the dependent variable? Can I just divide it by 100?

                Also, is there a way to incorporate fixed effects into the fracreg model? Before I was just using the standard logit model and I can use fixed effects as an option for that, but not the fracreg model. Thanks!


                • #9
                  Simply divide the percentages by 100.
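
                  As a rough sketch, assuming the percentage outcome is stored in a variable called suicide_pct (a hypothetical name, as is suicide_prop), the rescaling and a quick range check could look like this:

                  ```stata
                  * suicide_pct is a hypothetical name for the outcome on the 0-100 scale
                  generate double suicide_prop = suicide_pct / 100

                  * Confirm every nonmissing value now lies in [0,1], as fracreg requires
                  assert inrange(suicide_prop, 0, 1) if !missing(suicide_prop)
                  summarize suicide_prop
                  ```

                  With the maximum of 3.571429 percent reported in the first post, the rescaled maximum would be about 0.0357.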

                  Even if this solves the technical problem, don't ignore the ecological fallacy issue.

                  As to your fixed-effects question, someone else has to answer this.
