  • Regression when dep. variable is a proportion: a few questions

    Hello

    I am doing an analysis of the determinants of census tract unemployment rates. Some of the previous literature on my topic has used straight OLS regression, and I started with this type of analysis, but it seems to me after my own further reading that a Generalized Linear Model is better. This is especially because I am interested in presenting predicted values for the census tracts' unemployment rates based on my regression and I would like these to be appropriately bounded. My unemployment rates include 0s for some census tracts so I would need to take this into account.

    My questions are:

    1) Is -fracreg logit- equivalent to -glm- with a logit link and binomial family? (I have read about using the -glm- version in a few places including here, but I see that -fracreg- is a new-ish command which seems to serve the same purpose.) Can I specify an equivalent to the -robust- option when using -fracreg logit-?

    2) if using -fracreg-, on what basis should I decide to use a fractional probit (-fracreg probit-) or fractional logit (-fracreg logit-) regression?

    3) a simple (probably ignorant) question of interpretation: I see that the -fracreg- and -glm- regressions mentioned above don't report an R-squared value. Is there an equivalent measure for these regressions I can calculate? My OLS R-squared values have been reasonably high and this has been a point of reassurance for me, so I'd like to see how these models compare (though I know R-squared isn't everything!).

    4) if using these models, are there any additional restrictions or assumptions (such as assumptions beyond the BLUE conditions of OLS) that I should keep in mind? With my OLS regressions I have taken the natural log of unemployment rates (makes my residuals more normal, higher R-squared, and convenient interpretation). Could I do the same with the -fracreg- or -glm- regressions above?

    It's been a while since I formally studied limited dependent variables so please excuse my ignorance on these issues. Thank you very much for any help! This is my first post in this forum but I have always found it to be a very useful place for information on Stata and information on statistical analysis generally.

    Regards

    Oliver Kendrick



  • #2
    Hello

    I worry that by asking so many questions I have turned people off from answering one or two of them. I just wanted to say that I'd be very grateful for answers to any or all of the questions I've raised above.

    I also wanted to say, in line with the FAQ, that I have now cross-posted this question here.

    Thanks for any help or advice,

    Oliver



    • #3
      On 1) you can always try it. But fracreg defaults to vce(robust), so if you also specify vce(robust) with glm, both fracreg and glm should give the same results. fracreg's big advantage would be if you want to include a heteroskedasticity equation.

      For 2), it is generally a matter of what is most common in your field. logit and probit models rarely differ by very much. Use probit if you also want to use the het option.
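
      A minimal sketch of the comparison described above. It assumes Stata's example dataset 401k from the manuals, with prate, mrate, ltotemp, age, and sole standing in for a tract-level unemployment rate and its covariates; none of these names come from the original question, and het(mrate) is only an illustrative choice of variance equation.

```stata
* Example data from the Stata manuals, NOT the poster's census-tract data
webuse 401k, clear

* Fractional logit; fracreg reports robust standard errors by default
fracreg logit prate mrate ltotemp age i.sole
estimates store frac_logit

* Same conditional-mean model via glm; vce(robust) must be requested explicitly
glm prate mrate ltotemp age i.sole, family(binomial) link(logit) vce(robust)
estimates store glm_logit

* Coefficients and standard errors side by side
estimates table frac_logit glm_logit, se

* Fractional probit with a heteroskedasticity equation (het() is probit-only)
fracreg probit prate mrate ltotemp age i.sole, het(mrate)
```

      The two stored sets of estimates should agree closely; glm may print a note that the dependent variable has noninteger values, which is expected for a proportion.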
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      EMAIL: [email protected]
      WWW: https://www3.nd.edu/~rwilliam



      • #4
        Also, for 3), fracreg does give you pseudo R^2. Pseudo R^2 statistics tend to be smaller than OLS R^2 statistics.
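
        One rough way to put a fractional model on a scale comparable to an OLS R^2 is the squared correlation between observed and fitted values. The sketch below again assumes Stata's example dataset 401k and its variable names rather than the poster's data, and it is only one of several possible fit measures. It also shows that the default predictions stay within the unit interval.

```stata
* Example data from the Stata manuals, NOT the poster's census-tract data
webuse 401k, clear
fracreg logit prate mrate ltotemp age i.sole

* Default prediction after fracreg is the conditional mean, bounded by 0 and 1
predict double phat
summarize phat

* Squared correlation between observed and fitted proportions
quietly correlate prate phat
display "squared correlation = " r(rho)^2
```

        This figure is not the pseudo R^2 that fracreg prints, so the two numbers need not match.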



        • #5
          Thanks for telling us about cross-posting. My own view is that your question is not a good fit for Stack Overflow, which is a programming forum. At a wild guess, you did ask too many questions at once!

          Also on 3) http://www.stata.com/support/faqs/statistics/r-squared/

          Last edited by Nick Cox; 02 May 2016, 06:19.



          • #6
            Hi Richard and Nick

            Richard, thanks for the suggestions. I guess it's quite obvious that I should have just tried the different commands before posting here! I think I got too caught up in trying to decipher the theory of these different techniques. I will give the different versions a go as soon as I get a chance. Also thank you for your comment re probits and logits.

            And Nick, thanks very much for the link - that's very useful. Also for the comment re Stack Overflow - I'm pretty new to these kinds of forums.

            Best,

            Josh
