  • logistic regression R and Stata – grouping variable

    Hello,

    I mostly use Stata 13 for my regression analysis. I want to run a logistic regression on a proportion (number of successes). Because I get errors in Stata that I neither expected nor understand (if any Stata experts want to know more about the problems I face and might be able to help me solve them, I would be glad to give more details), I want to repeat the analysis in R. In Stata I would use the command: xtlogit DEP_PROP INDEP_A INDEP_B INDEP_C, i(ID). ID is the identifier for each subject. There are eight lines of data for each subject because there are three within-subject factors (INDEP_A, INDEP_B, INDEP_C) with two levels each (0 and 1). I can repeat this analysis in R with the command: glm(DEP_SUC ~ INDEP_A + INDEP_B + INDEP_C, family = "binomial"). Here DEP_SUC is a table with the successes and misses per row. Again, there are eight rows for each subject. But while I know how to group these lines in Stata by using the option i(ID), I do not know what to do in R. I have searched for more information about the i() option, but did not find anything useful.

    So, to summarize: I want to find out how three binary variables influence a proportion, using logistic regression. In Stata I can group multiple lines per subject using the i() option in logistic regression. What is the equivalent in R?

    Thank you in advance!

    Cross posted at: http://stats.stackexchange.com/quest...on-r-and-stata
    https://r-forge.r-project.org/forum/...78&group_id=34


  • #2
    You are asking about R code. It's a grey area, but I don't think that's on-topic here even though you want a translation of Stata.



    • #3
      I agree with Nick. It is probably better to tackle the problems you face in Stata in this forum, so please provide more details on the errors you get.

      Best
      Daniel



      • #4
        Dear Bruno,

        You can use:

        Code:
        xtgee y i.x1 i.x2, family(binomial) link(logit) vce(robust)
        The previous command gives you population-averaged logistic parameters for panel/longitudinal data, which is what you want. The parameters you obtain using Stata are correct. I cannot guarantee they are the same as what software packages that do not certify their results would give you.

        Stata's glm is cross-sectional. I believe this is not what you want. If I am wrong, you can try the new cross-sectional fractional response models in Stata 14 using clustered standard errors:
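
        A minimal sketch of the setup the xtgee command relies on, offered under stated assumptions rather than as a definitive recipe: xtgee needs to know which variable identifies the panel, and xtset is one way to declare it. The variable names id, successes and trials are hypothetical (they do not appear in this thread). If the outcome is stored as counts rather than as a 0/1 variable, the binomial denominator can be passed inside family().

        ```stata
        * Hypothetical variables: id = subject identifier,
        * successes = number of successes in the row,
        * trials = number of attempts in the row.
        xtset id

        * Population-averaged logit for grouped binomial counts,
        * with trials as the binomial denominator
        xtgee successes i.x1 i.x2, family(binomial trials) link(logit) vce(robust)
        ```

        With eight rows per subject and no time variable, declaring only the panel identifier is enough for the default exchangeable working correlation.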

        Code:
        fracreg logit y i.x1 i.x2, vce(cluster id)
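
        A short sketch of how the outcome for fracreg might be prepared, assuming the data hold counts in hypothetical variables named successes and trials and a subject identifier named id (none of these names come from the thread). fracreg expects an outcome between 0 and 1, so the counts would first be converted to a proportion:

        ```stata
        * Hypothetical variables: successes, trials, id
        generate double y_prop = successes / trials

        * Stop if any non-missing proportion falls outside [0, 1]
        assert inrange(y_prop, 0, 1) if !missing(y_prop)

        * Fractional logit with standard errors clustered on subject
        fracreg logit y_prop i.x1 i.x2, vce(cluster id)
        ```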
        Best,

        Enrique



        • #5
          Thank you for your answers.
          I think I have found a solution; the thread may be closed.
