  • xtlogit vs. logistic vce(cluster cid)

    Hi Listers,

    I am analysing cross-sectional data from a number of schools so I am mindful that I need to take into account the fact that data are correlated at school level.

    I am aware I could use a random-effects model via -xtlogit-, but I was also considering whether a logistic model with a cluster-robust sandwich estimator would be sufficient. I also noticed that the estimated ORs differ depending on whether -xtlogit- or -logistic- is used. I am not sure why, as I was under the impression that only the SEs would be adjusted in both approaches. How are odds ratios calculated within -xtlogit-?

    Thanks in advance.

  • #2
    Originally posted by Jen Ward
    I am analysing cross-sectional data from a number of schools so I am mindful that I need to take into account the fact that data are correlated at school level.
    Correlated in what way? The only way you would be able to use xtlogit is if you have multilevel data, so does it mean that you have student level data?


    • #3
      Hi Andrew Musau - apologies this should have been clearer.

      I have data from children nested within different schools - I am looking at pass/fail exam rates depending on 3 different teaching methods. As my outcome is binary, I was planning to use -logistic- with vce:

      Code:
      logistic pass i.teach_method, vce(cluster id_school)
      I then wanted to compare the output to the one I get using -xtlogit-

      Code:
      xtset id_school
      xtlogit pass i.teach_method
      To my surprise the ORs (not just the SEs) are different so I am now questioning why, and if the sandwich estimator is sufficient to account for the clustering at school level. I'd really welcome your thoughts.


      • #4
        Originally posted by Jen Ward
        I also noticed that the estimated ORs differ depending on whether -xtlogit- or -logistic- is used. I am not sure why, as I was under the impression that only the SEs would be adjusted in both approaches. . .
        No, you shouldn't expect that the regression coefficients would be the same, unless you're using xtlogit , pa instead of its default random effects option.

        However you adjust the standard errors, a population-averaged (i.e., marginal) method such as logistic , vce(cluster . . .)* or xtgee , family(binomial) link(logit) corr(<whatever>) yields coefficients that are attenuated (closer to zero) relative to those from an individual-specific (i.e., random-effects) fit of the same model, for example with xtlogit , re.

        This phenomenon is discussed in Anders Skrondal and Sophia Rabe-Hesketh, Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models (Chapman & Hall/CRC, 2004), available from booksellers.

        * logistic , vce(cluster . . .) is essentially equivalent to xtgee , family(binomial) link(logit) corr(independent) vce(robust)
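
        For a rough sense of the size of the attenuation, a commonly quoted approximation for the random-intercept logit model (usually attributed to Zeger, Liang and Albert, 1988) relates the two sets of coefficients as shown here. The symbols are introduced only for illustration: \(\beta_{RE}\) is the individual-specific (random-effects) coefficient, \(\beta_{PA}\) is its population-averaged counterpart, and \(\sigma^{2}_{\eta}\) is the variance of the school-level random intercept.

        \[ \beta_{PA} \approx \frac{\beta_{RE}}{\sqrt{1 + c^{2}\sigma^{2}_{\eta}}}, \qquad c = \frac{16\sqrt{3}}{15\pi} \approx 0.588 \]

        Under this approximation the two coincide only when \(\sigma^{2}_{\eta} = 0\), and the gap widens as the between-school variance grows.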


        • #5
          Joseph Coveney, thank you for your response.

          I tried -xtlogit , pa- but the OR still does not match the one from -logistic , vce(cluster . . .), although it is attenuated as you mentioned.

          Could you recommend a free reference that explains this phenomenon?


          • #6
            Originally posted by Jen Ward
            As my outcome is binary, I was planning to use -logistic- with vce:
            Code:
            logistic pass i.teach_method, vce(cluster id_school)
            I then wanted to compare the output to the one I get using -xtlogit-

            Code:
            xtset id_school
            xtlogit pass i.teach_method
            To my surprise the ORs (not just the SEs) are different
            The only time the two will be equivalent is when the variance of the school effects is zero (or close to zero); you can check this from the significance of the -LR test of rho=0- reported at the foot of the xtlogit output table. The random-effects model that xtlogit fits by maximum likelihood is

            \(Pr(y_{it}=1|x_{it})= F(x_{it}\beta + \eta_i)\)

            for \(i=1, \cdots, N\) panels (here, schools, with \(t\) indexing students within a school), with the assumption that the \(\eta_i\) are iid \(N(0,\sigma^{2}_{\eta})\) and consequently uncorrelated with \(x_{it}\). This assumption is very restrictive. In any case, you don't have panel data, which is what xtlogit is intended for, so as long as you have enough observations within each school, you can simply include school dummies in your logistic regression.

            Code:
            logistic pass i.teach_method i.id_school, vce(cluster id_school)
            Last edited by Andrew Musau; 29 Mar 2023, 14:53.


            • #7
              Originally posted by Jen Ward
              I tried -xtlogit , pa- but the OR still does not match the one from -logistic , vce(cluster . . .), although it is attenuated as you mentioned.
              The default working correlation for xtlogit , pa is exchangeable, so the coefficients are not expected to match those from logistic , vce(cluster . . .), which uses an independent working correlation. That is,
              Code:
              logistic , vce(cluster . . .)
              is equivalent to
              Code:
              xtlogit , pa corr(independent) vce(robust) or
              as mentioned in the footnote to #4.

              You can run this code to see what I mean.
              Code:
              quietly sysuse auto, clear
              
              * Fill-in missing-valued cluster variable
              set seed 2016936469
              summarize rep78, meanonly
              quietly replace rep78 = runiformint(r(min), r(max)) if missing(rep78)
              
              *
              * Begin here
              *
              
              // Random-effects logistic regression
              xtlogit foreign c.gear_ratio, i(rep78) re vce(robust) or nolog
              
              // Population average with attenuated coefficient (And see the default "Correlation: exchangeable" in the header)
              xtlogit foreign c.gear_ratio, i(rep78) pa vce(robust) or nolog
              
              // Now with independent working correlation . . .
              xtlogit foreign c.gear_ratio, i(rep78) pa corr(independent) vce(robust) or nolog
              
              // . . . which matches -logistic , vce(cluster)-
              logistic foreign c.gear_ratio, vce(cluster rep78) nolog
              
              exit


              • #8
                Thank you both for your input.

                Andrew Musau - I have data from 40 schools so the dummy approach may not be an option. In your reply you mention that -xtlogit- was intended for panel data so I am now wondering if I should use -xtgee- for a population-level estimate rather than individual-specific model? Please feel free to point me to online resources.


                • #9
                  What matters isn't the total number of schools but the number of students within a school. With a small number of students within a school, you may use conditional logit to condition the school effects out of the likelihood. See

                  Code:
                  help clogit
                  So the estimation becomes:

                  Code:
                  clogit pass i.teach_method, group(id_school) or
                  If you have 30 or more students per school on average, your results should closely correspond to those obtained using

                  Code:
                  logistic pass i.teach_method i.id_school
                  Both approaches assume that teaching method varies within schools; if every school uses a single method, its effect cannot be separated from the school effects.

                  In your reply you mention that -xtlogit- was intended for panel data
                  Panel data are defined here as data with both a cross-sectional and a time dimension. You do not have a panel dataset; you have multilevel data, so melogit and clogit are the more relevant commands.
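
                  A minimal sketch of the melogit route, run on the auto data as a stand-in so that it is self-contained (foreign, gear_ratio and rep78 are placeholders, not the school data). With the variables named earlier in this thread, the analogous call would presumably be melogit pass i.teach_method || id_school:, or.

                  Code:
                  quietly sysuse auto, clear

                  * Fill in the missing values of the stand-in cluster variable
                  set seed 2016936469
                  summarize rep78, meanonly
                  quietly replace rep78 = runiformint(r(min), r(max)) if missing(rep78)

                  * Random-intercept logistic regression: observations nested within rep78
                  melogit foreign c.gear_ratio || rep78:, or nolog

                  * Intraclass correlation implied by the estimated random-intercept variance
                  estat icc
                  This fits the same kind of random-intercept model as xtlogit , re without requiring xtset, and estat icc gives a quick read on how much of the latent-scale variance lies between clusters.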
                  Last edited by Andrew Musau; 30 Mar 2023, 08:52.


                  • #10
                    Andrew Musau Thank you for the clarification!
