
  • xtgee

    Hi everyone,

    I have a continuous, strictly positive outcome (CTDI) that is skewed. I checked ln(CTDI), which is still not normal. I'm not sure which of the following options is the correct one.
    1- xtgee CTDI var1 var2, corr(independent) (I used this option because I understand that the GEE approach forgoes the distributional assumption, providing valid inference regardless of the distribution of the data.)

    2- xtgee ln_CTDI var1 var2, corr(independent) eform (I ran this model and got very different results from option 1.)


    3- xtgee CTDI var1 var2, family(gamma) link(log) corr(independent)

    Regards,

  • #2
    If your concern is normality, then what do the residuals of
    Code:
    regress CTDI var1 var2
    look like?

    I wasn't aware that GEE forgoes distribution assumptions.



    • #3
      Thanks for the reply.



      The residual-versus-fitted plot is attached.

      Code:
      regress CTDI  PatientWeight Age_year b1.Sex b2.State b4.PracticeTypeID
      rvfplot, yline(0)
      [Attached image: residual-versus-fitted plot.png]

      Last edited by Masoumeh Sanagou; 30 May 2017, 22:27.



      • #4
        It looks like you have a bigger problem than skew, but go ahead and do the same with logarithmically transformed response values.

        Is State your panel ID?



        • #5
          The structure of the data: patients go to facilities for a CT scan. There are patient-level characteristics (e.g. age, sex, weight) and facility-level characteristics (e.g. state, practice type: private or public).
          Each facility has one practice ID.





          The panel ID is PracticeID_code

          Code:
          xtset PracticeID_code
          regress ln_CTDI PatientWeight Age_year b1.Sex b2.State b4.PracticeTypeID
          rvfplot, yline(0)
          [Attached image: residual-versus-fitted plot-ln-CTDI.png]

          Last edited by Masoumeh Sanagou; 31 May 2017, 15:34.



          • #6
            I don't know whether adding the practice ID will take care of it or not,
            Code:
            regress CTDI PatientWeight Age_year b1.Sex /* b2.State b4.PracticeTypeID omit these two because of collinearity */ i.PracticeID_code
            but your model is misspecified. I recommend that you work on that first, especially if you're planning to use noncanonical link functions, such as your third model above.



            • #7
              Masoumeh:
              why are you -xtset-ting your data before -regress-?
              Besides, if patients are nested within hospitals which, in turn, are nested within states, why not consider -mixed- (I assume that your -depvar- is continuous)?
              Kind regards,
              Carlo
              (Stata 18.0 SE)
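
              A minimal sketch of such a multilevel model, using the variable names from earlier in the thread. It assumes that each PracticeID_code belongs to exactly one State and that a random intercept for State is acceptable, in which case the State coefficients are no longer estimated directly:
              Code:
              * patients nested within practices, practices nested within states
              mixed CTDI PatientWeight Age_year b1.Sex b4.PracticeTypeID || State: || PracticeID_code:
              * alternative that keeps State as a fixed effect of interest
              mixed CTDI PatientWeight Age_year b1.Sex b2.State b4.PracticeTypeID || PracticeID_code: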



              • #8
                Originally posted by Carlo Lazzaro View Post
                Masoumeh:
                why are you -xtset-ting your data before -regress-?
                Besides, if patients are nested within hospitals which, in turn, are nested within states, why not consider -mixed- (I assume that your -depvar- is continuous)?
                Sorry, that was a typo. The two graphs are for
                Code:
                regress CTDI  PatientWeight Age_year b1.Sex b2.State b4.PracticeTypeID
                rvfplot, yline(0)
                and
                Code:
                regress ln_CTDI PatientWeight Age_year b1.Sex b2.State b4.PracticeTypeID
                rvfplot, yline(0)


                And yes, the depvar is continuous.



                • #9
                  Originally posted by Joseph Coveney View Post
                  I don't know whether adding the practice ID will take care of it or not,
                  Code:
                  regress CTDI PatientWeight Age_year b1.Sex /* b2.State b4.PracticeTypeID omit these two because of collinearity */ i.PracticeID_code
                  but your model is misspecified. I recommend that you work on that first, especially if you're planning to use noncanonical link functions, such as your third model above.

                  I want to see the effect of the two variables b2.State and b4.PracticeTypeID on the outcome.


                  Could you please clarify what "your model is misspecified" means and what I should do about it?



                  • #10
                    Code:
                    regress CTDI PatientWeight Age_year b1.Sex i.PracticeID_code
                    rvfplot, yline(0)

                    [Attached image: residual-versus-fitted plot.png]


                    Code:
                    regress ln_CTDI  PatientWeight Age_year b1.Sex i.PracticeID_code
                    rvfplot, yline(0)

                    [Attached image: residual-versus-fitted plot-ln-CTDI.png]


                    • #11
                      Originally posted by Masoumeh Sanagou View Post
                      Could you please clarify what "your model is misspecified" means and what I should do about it?
                      Well, look at the -rvfplot-. I'm guessing that the radiologists set the output of the CT scanner (hence the CTDI) based upon some indirect nonlinear function of the patient's body weight (perhaps thoracic or abdominal circumference, if it's not a head scan), and the inclusion of body weight linearly in the model is what's giving rise to the strange appearance.

                      You can use other regression diagnostic plots to help pin down what's going on, if you're inclined to. But you seem to be using body weight etc. as covariates and interest lies in geography and medical specialty (below). You're not really interested in normality or skew or obtaining a well-fitted explanatory model involving these covariates.
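
                      A minimal sketch of such a check, assuming the variable names used earlier in the thread. The squared weight term is only an illustrative functional form, not a claim about how scanner output actually depends on body weight:
                      Code:
                      regress CTDI PatientWeight Age_year b1.Sex i.PracticeID_code
                      * residuals against weight: curvature here points to a nonlinear weight effect
                      rvpplot PatientWeight, yline(0)
                      * component-plus-residual plot with a lowess smooth for the same purpose
                      cprplot PatientWeight, lowess
                      * one possible relaxation of linearity: add a squared term, then re-check
                      regress CTDI c.PatientWeight##c.PatientWeight Age_year b1.Sex i.PracticeID_code
                      rvfplot, yline(0)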


                      Originally posted by Masoumeh Sanagou View Post
                      I want to see the effect of the two variables b2.State b4.PracticeTypeID on outcome.
                      You seem to have a large sample where efficiency wouldn't be an overwhelming consideration and so you might just want to rely upon the -robust- option of -xtgee- with canonical link functions to produce usable asymptotic standard errors of these variables' coefficients even in the presence of model misspecification. In that case, it might not pay to transform CTDI, and you could consider just going with your first model above. You could even consider
                      Code:
                      regress CTDI b2.State b4.PracticeTypeID, vce(cluster PracticeID_code)
                      and include the covariates if you think that they will increase efficiency.






                      • #12
                        Thanks for the reply.

                        So you recommend either
                        Code:
                        regress CTDI PatientWeight Age_year b1.Sex b2.State b4.PracticeTypeID, vce(cluster PracticeID_code)
                        or
                        Code:
                        xtset PracticeID_code
                        xtgee CTDI PatientWeight Age_year b1.Sex b2.State b4.PracticeTypeID, corr(independent) vce(robust)



                        • #13
                          Originally posted by Masoumeh Sanagou View Post
                          So you recommend
                          I don't know what your research objective is, and the approach that I would take to attaining it might be something completely different from the path that you're on.

                          So, no, I wasn't making recommendations for particular regression models. I was pointing out considerations from among those in your current approach.

                          Here are a few other considerations that you might want to entertain if you haven't already (they're certainly not clear to me based upon what's gone before in this thread): why you've chosen the covariates that you have, why you are interested in the association between state and CTDI (I would have suspected that radiological health laws and regulations are fairly uniform and uniformly enforced), why you are interested in the association between referring physician's medical specialty and CTDI (that is, if that's what PracticeTypeID refers to—is neurologist versus gastroenterologist intended to be a surrogate for absorbed dose in CT scanning of head versus abdomen?), and why you suppose exogeneity of the referring physician's practice given the fixed effects that you're including in the model.



                          • #14
                            Thanks for pointing out those considerations. I will make those clear and then think about approaches.
