  • Need help clarifying why a chi-square test shows a significant relationship but survival analysis shows no significant influence

    I'm trying to understand the impact of waiting time at an intersection on pedestrian signal-violation behaviour. Two of the variables in my dataset are gender and age (there are other variables as well). I ran chi-square tests of independence and found that both gender and age are significantly associated with crossing behaviour (waited/violated). Female pedestrians violate the signal more often than male pedestrians: their observed count exceeds the expected count (as illustrated in the figure).
    [Image: img1.PNG]

    But when I fit a survival model, either Cox (see the figure below) or accelerated failure time, both gender and age come out insignificant at the 5% level.
    [Image: img2.PNG]

    If gender and age are related to signal-violation behaviour, I can't understand why their estimates in the aggregate survival model are insignificant.

    Actually, this question relates to a reviewer's comment on my journal submission, asking why age and gender are insignificant in the survival analysis model. To examine that, I started with a chi-square test of independence.

    I'm really wondering how I should respond to the reviewer based on the current results.

    Am I missing something? Any suggestions on these results would be appreciated.
    Last edited by Rahul Raoniar; 11 Nov 2021, 03:03.

  • #2
    Rahul:
    what struck me is that the coefficient equals its SE in your second screenshot (BTW: please do not post screenshots; use CODE delimiters to share what you typed and what Stata gave you back. Thanks).
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Sorry, sir, I can't post all the results here as the paper is under review. Still, I would like to know under what circumstances one can get this type of result.

      • #4
        Rahul:
        I see the issue.
        Could you please share an excerpt of your data (via -dataex-) with anonymized variables, so that the confidentiality agreement with the journal is safe?
        As an aside, please call me Carlo like all on (and many more off) this list do. Thanks.
        Last edited by Carlo Lazzaro; 11 Nov 2021, 04:02.
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          Besides Carlo's suggestion, posting the main commands that led to both results in #1 would help.

          • #6
            Here is a sample of the data:

            HTML Code:
                 +--------------------------------------------------------------------------------------------+
                 | location        departure_signal   waiting_time   gender   age     crossing_type   pace    |
                 |--------------------------------------------------------------------------------------------|
              1. | Chandni Chowk   Red                       3.235   Male     46-60   Oblique         Normal  |
              2. | Chandni Chowk   Green                     5.039   Male     30-45   Perpendicular   Normal  |
              3. | Chandni Chowk   Red                       6.050   Female   30-45   Oblique         Hurried |
              4. | Chandni Chowk   Red                      10.660   Male     46-60   Perpendicular   Normal  |
              5. | Chandni Chowk   Red                       2.025   Male     18-29   Perpendicular   Normal  |
                 +--------------------------------------------------------------------------------------------+

            To run the chi-square test of independence, I used the following code:

            Code:
            tab gender departure_signal, row chi2 exp

            Here is the code used for the survival analysis (Cox PH model):


            Code:
            stset waiting_time, failure(departure_signal==1)
            
            stcox ib(0).gender ib(0).age ib(0).crossing_type ib(0).pace , nohr cformat(%9.3f) pformat(%5.3f) sformat(%8.3f)

            The output is already shown in the top post.

            Kind regards,
            Rahul Raoniar
            (Stata 17 BE)
            Last edited by Rahul Raoniar; 11 Nov 2021, 09:12.

            • #7
              Rahul, the results you obtain from "tab" are essentially an unconditional association: if you run a simple linear regression of departure_signal on gender alone, the results should be consistent with it. The coefficient from the Cox model, by contrast, is roughly a conditional relationship: the association with gender given the values of the other covariates in the model. It is therefore not unusual for the two to be inconsistent.
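
              A minimal sketch of that comparison in Stata, assuming the data are already -stset- as in #6 and that gender, age, crossing_type and pace are numeric variables with value labels (the i. prefix requires numeric variables). The stored-estimate names m_gender and m_full are arbitrary labels chosen for illustration.

              Code:
              * Unconditional survival comparison by gender (log-rank test)
              sts test gender

              * Cox model with gender only
              stcox i.gender, nohr
              estimates store m_gender

              * Cox model with gender plus the other covariates from #6
              stcox i.gender i.age i.crossing_type i.pace, nohr
              estimates store m_full

              * Coefficients and standard errors side by side
              estimates table m_gender m_full, b(%9.3f) se(%9.3f)

              If the gender coefficient shrinks or its standard error grows once the other covariates enter, that would be in line with the unconditional versus conditional distinction described above.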

              • #8
                In many articles I have seen, researchers first run a chi-square test of independence between a categorical dependent variable and each categorical independent variable. Only if the relationship is significant do they include the variable in a binary logistic regression, and they then also get significant regression coefficients.

                I tried the same; the only difference is that I got significant chi-square statistics but insignificant regression coefficients. Does this usually happen?

                I understand what you have mentioned. Thanks for the clarification.

                I have now tried keeping only the gender variable in the Cox model, and it came out significant.

                So I'm wondering how to check why a variable is significant or, if it is not, why not. How can that be tested for a survival model?
                Last edited by Rahul Raoniar; 11 Nov 2021, 09:41.

                • #9
                  Without control variables, any model would at best show the raw correlation between y and x, which is why such results are usually consistent in statistical significance. But raw correlation is far from causation, and we can hardly conclude that females are more likely to violate traffic rules just because of their gender. As more variables are controlled for, whatever the kind of model, the estimated effect of x on y approaches, under some conditions, its true causal effect. Meanwhile, you may see a loss of significance simply because the previously significant result contained bias that fades as covariates are added. A significant result may be erroneous, while an insignificant one may reflect the truth. From your results, I would trust that gender itself isn't a major cause of traffic rule violation.
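
                  One hedged way to look into this in Stata, assuming the variables from #6 are numeric with value labels and the data are already -stset-: cross-tabulate gender against the other covariates to see whether they are associated, then test each factor jointly in the full Cox model. This is only a sketch of a possible check, not a definitive diagnostic.

                  Code:
                  * Is gender associated with the other covariates?
                  tab gender pace, row chi2
                  tab gender crossing_type, row chi2
                  tab gender age, row chi2

                  * Full Cox model, then joint Wald tests for each factor
                  stcox i.gender i.age i.crossing_type i.pace, nohr
                  testparm i.gender
                  testparm i.age

                  If gender turns out to be strongly associated with, say, pace, part of the raw gender difference may be absorbed by that covariate once it is in the model.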

                  • #10
                    Rahul:
                    in addition to Fei's wise take, it is very unlikely that a simple regression (i.e., one with a single predictor) gives a fair and true view of the data-generating process.
                    Even if gender turned out to be statistically significant, any decent reviewer would not trust the results of such a model.
                    I would recommend that you skim through the literature of your research field and see what others did in the past when presented with the very same research goal.
                    Finally, the so-called bivariate correlation that is often reported in technical journals should be taken with caution, as it basically discards the adjustment made by the other predictors in the regression model. In my opinion, it is useful for spotting some unexpectedly relevant predictor, but nothing more than that.
                    Kind regards,
                    Carlo
                    (Stata 19.0)

                    • #11
                      Thank you, Fei Wang and Carlo Lazzaro for the wonderful explanation. 👏👏
