  • Different p value from ttest and logit

    Hi,
    maybe I am overlooking something very simple here. Anyway:

    I have a dataset of 29 observations and a range of variables. The observations are biological structures, and we are studying whether they changed from time A to time B (a rather unspecific explanation, but it will do for now). If a structure has changed, it has the value "1" in a "Changed" variable. Non-changing structures are given "0".

    I want to test whether there is a difference at time A between the structures that subsequently changed and those that did not. I have tried ttest and logit. In the following example, the variable describing the structure at time A is called pre_Structure.

    T-test:
    Here, I am testing as follows:
    . clonevar pre_StructureAmongNochange = pre_Structure if Changed == 0
    (15 missing values generated)

    . clonevar pre_StructureAmongChanged = pre_Structure if Changed == 1
    (14 missing values generated)

    . ttest pre_StructureAmongChanged == pre_StructureAmongNochange, unpaired


    Test results:
    Two-sample t test with equal variances
    ------------------------------------------------------------------------------
    Variable | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
    pre_St~d | 15 15.9 2.46366 9.541713 10.61598 21.18402
    pre_St~e | 14 8.421429 1.213508 4.540532 5.799804 11.04305
    ---------+--------------------------------------------------------------------
    combined | 29 12.28966 1.548731 8.340171 9.117224 15.46209
    ---------+--------------------------------------------------------------------
    diff | 7.478571 2.808917 1.71515 13.24199
    ------------------------------------------------------------------------------
    diff = mean(pre_SMaxChanged) - mean(pre_SMaxNoChange) t = 2.6624
    Ho: diff = 0 degrees of freedom = 27

    Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
    Pr(T < t) = 0.9935 Pr(|T| > |t|) = 0.0129 Pr(T > t) = 0.0065



    Logistic regression:
    Here, I am testing as follows:
    logit Changed pre_Structure

    Test results:
    Logistic regression Number of obs = 29
    LR chi2(1) = 7.34
    Prob > chi2 = 0.0067
    Log likelihood = -16.414763 Pseudo R2 = 0.1827

    ----------------------------------------------------------------------------------
    Changed | Coef. Std. Err. z P>|z| [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
    pre_Structure | .1712605 .0846815 2.02 0.043 .0052879 .3372331
    _cons | -1.852314 .9571973 -1.94 0.053 -3.728387 .0237579
    ----------------------------------------------------------------------------------


    I prefer using logit, since that allows me to adjust for other variables as well. However, the p value is lower when using ttest. Why?

    Thank you in advance.

  • #2
    Hello Torbjorn,

    Welcome to the Stata Forum.

    I fear the explanation can be found in a decent textbook on statistics. What is more, there you may get a thorough rendition instead of a brief comment, such as the one I produced below.

    Your question relates to the theoretical background of each statistical test you used, that is, paired t-test and logistic regression. In the first, we compare the means "pre" versus "post", and the variables "fail" in terms of the independence assumption. This is something to consider when dealing with all models under this assumption. By the way, under a linear regression, for example, you could have yvar as the difference between means and xvar as the "change_var". I guess this was "the" trick you were trying to get. However, in the logit model we predict "change" according to the values of the covariates (one, in the example). Please keep in mind we are now dealing with the log odds of yvar being 1 as a function of xvar, instead of dealing with means.
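
    A minimal sketch of the regression idea, assuming the variable names from #1 (Changed coded 0/1, pre_Structure measured at time A). A linear regression of pre_Structure on the group indicator should reproduce the equal-variance two-sample t test, whereas logit reverses the roles and treats Changed as the outcome, so its Wald z test need not give the same p value.

    ```stata
    * Two-sample t test without cloning variables
    ttest pre_Structure, by(Changed)

    * Same comparison as a regression; the t statistic on 1.Changed
    * matches the equal-variance t test up to sign
    regress pre_Structure i.Changed

    * Reversed roles: Changed is now the outcome
    logit Changed pre_Structure
    ```

    The output in #1 also shows two tests of the same logit coefficient: the Wald z (P>|z| = 0.043) and the likelihood-ratio chi2 (Prob > chi2 = 0.0067). With 29 observations these can differ noticeably.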

    Hopefully that helps!

    Best,

    Marcos
    Best regards,

    Marcos



    • #3
      As a sidelight, this is very hard to read. Using a monospaced font is not a good idea, because runs of more than one space get stripped out. Use code tags instead; see point 12 in the FAQ.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      WWW: https://www3.nd.edu/~rwilliam



      • #4
        Thank you, both of you. Yes, I guessed this had more to do with my understanding of the statistical tests than with Stata.

        Good tip, Richard Williams. I repost the code below to increase readability for anyone else who is interested.
        Marcos Almeida, I think you are right about the linear correlation. However, note that what I am most interested in is the difference between structures at time A, possibly adjusted for the time span between A and B (which is not the same for all structures).
        T-test:
        Here, I am testing as follows:
        Code:
        . clonevar pre_StructureAmongNochange = pre_Structure if Changed == 0
        (15 missing values generated)
        
        . clonevar pre_StructureAmongChanged = pre_Structure if Changed == 1
        (14 missing values generated)
        
        . ttest pre_StructureAmongChanged == pre_StructureAmongNochange, unpaired


        Test results:
        Code:
        Two-sample t test with equal variances
        ------------------------------------------------------------------------------
        Variable |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
        ---------+--------------------------------------------------------------------
        pre_St~d |      15        15.9     2.46366    9.541713    10.61598    21.18402
        pre_St~e |      14    8.421429    1.213508    4.540532    5.799804    11.04305
        ---------+--------------------------------------------------------------------
        combined |      29    12.28966    1.548731    8.340171    9.117224    15.46209
        ---------+--------------------------------------------------------------------
            diff |            7.478571    2.808917                 1.71515    13.24199
        ------------------------------------------------------------------------------
            diff = mean(pre_SMaxChanged) - mean(pre_SMaxNoChange)         t =   2.6624
        Ho: diff = 0                                     degrees of freedom =       27

            Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
         Pr(T < t) = 0.9935         Pr(|T| > |t|) = 0.0129          Pr(T > t) = 0.0065



        Logistic regression:
        Here, I am testing as follows:
        Code:
        logit Changed pre_Structure


        Test results:
        Code:
        Logistic regression                             Number of obs     =         29
                                                        LR chi2(1)        =       7.34
                                                        Prob > chi2       =     0.0067
        Log likelihood = -16.414763                     Pseudo R2         =     0.1827

        ----------------------------------------------------------------------------------
                 Changed |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
        -----------------+----------------------------------------------------------------
           pre_Structure |   .1712605   .0846815     2.02   0.043     .0052879    .3372331
                   _cons |  -1.852314   .9571973    -1.94   0.053    -3.728387    .0237579
        ----------------------------------------------------------------------------------



        • #5
          Torbjorn:
          If you are interested in time effects, why not model them explicitly in logistic regression?
          Code:
          logit Changed pre_Structure timespan
          where timespan is the time elapsed between A and B.
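
          One possible way to check whether the adjustment matters, sketched under the assumption that timespan exists as a numeric variable (the name is hypothetical and does not appear in #1): fit both models on the same observations and compare them with a likelihood-ratio test.

          ```stata
          * Unadjusted model
          logit Changed pre_Structure
          estimates store unadjusted

          * Adjusted for the time between A and B (timespan is a hypothetical name)
          logit Changed pre_Structure timespan
          estimates store adjusted

          * Likelihood-ratio test of the nested models
          lrtest unadjusted adjusted
          ```

          With only 29 observations, the adjusted estimates may be quite imprecise, so the comparison is best read with caution.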
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Sorry, but I still find this quite confusing.

            You said in #1:
            "I want to test whether there is difference present at time A between those structures that subsequently changed, and those that did not"
            Also, you remarked in #4:
            "note that what I am most interested in, is difference between structures at time A,"
            This being so, your sample is supposed to have only observations for time A, i.e., when time is zero.

            In Stata, the command could be:

            Code:
            . logit change xvar if time == 0
            Best,

            Marcos
            Best regards,

            Marcos
