Different p value from ttest and logit

Torbjorn Skodvin

Join Date: Feb 2016

Posts: 22
#1


27 Feb 2016, 01:46

Hi,
maybe I am overlooking something very simple here. Anyways:

I have a dataset of 29 observations and a range of variables. The observations are given biological structures, and we are studying whether they have changed from time A to time B (a very unspecific explanation, but will do for now). If the structure has changed, it has the value "1" in a "Changed" variable. Non-changing structures are given "0".

I want to test whether there is a difference at time A between the structures that subsequently changed and those that did not. I have tried ttest and logit. In the following example, the variable describing the structure at time A is called pre_Structure.

T-test:
Here, I am testing as follows:
. clonevar pre_StructureAmongNochange = pre_Structure if Changed == 0
(15 missing values generated)

. clonevar pre_StructureAmongChanged = pre_Structure if Changed == 1
(14 missing values generated)

. ttest pre_StructureAmongChanged == pre_StructureAmongNochange, unpaired

Test results:
Two-sample t test with equal variances
------------------------------------------------------------------------------
Variable | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre_St~d | 15 15.9 2.46366 9.541713 10.61598 21.18402
pre_St~e | 14 8.421429 1.213508 4.540532 5.799804 11.04305
---------+--------------------------------------------------------------------
combined | 29 12.28966 1.548731 8.340171 9.117224 15.46209
---------+--------------------------------------------------------------------
diff | 7.478571 2.808917 1.71515 13.24199
------------------------------------------------------------------------------
diff = mean(pre_SMaxChanged) - mean(pre_SMaxNoChange) t = 2.6624
Ho: diff = 0 degrees of freedom = 27

Ha: diff < 0 Ha: diff != 0 Ha: diff > 0
Pr(T < t) = 0.9935 Pr(|T| > |t|) = 0.0129 Pr(T > t) = 0.0065

Logistic regression:
Here, I am testing as follows:
logit Changed pre_Structure

Test results:
Logistic regression Number of obs = 29
LR chi2(1) = 7.34
Prob > chi2 = 0.0067
Log likelihood = -16.414763 Pseudo R2 = 0.1827

----------------------------------------------------------------------------------
Changed | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------+----------------------------------------------------------------
pre_Structure | .1712605 .0846815 2.02 0.043 .0052879 .3372331
_cons | -1.852314 .9571973 -1.94 0.053 -3.728387 .0237579
----------------------------------------------------------------------------------

I prefer using logit, since that allows me to adjust for other variables as well. However, the p value is lower when using ttest. Why?

Thank you in advance.
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

27 Feb 2016, 06:33

Hello Torbjorn,

Welcome to the Stata Forum.

I fear the explanation can be found in a decent textbook on statistics. What is more, there you may get a thorough rendition instead of a brief comment, such as the one I produced below.

Your question relates to the theoretical background of each statistical test you used, that is, the two-sample t-test and logistic regression. In the first, we compare the mean of the variable measured at time A between the "changed" and "unchanged" groups, and the observations are assumed to be independent. This is something to consider when dealing with all models under this assumption. By the way, under a linear regression, for example, you could have the time A variable as yvar and the "change_var" as xvar, and the test of the slope would reproduce the t-test. I guess this was "the" trick you were trying to get. However, in the logit model we predict "change" according to the values of the covariates (one, in the example). Please keep in mind we are now dealing with the log odds of yvar being 1, instead of dealing with means.
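
To make the contrast concrete, here is a minimal sketch on stand-in data. It assumes the auto dataset that ships with Stata, with foreign playing the role of Changed and mpg the role of pre_Structure; neither substitution comes from the original data. The t-test and the linear regression should give the same p-value, while the logit table reports a Wald z test and the logit header a likelihood-ratio test. These are three different tests, so there is no reason for their p-values to coincide, especially in a small sample.

```stata
* Stand-in data (assumption): foreign ~ Changed, mpg ~ pre_Structure
sysuse auto, clear

* 1. Two-sample t test: compares mean mpg between the two groups
ttest mpg, by(foreign)
scalar p_ttest = r(p)

* 2. The same test written as a linear regression on the group dummy
regress mpg i.foreign
scalar p_reg = 2*ttail(e(df_r), abs(_b[1.foreign]/_se[1.foreign]))

* 3. Logit reverses the roles: group membership is now the outcome
logit foreign mpg
scalar p_wald = 2*normal(-abs(_b[mpg]/_se[mpg]))  // P>|z| in the table
scalar p_lr   = chi2tail(e(df_m), e(chi2))        // Prob > chi2 in the header

display "t test: " p_ttest "   regress: " p_reg
display "logit Wald: " p_wald "   logit LR: " p_lr
```

In the output posted in #1 the same pattern shows up: 0.0129 for the t-test, 0.043 for the Wald test, and 0.0067 for the likelihood-ratio test.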

Hopefully that helps!

Best,

Marcos

Richard Williams

Join Date: Apr 2014

Posts: 4994
#3

27 Feb 2016, 07:36

As a sidelight, this is very hard to read. Using a monospaced font is not a good idea, because runs of more than one space get stripped out. Instead, use code tags. See point 12 in the FAQ.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam

Torbjorn Skodvin

Join Date: Feb 2016

Posts: 22
#4

01 Mar 2016, 22:23

Thank you, both of you. Yes, I guessed this had more to do with my understanding of the statistical tests used than with Stata.

Good tip, Richard Williams. I repost the code below to increase readability for any others that are interested.
Marcos Almeida, I think you are right about the linear correlation. However, note that what I am most interested in, is difference between structures at time A, but possibly adjusted by the time span between A and B (which is not the same for all structures).
T-test:
Here, I am testing as follows:

Code:

. clonevar pre_StructureAmongNochange = pre_Structure if Changed == 0
(15 missing values generated)

. clonevar pre_StructureAmongChanged = pre_Structure if Changed == 1
(14 missing values generated)

. ttest pre_StructureAmongChanged == pre_StructureAmongNochange, unpaired

Test results:

Code:

Two-sample t test with equal variances
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre_St~d |      15        15.9     2.46366    9.541713    10.61598    21.18402
pre_St~e |      14    8.421429    1.213508    4.540532    5.799804    11.04305
---------+--------------------------------------------------------------------
combined |      29    12.28966    1.548731    8.340171    9.117224    15.46209
---------+--------------------------------------------------------------------
    diff |            7.478571    2.808917                 1.71515    13.24199
------------------------------------------------------------------------------
    diff = mean(pre_SMaxChanged) - mean(pre_SMaxNoChange)         t =   2.6624
Ho: diff = 0                                     degrees of freedom =       27

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9935         Pr(|T| > |t|) = 0.0129          Pr(T > t) = 0.0065
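
As an aside, the two clonevar steps are probably not needed for this test. A minimal sketch on stand-in data, assuming the auto dataset shipped with Stata (foreign standing in for Changed, mpg for pre_Structure; both are assumptions, not the original data), suggests that ttest with the by() option gives the same t statistic up to sign:

```stata
* Stand-in data (assumption): foreign ~ Changed, mpg ~ pre_Structure
sysuse auto, clear

* Approach used above: split the variable in two, then an unpaired test
clonevar mpg_domestic = mpg if foreign == 0
clonevar mpg_foreign  = mpg if foreign == 1
ttest mpg_foreign == mpg_domestic, unpaired
scalar t_split = r(t)

* Same test without creating extra variables
ttest mpg, by(foreign)
scalar t_by = r(t)

* by() reports group 0 minus group 1, so only the sign may differ
display abs(t_split) "   " abs(t_by)
```

Since the two standard deviations in the output above differ quite a bit (9.54 versus 4.54), adding the unequal option to ttest may also be worth a look.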

Logistic regression:
Here, I am testing as follows:

Code:

logit Changed pre_Structure

Test results:

Code:

Logistic regression                             Number of obs     =         29
                                                LR chi2(1)        =       7.34
                                                Prob > chi2       =     0.0067
Log likelihood = -16.414763                     Pseudo R2         =     0.1827

-------------------------------------------------------------------------------
      Changed |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
pre_Structure |   .1712605   .0846815     2.02   0.043     .0052879    .3372331
        _cons |  -1.852314   .9571973    -1.94   0.053    -3.728387    .0237579
-------------------------------------------------------------------------------


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17711
#5

01 Mar 2016, 22:35

Torbjorn:
if you are interested in time effects, why not model them explicitly in logistic regression?

Code:

logit Changed pre_Structure timespan

where timespan is the time elapsed between A and B.
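
One possible way to check whether such an adjustment matters is a likelihood-ratio test between the unadjusted and the adjusted model. The sketch below uses stand-in data, assuming the auto dataset shipped with Stata, with foreign in place of Changed, mpg in place of pre_Structure, and weight in place of timespan (all three substitutions are assumptions).

```stata
* Stand-in data (assumption): foreign ~ Changed, mpg ~ pre_Structure,
* weight ~ timespan
sysuse auto, clear

* Unadjusted model
logit foreign mpg
estimates store unadjusted

* Model adjusted for the additional covariate
logit foreign mpg weight
estimates store adjusted

* Likelihood-ratio test of the added covariate
lrtest unadjusted adjusted
```

With only 29 observations there is likely room for very few covariates, so the adjusted model should be kept small.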

Kind regards,
Carlo
(Stata 19.0)
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

02 Mar 2016, 03:33

Sorry, but this is still quite confusing to me.

You said in #1:

I want to test whether there is difference present at time A between those structures that subsequently changed, and those that did not

Also, you remarked in #4:

note that what I am most interested in, is difference between structures at time A,

This being so, your sample is supposed to have only observations for time A, i.e., when time is zero.

In Stata, the command could be:

Code:

. logit change xvar if time == 0

Best,

Marcos
