Hello Statalist,

I'm new to the forum. I have a question about how to interpret the results of stteffects ipw and of stcox after

Code:
stset timevar [pweight = weight], failure(failvar) scale(1)

Just two premises: my level in statistics is, at best, average, and I have been using Stata for only a couple of years.

I'm working on a database of 1132 patients and I'm studying the difference in recurrences after two possible treatments. Here is a sample of the database:

Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte age_at_surgery float(failure FU_corr_kaplan MB_vs_SIS) byte previous_treatment double stricture_lenght float stricture_lenght_dic_375
20 0 102.87603 0 0 2.5 0
28 0  134.1157 0 0 2.5 0
37 0  176.9256 0 1   5 1
39 0  169.9504 0 1   3 0
50 0 127.86777 0 0 1.5 0
66 0  158.9091 0 0 2.5 0
57 0 165.98347 0 1   2 0
27 0  163.0744 0 1   3 0
46 0 105.58678 0 1 1.5 0
20 0 105.38843 0 1   2 0
end

failure indicates whether the patient experienced a recurrence. FU_corr_kaplan is the time variable used to stset the data. MB_vs_SIS is the treatment under study, where 1 is the treatment I want to investigate (25 pts) and 0 is the traditional treatment (1107 pts). previous_treatment indicates whether the patient had treatments before the treatment under study. stricture_lenght is the length of the stricture and stricture_lenght_dic_375 is the same variable dichotomized at 3.75.

The main problem is that I have 25 cases in one treatment group and 1107 in the other. Also, the groups are significantly imbalanced with respect to stricture_lenght, which is an important confounder when it comes to failure.

At first I thought of using stteffects ipw to estimate the ATE while balancing the two groups. I declared the survival data with:

Code:
stset FU_corr_kaplan if MB_vs_SIS !=. & stricture_lenght !=., failure(failure) scale(1)

Note that the if qualifier is there only because 3 patients in the original DB had missing data.

Here is my output after

Code:
stteffects ipw (MB_vs_SIS stricture_lenght_dic_375 age_at_surgery previous_treatment preoperative_flow) (stricture_lenght_dic_375 age_at_surgery previous_treatment preoperative_flow)

------------------------------------------------------------------------------------------
                         |               Robust
                      _t | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------------------+----------------------------------------------------------------
ATE                      |
               MB_vs_SIS |
                (1 vs 0) |  -70.59496   33.14586    -2.13   0.033    -135.5597   -5.630266
-------------------------+----------------------------------------------------------------
POmean                   |
               MB_vs_SIS |
                       0 |   141.9422   30.43987     4.66   0.000     82.28115    201.6032
------------------------------------------------------------------------------------------

After that, I checked the covariate balance:

Code:
tebalance summarize

Covariate balance summary

                         Raw      Weighted
-----------------------------------------
Number of obs =        1,132       1,132.0
Treated obs   =           25         564.9
Control obs   =        1,107         567.1
-----------------------------------------

-----------------------------------------------------------------
                |Standardized differences          Variance ratio
                |        Raw    Weighted           Raw    Weighted
----------------+------------------------------------------------
stricture_l~375 |   .4475298    -.023502       1.94419    .9508906
 age_at_surgery |  -.1518991   -.1203498       .538402    .5715888
previous_trea~t |   .0888511   -.0548936      .8937735    1.083905
preoperative_~w |   .2243468    .0580785      .6094902    .5213819
-----------------------------------------------------------------

As far as I know, this balance is not perfect, but I believe it is acceptable. Then I ran

Code:
tebalance overid

which gave p=0.66, so I did not reject the null hypothesis that the covariates are balanced and assumed the balance should be OK.

From this part of the analysis, treatment 1 (the one I want to investigate) apparently has a negative effect compared with treatment 0 (the traditional one): the ATE is about -70.6 in the units of FU_corr_kaplan (p=0.033), i.e. a shorter mean time to recurrence.

I had also read several articles that use an adjusted Kaplan-Meier and an adjusted Cox regression based on inverse probability weights, so I tried to emulate what they did:

Code:
logit MB_vs_SIS stricture_lenght_dic_375 age_at_surgery previous_treatment
predict pi, p
gen ipw=.
replace ipw=1/pi if MB_vs_SIS==1
replace ipw=1/(1-pi) if MB_vs_SIS==0
stset FU_corr_kaplan if MB_vs_SIS !=. & stricture_lenght !=. [pweight = ipw], failure(failure) scale(1)

Lastly, I ran stcox including the variables that were significant in univariable Cox regression. Note that the variable identifying treatment (MB_vs_SIS) is not included because it was not significant:

Code:
stcox stricture_lenght_dic_375 age_at_surgery

        Failure _d: failure
  Analysis time _t: FU_corr_kaplan
            Weight: [pweight=ipw]

(sum of wgt is 1,509.95566356182)
Iteration 0:  log pseudolikelihood = -992.35729
Iteration 1:  log pseudolikelihood = -979.87503
Iteration 2:  log pseudolikelihood = -979.74657
Iteration 3:  log pseudolikelihood = -979.74652
Refining estimates:
Iteration 0:  log pseudolikelihood = -979.74652

Cox regression with Breslow method for ties

No. of subjects =       1,510                    Number of obs =  1,132
No. of failures =         199
Time at risk    = 139,978.741
                                                 Wald chi2(2)  =  12.74
Log pseudolikelihood = -979.74652                Prob > chi2   = 0.0017

------------------------------------------------------------------------------------------
                         |               Robust
                      _t | Haz. ratio   std. err.      z    P>|z|     [95% conf. interval]
-------------------------+----------------------------------------------------------------
stricture_lenght_dic_375 |   1.808034   .5152776     2.08   0.038     1.034237    3.160773
          age_at_surgery |   1.022156   .0072416     3.09   0.002     1.008061    1.036448
------------------------------------------------------------------------------------------

The test of proportional hazards,

Code:
estat phtest

resulted in p=0.32.

I did not post the whole code, but I also ran the Cox regression on the unweighted data, and the variables in

Code:
stcox stricture_lenght_dic_375 previous_treatment MB_vs_SIS age_at_surgery

are all significant (the PH assumption holds).

So here are my questions. Using the stteffects ipw command shown above on the original survival data, with no weights in stset, I obtained a significant negative effect for the treatment under study (MB_vs_SIS). However, in the Cox regression on the survival data with inverse probability weights (supplied as pweights), the treatment seems to have a negligible effect.

- Did I misspecify any of the commands in the analysis?
- How should I interpret the different results for the treatment effect?
- Finally, in my situation, is it more appropriate to use stteffects ipw or stcox with inverse probability weights? I thought the ATE was easier to understand and fitted my database well, balancing the variables between treatments, but I may be wrong.
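
To make the second question concrete, here is a minimal sketch of a weighted Cox model that does include the treatment indicator, so that it returns a hazard ratio for MB_vs_SIS that can be set beside the ATE from stteffects ipw. It is only an illustration under assumptions: it has not been run on these data, it assumes the ipw variable created by the logit step above, and it leaves the other covariates to the weights rather than to the Cox model.

```stata
* Sketch only: assumes ipw already exists from the logit/predict step above

* Look at the weights by treatment arm; with 25 treated patients,
* a few very large weights could dominate the weighted fit
tabstat ipw, by(MB_vs_SIS) statistics(n mean min max)

* Declare the survival data with the inverse probability weights
stset FU_corr_kaplan if MB_vs_SIS != . & stricture_lenght != . [pweight = ipw], failure(failure) scale(1)

* Weighted Cox model with the treatment indicator as the covariate of interest
* (with pweights, stcox reports robust standard errors)
stcox i.MB_vs_SIS
```

A hazard ratio above 1 for MB_vs_SIS would point in the same direction as the negative ATE (earlier recurrence), but the two quantities are on different scales: the ATE is a difference in mean time to recurrence, while the hazard ratio is a ratio of hazards.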
Andrea
