Hello,
I am seeking advice on the best syntax to accomplish a series of regression analyses. My apologies, but I am new to Stata and have never run this type of analysis before. I cannot seem to determine how to apply what I think would be a loop command.
Below, I have inserted an excerpt of 100 observations from an approximately 10,000-observation dataset in which this analysis would be implemented.
The specific steps I am trying to accomplish are:
1) I wish to serially run model iterations using logistic on a designated set of observations. The variable obs identifies the observation numbers. The series of models would progressively use one additional observation at a time.
2) I wish to calculate a predicted probability of the outcome associated with each model and save that value as the variable yprob on the observation after the last observation used in the model iteration.
3) I wish to then run a new model in which the next observation (the one on which the prior predicted value is listed) is added to those used in the model and a new predicted value is designated for yprob on the next observation.
4) This process would continue for each subsequent observation up to a selected point in the data.
So, if we ran models beginning with iteration 1 using observations 1 – 50, we would save the yprob value on the observation corresponding to obs = 51. Model iteration 2 would be run on observations 1 – 51 and we would save its yprob value on the observation corresponding to obs = 52, etc., to a designated stopping point or until running out of observations.
If I were to employ only obs 1 – 52 in the process, then the following crude approach works:
Code:
gen yprob = .
label var yprob "predicted probability"
* iteration 1: fit on obs 1-50, predict for obs 51
logistic y x1 ib4.x2 ib3.x3 i.x4 if obs<=50, nolog
predict tempval if obs==51
replace yprob = tempval if obs==51
drop tempval
* iteration 2: fit on obs 1-51, predict for obs 52
logistic y x1 ib4.x2 ib3.x3 i.x4 if obs<=51, nolog
predict tempval if obs==52
replace yprob = tempval if obs==52
drop tempval
Of course, this approach would not be feasible for thousands of observations. I have attempted to use forvalues, but I am frankly uncertain how to structure the syntax, as what I am doing doesn't seem to match the options. I also saw the command rangestat, but I understand it will not run logistic regression.
Any advice on best syntax will be greatly appreciated, thank you.
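For reference, a minimal sketch of how the steps above might be expressed with forvalues. It is not a tested solution. It assumes obs runs 1, 2, 3, ... without gaps, and the locals first and last are hypothetical names for the size of the first estimation sample and the last one (here 50 and 99, so that predictions land on obs 51 through 100 of the excerpt).

```stata
* Sketch only. Assumes obs = 1, 2, 3, ... with no gaps.
* first and last are hypothetical names, not part of the original code.
local first = 50
local last  = 99

gen double yprob = .
label var yprob "predicted probability"

forvalues i = `first'/`last' {
    local next = `i' + 1

    * Fit on obs 1..i; capture so that one failed fit does not stop the loop
    capture logistic y x1 ib4.x2 ib3.x3 i.x4 if obs <= `i'
    if _rc {
        continue
    }

    * Predicted probability for obs i+1 only, held in a temporary variable
    tempvar p
    capture predict double `p' if obs == `next', pr
    if _rc == 0 {
        quietly replace yprob = `p' if obs == `next'
    }
    capture drop `p'
}
```

In this sketch, yprob simply stays missing for any iteration in which the model cannot be fit or the prediction fails. With roughly 10,000 observations the loop would refit the model thousands of times, so it may take a while to run.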
Code:
obs y x1 x2 x3 x4
1 0 0 3 3 3
2 1 1 2 1 2
3 1 0 2 2 3
4 0 1 3 2 2
5 1 0 1 4 3
6 0 1 4 1 2
7 0 0 2 4 4
8 1 1 3 4 1
9 0 0 2 2 1
10 1 1 2 3 4
11 0 0 1 3 2
12 1 1 3 2 3
13 1 0 3 2 4
14 0 1 1 2 1
15 1 0 1 4 4
16 0 1 1 4 1
17 1 0 2 3 2
18 0 1 1 3 3
19 1 0 3 1 2
20 0 1 2 2 3
21 0 0 3 1 3
22 1 1 4 1 2
23 0 0 4 2 4
24 1 1 3 1 1
25 0 0 1 4 3
26 1 1 4 1 2
27 1 0 1 3 2
28 0 1 4 3 3
29 1 0 4 4 2
30 0 1 4 4 3
31 1 0 2 3 2
32 0 1 1 3 3
33 0 0 3 3 3
34 1 1 2 1 2
35 0 0 2 2 3
36 1 1 3 2 2
37 0 0 3 4 2
38 1 1 4 1 3
39 0 0 2 2 1
40 1 1 2 3 4
41 0 0 1 3 2
42 1 1 3 2 3
43 0 0 3 2 4
44 1 1 1 2 1
45 0 0 1 4 4
46 1 1 1 4 1
47 0 0 3 1 2
48 1 1 2 2 3
49 0 0 3 1 3
50 1 1 4 1 2
51 0 0 1 4 2
52 1 1 4 2 3
53 1 0 2 4 4
54 0 1 4 1 1
55 1 0 1 3 2
56 0 1 4 3 3
57 0 0 4 4 2
58 1 1 4 4 3
59 1 0 3 4 2
60 0 1 4 1 3
61 0 0 3 3 3
62 1 1 1 4 2
63 0 0 2 3 3
64 1 1 2 2 2
65 0 0 1 3 3
66 1 1 1 3 2
67 0 0 1 2 1
68 1 1 3 2 4
69 0 0 1 4 3
70 1 1 4 4 2
71 1 0 1 3 2
72 0 1 2 3 3
73 1 0 3 1 4
74 0 1 4 1 1
75 1 0 3 1 1
76 0 1 2 2 4
77 0 0 1 4 2
78 1 1 4 2 3
79 1 0 2 4 4
80 0 1 4 1 1
81 0 0 3 2 1
82 1 1 2 2 4
83 0 0 3 1 1
84 1 1 3 4 4
85 0 0 1 3 2
86 1 1 2 3 3
87 0 0 3 1 1
88 1 1 2 2 4
89 0 0 3 3 3
90 1 1 1 4 2
91 1 0 3 2 2
92 0 1 4 3 3
93 1 0 1 4 1
94 0 1 4 4 4
95 1 0 3 1 1
96 0 1 4 1 4
97 0 0 2 3 3
98 1 1 2 2 2
99 1 0 1 3 3
100 0 1 1 3 2