Hello,
This may be a basic question, but a friend asked me for help with Stata. For each individual, we have a time and a percentage of completion.
He first asked me to fit a regression to estimate the percentage-versus-time curve. I did this and found that Y = B0 + B1 * ln(t) fits very well.
Now he is asking for the speed as a function of time. Following the same logic, it should suffice to differentiate Y with respect to t, which gives B1 * (1 / t) as the estimated speed.
Is that reasonable? Or is it better to estimate the speed as (V1 - V0) / (T1 - T0) at each time point (for example, if the percentage is 0% at second 0 and 20% at second 1, the speed would be 20%/s at the beginning), and then run a regression on this new variable?
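To make the comparison concrete, here is how I think the two candidate speeds relate under the fitted curve. This is only a sketch that assumes the log model holds; the hat on Y is my notation for the fitted value.

```latex
% Fitted curve (only defined for t > 0, since ln(0) is undefined)
\hat{Y}(t) = B_0 + B_1 \ln t, \qquad t > 0

% Instantaneous speed: derivative of the fitted curve
\frac{d\hat{Y}}{dt} = \frac{B_1}{t}

% Average speed of the fitted curve between T_0 and T_1 (0 < T_0 < T_1)
\frac{\hat{Y}(T_1) - \hat{Y}(T_0)}{T_1 - T_0} = \frac{B_1 \ln(T_1 / T_0)}{T_1 - T_0}

% Example with T_0 = 1, T_1 = 2: the average speed is B_1 \ln 2 \approx 0.693\, B_1,
% which lies between the instantaneous speeds B_1 / 2 (at t = 2) and B_1 (at t = 1).
```

So on the fitted curve, the difference quotient is an average of the derivative over the interval, and neither expression is defined at t = 0.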
Here is what I would have done.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float id int time float(gender perc)
1  0 1  0
1  1 1 15
1  2 1 23
1  3 1 28
1  4 1 32
1  5 1 36
1  6 1 39
1  7 1 41
1  8 1 43
1  9 1 44
1 10 1 46
1 11 1 46
1 12 1 47
1 13 1 48
1 14 1 49
1 15 1 50
1 16 1 50
1 17 1 50
1 18 1 51
1 19 1 51
1 20 1 51
2  0 1  0
2  1 1 10
2  2 1 15
2  3 1 20
2  4 1 23
2  5 1 26
2  6 1 29
2  7 1 31
2  8 1 34
2  9 1 36
2 10 1 38
2 11 1 39
2 12 1 41
2 13 1 42
2 14 1 43
2 15 1 45
2 16 1 47
2 17 1 48
2 18 1 49
2 19 1 50
2 20 1 51
3  0 0  0
3  1 0 17
3  2 0 27
3  3 0 35
3  4 0 41
3  5 0 46
3  6 0 50
3  7 0 53
3  8 0 55
3  9 0 58
3 10 0 61
3 11 0 63
3 12 0 65
3 13 0 68
3 14 0 70
3 15 0 71
3 16 0 72
3 17 0 74
3 18 0 75
3 19 0 76
3 20 0 77
4  0 0  0
4  1 0 14
4  2 0 20
4  3 0 24
4  4 0 29
4  5 0 33
4  6 0 36
4  7 0 39
4  8 0 40
4  9 0 43
4 10 0 45
4 11 0 47
4 12 0 47
4 13 0 47
4 14 0 48
4 15 0 49
4 16 0 48
4 17 0 48
4 18 0 47
4 19 0 48
4 20 0 49
end

gen ln_time=ln(time)

reg perc ln_time if gender==1
gen reg= _b[_cons]+_b[ln_time]*ln_time if gender==1
gen reg_speed=_b[ln_time]/time if gender==1

reg perc ln_time if gender==0
replace reg= _b[_cons]+_b[ln_time]*ln_time if gender==0
replace reg_speed=_b[ln_time]/time if gender==0

gen real_speed=.
replace real_speed=(perc[_n]-perc[_n-1])/1 if time>0
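A possible variant of the last step, in case the time steps are not always 1 second or the rows are not sorted: compute the observed speed within each individual and divide by the actual time gap. This is only a sketch; real_speed2 is a name I made up, and it assumes the data and the variables created above are already in memory.

```stata
* Observed speed within each individual, using the actual time gap.
* real_speed2 is a hypothetical variable name, not part of the original code.
* Inside -bysort id (time)-, _n counts rows within each id, so the first
* observation of every id (time 0 here) is left missing.
bysort id (time): gen real_speed2 = (perc - perc[_n-1]) / (time - time[_n-1]) if _n > 1

* Plot the regression-based speed against the observed speed, one panel per id.
* reg_speed is missing at time 0 (division by zero), so the line starts at time 1.
twoway (line reg_speed time) (scatter real_speed2 time), by(id)
```

With this example data, real_speed2 should match real_speed, since every step is 1 second and the rows are already sorted by id and time.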