Dear Statalist,
I'm trying to find a method to optimally group observations based on a continuous variable. Given a variable, price, I want to find the optimal cutpoint, betacut, such that the sum of squared residuals between each observation and the group's mean is minimized.
I don't think the following is quite correct, since beta1 and beta2 are implied by the cutpoint, but the idea is something like:

Code:
sysuse auto, clear
nl (price = {beta1} + {beta2}*(price < {betacut}))

where the object of interest is betacut. Even if I were to write the problem correctly, can a non-gradient numerical search method be used with nl, since the SSR will consist of flat and vertical segments rather than being smooth? (E.g., if the observations for price are 0, 1.5, 2, 3, and 4, then the SSR does not change as betacut moves from 2 to 2.1 to 2.2, and so on, until it reaches 3, at which point the SSR jumps all at once.)

Thank you,
Andrew Maurer
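PS. Not Stata, but to make the brute-force idea concrete: because the SSR is piecewise constant in betacut, only the observed values of price need to be tried as candidate cutpoints, so an exact minimizer can be found by enumeration rather than a gradient search. A minimal Python sketch of that search, using the toy data from the example above (not the auto dataset):

```python
def ssr(values):
    """Sum of squared deviations of a group from its own mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_cutpoint(price):
    """Return (betacut, total_ssr) minimizing within-group SSR when
    observations are split into price < betacut and price >= betacut.

    Only observed values are tried: moving betacut between two adjacent
    observations cannot change group membership, so the SSR is constant
    on each such interval and the minimum is attained at an observation.
    """
    best = None
    for cut in sorted(set(price)):
        lo = [p for p in price if p < cut]
        hi = [p for p in price if p >= cut]
        total = ssr(lo) + ssr(hi)
        if best is None or total < best[1]:
            best = (cut, total)
    return best

# Toy data from the example in the post: 0, 1.5, 2, 3, 4
print(best_cutpoint([0, 1.5, 2, 3, 4]))
```

With n observations this checks at most n candidates, each in O(n), so the whole search is cheap for moderate samples and sidesteps the flat/jumpy SSR problem entirely.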