Dear all,
I would like to have your feedback on some general (rather than Stata-related) questions with regard to linear splines.
1. My current approach is to explore the shape of the relationship between a continuous variable x and an outcome y by first regressing y on a restricted cubic spline (RCS) of x with a predefined number and placement of knots. If the test of non-linearity (e.g. an overall test on the non-linear components of the spline) is significant, I proceed to a linear spline approach; otherwise I stick with a linear regression of y on x. Would you agree?
2. Regarding the subsequent linear spline analysis, I see that some authors select one knot based on visual inspection of the RCS fit. I find this unfeasible when there are multiple variables, and I am not sure it is methodologically correct, so I would stick with two prespecified knots (e.g. the 30th and 60th percentiles). Suppose one is interested in testing the interaction between x and z on y, with z as a binary variable, and we have already established through RCS that the relation between x and y is non-linear. Would you trim x to the values that are common to both levels of z, or analyze the data as is? Also, would you keep the knots for the linear spline as found in the overall sample, or should those be specific to the levels of z?
Thanks,
Manuel