Dear Statalisters: hello to all.
My question is about the back-transformation of estimates (coefficients, standard errors, and p-values) after a regression run on variables orthogonalized with the orthog command.
On Statalist there are only three or four previous threads on orthogonalization, and none of them deals with this.
I used Stata's orthog command to transform the variables in the regression (say, the oXs), together with the matrix(R) option and the user-written matsave/matload commands to store and reload the transformation matrix. I did this because of high correlation among the regressors, and I needed to include interactions, which created additional VIF problems. So I transformed all the Xs that are not dummies.
Code:
orthog X1 X2 X3 X4 X5, gen(oX1 oX2 oX3 oX4 oX5) matrix(R)
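For context, the regression itself is run on the orthogonalized Xs plus the untransformed dummies. A minimal sketch is below; Y, D1, and D2 are placeholder names, and my actual model has more regressors (including the interaction terms), which is why I extract columns 1..13 and 20 of e(b) further down.
Code:
* regression on the orthogonalized variables plus the untransformed dummies
regress Y oX1 oX2 oX3 oX4 oX5 D1 D2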
Code:
matsave R, saving
matload R, saving over
Code:
matrix B = e(b)[1,1..13]     // coefficients on the 13 orthogonalized terms
matrix cte = e(b)[1,20]      // the intercept (_cons)
matrix Bprim = (B, cte)      // append the intercept as the last element
matrix b = Bprim*inv(R)'     // back-transform to the original scale
matrix list b
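As a sanity check on the back-transformation, I believe the back-transformed coefficients should match those from a direct regression on the original, untransformed variables (the two specifications span the same column space), although avoiding that collinear regression was the point of orthogonalizing. A sketch, again with placeholder names:
Code:
* the coefficients on X1-X5 and _cons here should match
* the back-transformed vector b above
regress Y X1 X2 X3 X4 X5 D1 D2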
My doubts concern the standard errors of the coefficients and their t-statistics or p-values:
1. Some but not all of the Xs are orthogonalized (for instance, I have dummy variables that do not need orthogonalization). Since the orthogonalization includes a constant, does that mean the intercept of the orthogonalized regression should also be back-transformed with the matrix? As my code shows, I did exactly that, following the example in the help file, which includes the intercept (and the back-transformed intercept differs from the orthogonalized estimate). If I read the help file correctly, orthog defines R so that (X, 1) = (Q, 1)*R; the fitted values then satisfy (Q, 1)*(bo, c)' = (X, 1)*inv(R)*(bo, c)', which gives b = (bo, c)*inv(R)' with the intercept included in the transformed vector. But should the intercept still be back-transformed when only some of the variables in the regression are orthogonalized (the dummies are left untransformed)?
2. I was able to present the back-transformed coefficients. My question is whether the standard errors from the orthogonalized regression carry over unchanged to the back-transformed coefficients, or whether I should apply an analogous transformation to obtain standard errors on the original scale of the X variables. I noticed that the variance of an original X and of its orthogonalized counterpart are not the same (the orthogonal variables of course have mean 0 and variance 1, or very close to it). Since in the simple regression case s.e.(b) = sqrt( [SSR/(n-2)] * 1/Sum((X - Xbar)^2) ), and Var(X') is not equal to Var(X), should I apply the same algebra to back-transform the standard errors? If not, how could I compute them? (I sketch my current guess after question 3 below.)
3. Should I compute p-values for the back-transformed estimates, or will they be the same as in the regression on the orthogonal variables? I mean, depending on the answer to (2), I could compute the p-value of each back-transformed estimate (which might well be the same as in the orthogonalized regression): with the estimates and standard errors in hand, I can form the t-value and its p-value. But I am unsure whether this is needed, in case the p-values turn out to be exactly the same. (I sketch this computation below as well.)
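Regarding (2), if an explicit transformation is needed, my understanding is that for a linear map b = Bprim*inv(R)' the covariance matrix should transform as V(b) = inv(R)*Vo*inv(R)', where Vo is the submatrix of e(V) corresponding to the elements of Bprim. A sketch of what I have in mind is below (the 1..13 and 20 indices mirror my extraction of e(b) above; please correct me if the algebra is wrong):
Code:
* submatrix of e(V) for the 13 orthogonalized terms plus the intercept
matrix V = e(V)
matrix Vo = (V[1..13,1..13], V[1..13,20] \ V[20,1..13], V[20,20])
* back-transform the covariance matrix: V(b) = inv(R) * Vo * inv(R)'
matrix Vb = inv(R)*Vo*inv(R)'
* standard errors are the square roots of the diagonal elements
mata: st_matrix("se", sqrt(diagonal(st_matrix("Vb")))')
matrix list se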
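Regarding (3), to make it concrete, this is how I would compute the t-value and p-value for, say, the first back-transformed coefficient, using the b vector from above and the se matrix from the previous sketch (e(df_r) holds the residual degrees of freedom):
Code:
scalar t1 = b[1,1]/se[1,1]
scalar p1 = 2*ttail(e(df_r), abs(t1))
display "t = " t1 "  p = " p1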
Sorry if my question or code do not follow the FAQ; this is my first post, and I was not sure how to format everything.
Thanks in advance