For certain variables, corr and reg produce different correlation coefficients that cannot be explained by rounding error. For example:
Code:
webuse lifeexp, clear

* Correlation coefficient from corr (1)
corr lexp gnppc

* To get the correlation coefficient from reg, first standardize both
* variables to have mean 0 and standard deviation 1
egen std_lexp = std(lexp)
egen std_gnppc = std(gnppc)

* Correlation coefficient from reg (2)
reg std_lexp std_gnppc
On Stata 17, (1) is 0.718 and (2) is 0.729.
Code:
summarize std_lexp std_gnppc
and
Code:
corr std_lexp std_gnppc
show that this behavior is not explained by failure of standardization. Both corr and reg use the same observations. What can explain this disparity? It must be something to do with these variables in particular, as for most variables (e.g. those in auto, nlsw88, or this dataset), the correlations produced by corr and reg are almost identical up to rounding error.
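One further check that may be worth running, offered as a sketch rather than a confirmed diagnosis: the slope from reg on standardized variables equals the Pearson correlation only when both variables have mean 0 and standard deviation 1 within the estimation sample itself. egen's std() standardizes each variable over all of its own non-missing observations, whereas corr and reg drop any observation with a missing value in either variable. The snippet below compares the non-missing counts and then re-standardizes within the jointly observed sample. The names std_lexp2 and std_gnppc2 are hypothetical, introduced here only for illustration.
Code:
webuse lifeexp, clear

* Count non-missing values of each variable. If the counts differ, corr and
* reg use fewer observations than egen std() did for one of the variables.
count if !missing(lexp)
count if !missing(gnppc)

* Standardize only within the sample where both variables are observed
* (std_lexp2 and std_gnppc2 are illustrative names, not from the post above)
egen std_lexp2  = std(lexp)  if !missing(lexp, gnppc)
egen std_gnppc2 = std(gnppc) if !missing(lexp, gnppc)

* Slope from reg on the re-standardized variables, to compare with corr
reg std_lexp2 std_gnppc2
corr lexp gnppc
If the two counts match, this check rules out sample mismatch as the cause. If they differ, the slope on std_gnppc2 should agree with the corr result up to rounding.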
