Good day
I am running regressions with several independent variables, for example PRICE = V1 + V2 + V3 + V4 + V5 + V6. The independent variables have vastly different numbers of observations, so when I summarize each variable, I get descriptive statistics based on:
V1 n=932
V2 n=546
V3 n=872
V4 n=340
V5 n=316
V6 n=543
The means, medians, standard deviations, etc. are all based on each variable's own number of observations. However, when I run the regression, Stata skips any observation in which one of the variables is missing. So the regression seems to be based only on complete observations where V1, V2, V3, V4, V5 and V6 are all non-missing, and the table of results for the final regression is based on n=294.
The problem is that I have to draw up a table of descriptive statistics for each of the variables (V1 - V6) using ONLY the observations that were used in the regression (i.e. n=294), not the full number of observations for each variable.
I have been going through each variable and using "summarize V1 if PRICE~=. & V1~=. & V2~=." and so on. But I have to do these tables for over 30 different regressions, and it is wasting a lot of time. Does anyone know how to produce descriptive statistics for variables specifically for the observations that were used in the regression?
Any help is greatly appreciated.
Many Thanks
Sean
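One possible approach to the question above, offered as a minimal sketch rather than a definitive answer: after an estimation command, Stata keeps a marker of the estimation sample in e(sample), which can be used in an "if" qualifier in place of the long list of non-missing conditions. The variable names are the ones from the post; the marker name "insample" is hypothetical and chosen only for illustration.

```stata
* Fit the model; observations with a missing value in PRICE or V1-V6 are dropped
regress PRICE V1 V2 V3 V4 V5 V6

* e(sample) is 1 for observations used in the most recent estimation, 0 otherwise
summarize PRICE V1 V2 V3 V4 V5 V6 if e(sample)

* The detail option adds medians and other percentiles
summarize PRICE V1 V2 V3 V4 V5 V6 if e(sample), detail

* Optional: keep the marker in a variable (hypothetical name) so it is still
* available after the next estimation command overwrites e(sample)
generate byte insample = e(sample)
```

Because e(sample) is replaced by each new estimation command, the summarize line would need to follow each of the 30 regressions directly, or a separate marker variable would need to be saved for each one.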