Hello,
I am using Stata 14. I have a panel dataset - firms and years. I am running a mixed model which includes both fixed and random effects.
My total sample size is 52,801. The dataset has missing values, so the entire dataset is not used when I run the mixed model. The mixed model output shows that the number of observations used is 40,595.
I ran the "predict ebX1, reffects" command to obtain BLUPs for the random component of the independent variable X1. Since the mixed model analysis used 40,595 of the 52,801 observations, I assumed that BLUPs would be created for exactly 40,595 observations.
However, when I summarize ebX1 (command used: summarize ebX1), the summary table reports 51,032 observations. This count is what I do not understand. If I instead run "summ ebX1 if e(sample)", I get summary statistics for the 40,595 observations. So I am able to get the correct summary statistics, but what I do not understand is why BLUPs were calculated for 51,032 (> 40,595) observations in the first place.
Also, if I now want to plot, say, a histogram of the BLUPs using the command "hist ebX1, freq", the graph is plotted for 51,032 observations, which is incorrect for my purposes. And I cannot use "if e(sample)" with the histogram command. I can create an indicator variable to mark the 40,595 observations and go from there, but I am curious why BLUPs were calculated for more observations than that.
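For reference, the indicator-variable workaround I have in mind would look roughly like the sketch below. The variable name insample is just a placeholder I made up, and the sketch assumes these lines are run right after the mixed command, while its estimation results are still in memory.

```stata
* Sketch only: run right after -mixed- so that e(sample) is still available.
* "insample" is a hypothetical name for the indicator variable.
generate byte insample = e(sample)

* Restrict the summary and the histogram to the estimation sample
summarize ebX1 if insample
histogram ebX1 if insample, frequency
```

Since insample is an ordinary variable, it stays in the dataset even after later estimation commands replace e(sample).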
thanks.