How does one compute a confidence interval for the mean of the predicted probabilities following a binary logistic regression? Should I, for example, use some kind of bootstrapping approach or the margins command?
To be more specific, I have a very large sample of individuals from the same country but from different geographic regions, and the number of observations per region differs widely. Using this sample, I run a binary logistic regression and predict each individual's probability of the outcome. Then I compute the mean of the predicted probabilities within each geographic region. The problem is how to calculate a confidence interval for that mean in each region.
The above concerns sample data. Should the calculation of the confidence interval for the mean of predicted probabilities be different if I instead have data for all individuals in the country? (Some argue that it is still relevant to talk about sampling error in this situation, since one could view the population as a sample from some kind of superpopulation.)
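To make the bootstrap option concrete, here is a rough sketch of what I have in mind, written in Python with NumPy only. Everything in it is an assumption for illustration: the data are simulated, the model has a single covariate, and the helper names (`fit_logit`, `region_means`, `bootstrap_ci`) and the 200 replicates are made up. It resamples individuals, refits the logit on each resample, recomputes the regional means, and takes percentile intervals.

```python
import numpy as np


def fit_logit(X, y, n_iter=25, tol=1e-8):
    """Fit a binary logit by Newton-Raphson. X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)              # score vector
        hess = X.T @ (X * w[:, None])     # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def region_means(beta, X, region, n_regions):
    """Mean predicted probability within each region (regions coded 0..n_regions-1)."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    sums = np.bincount(region, weights=p, minlength=n_regions)
    counts = np.bincount(region, minlength=n_regions)
    return sums / counts


def bootstrap_ci(X, y, region, n_regions, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the regional means.

    Individuals are resampled with replacement, so both the coefficients and
    the covariate mix within each region vary across replicates.
    Assumes every region shows up in every resample.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    point = region_means(fit_logit(X, y), X, region, n_regions)
    draws = np.empty((n_boot, n_regions))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        beta_b = fit_logit(X[idx], y[idx])
        draws[b] = region_means(beta_b, X[idx], region[idx], n_regions)
    lo, hi = np.percentile(draws, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return point, lo, hi


# Simulated stand-in for my data: three regions of very different size.
rng = np.random.default_rng(42)
sizes = np.array([50, 500, 2000])
region = np.repeat(np.arange(3), sizes)
x = rng.normal(loc=0.5 * region, scale=1.0)        # covariate shifts by region
X = np.column_stack([np.ones(len(x)), x])          # intercept + covariate
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 0.8 * x)))   # hypothetical true model
y = (rng.random(len(x)) < p_true).astype(float)

point, lo, hi = bootstrap_ci(X, y, region, n_regions=3)
for r in range(3):
    print(f"region {r} (n={sizes[r]}): mean p = {point[r]:.3f}, "
          f"95% CI = [{lo[r]:.3f}, {hi[r]:.3f}]")
```

As far as I understand, margins would instead report delta-method standard errors for the same quantity, so part of my question is whether the two approaches should be expected to agree, especially for the regions with few observations.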