Dear Community,
I'm really stuck on a problem with a logit regression that I'm running. My dataset has around 32 independent variables. Most are dummy variables, and many of those encode categories (e.g. for a variable with 3 categories, I use one category as the reference and include 2 dummy variables). Some of the remaining variables take values from 0 to 200 or more. I have around 100 observations, and I suspect this small sample might be the cause of the problem I'm facing.
Dummy variables: sr_dummy, human, age5, age40, age50 (categorical, reference category not included), freq115, freq115_150, freq150_11000, freq1100000 (categorical, reference category not included), treat_low, treat_high (categorical, reference category not included), inn, devst_mid, devst_late (categorical, reference category not included), org, fund, goal_40k_300k, goal_1m (categorical, reference category not included), qual, eff, phtm, vid, res, plat, cure.
Regular variables (min-max of possible values): phd (0-5), wmort (0-100), wleng (0-1706), wcomm (0-303), wtwit (0-2155), wupd (0-28), wback (0-2083).
When I run the logit regression, some of the variables are omitted, no coefficients are reported at all, and the pseudo R2 is 1. I cannot understand why. The first two photos are screenshots of the logit regression output.
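For context, the pseudo R2 that Stata reports for logit and probit is McFadden's measure, 1 - lnL(model) / lnL(constant-only model). The sketch below, in Python rather than Stata, shows why a value of 1 means the fitted probabilities reproduce the outcome almost exactly. The function names and the toy numbers are illustrative assumptions and are not taken from the dataset described here.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def mcfadden_r2(y, p):
    """McFadden pseudo R2: 1 - lnL(model) / lnL(constant-only model)."""
    p_bar = sum(y) / len(y)  # constant-only model predicts the sample mean
    ll_null = log_likelihood(y, [p_bar] * len(y))
    return 1.0 - log_likelihood(y, p) / ll_null


# Toy outcome and two hypothetical sets of fitted probabilities.
y = [0, 0, 0, 1, 1, 1]
overlapping = [0.3, 0.4, 0.6, 0.4, 0.7, 0.8]          # imperfect fit
eps = 1e-9
near_perfect = [eps, eps, eps, 1 - eps, 1 - eps, 1 - eps]  # outcome almost reproduced

r2_overlapping = mcfadden_r2(y, overlapping)
r2_near_perfect = mcfadden_r2(y, near_perfect)
print(r2_overlapping, r2_near_perfect)
```

With about 32 regressors and about 100 observations, a fit this close to perfect is plausible, which is one possible reading of a pseudo R2 of 1.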


However, when I run the probit regression, Stata takes a couple of minutes to process it (16,000 iterations). After 16,000 iterations (flagged "not concave"), it reports that convergence was not achieved. The pseudo R2 is again 1.
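One common cause of an iteration log that never converges is perfect separation, where some combination of regressors predicts the outcome exactly, so no finite maximum likelihood estimate exists. Whether that applies here is an assumption, not something the output above confirms. The sketch below uses Python rather than Stata and a logit link with plain gradient ascent (not Stata's optimizer). The helper `fit_slope` and the toy data are hypothetical and not from the dataset described here.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_slope(x, y, steps, lr=0.5):
    """Gradient ascent on the logit log likelihood for one slope, no constant."""
    b = 0.0
    for _ in range(steps):
        # Score of the log likelihood with respect to the slope.
        grad = sum((yi - sigmoid(b * xi)) * xi for xi, yi in zip(x, y))
        b += lr * grad
    return b


x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
separated = [0, 0, 0, 1, 1, 1]    # sign of x predicts y exactly
overlapping = [0, 1, 0, 1, 0, 1]  # outcomes overlap across x

# Estimated slope after 10, 100, and 1000 iterations.
slopes_separated = [fit_slope(x, separated, s) for s in (10, 100, 1000)]
slopes_overlapping = [fit_slope(x, overlapping, s) for s in (10, 100, 1000)]
print(slopes_separated)
print(slopes_overlapping)
```

In the separated case the slope keeps growing with every additional iteration, while in the overlapping case it settles at a finite value. In Stata, cross-tabulating the outcome against each dummy would be one way to look for cells that predict the outcome perfectly.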

.........


I cannot wrap my mind around why logit and probit need such different numbers of iterations. I also cannot understand why the pseudo R2 is so high, why so many variables are omitted, and why coefficients are reported for only some of the variables.
I would really appreciate it if you could help clarify the questions above.
Cheers