Hello everyone,
this is my first post, so please be kind and understanding if I don't meet the forum norms.
My original regression was: reg amount12 ib1.lng_origins c.pca_generaltrst1 ib6.religion controls
The outcome variable is amount12 = amount remitted in the past 12 months, lng_origins = language origins in SA (Sotho, Venda, Tsonga, etc.) and religion = atheists, Christians, etc.
The first problem is that if I only look at the amount remitted among those who remit, I may have selection bias. I cannot use a Heckman model because my selection equation does not contain a variable that is excluded from the second stage, so the exclusion restriction is violated.
Then I talked to a professor, and he said that I should simply recode the missing values in the amount remitted to zeros, because those people are not remitting any amount. So I did that. I also recoded the missings to zeros in two other variables that I want to include as controls, because I figured that otherwise Stata only uses the observations that are non-missing on every variable, whereas to account for selection bias it has to use all the observations, right? These are the controls that I recoded: (1) relationship to remittance receiver, (2) frequency of remittances.
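To make the recoding step concrete, this is roughly what it looks like in Stata. It is only a sketch: new_amount12 is the recoded outcome used further down, while rel_receiver and freq_remit are placeholder names for the two recoded controls, not the actual variable names in my data.

```stata
* Sketch only: rel_receiver and freq_remit are placeholder variable names
* Copy the outcome and set missing amounts to zero (non-remitters)
generate new_amount12 = amount12
replace new_amount12 = 0 if missing(amount12)

* Same recoding for the two controls that are only observed for remitters
replace rel_receiver = 0 if missing(rel_receiver)
replace freq_remit = 0 if missing(freq_remit)

* Check how many zeros the outcome has after recoding
count if new_amount12 == 0
```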
Now I can't use OLS because the error terms are not normally distributed and I have a lot of zeros, which is why I thought I might be able to use a zero-inflated negative binomial regression. In inflate() I would then plug in the specification from my logit regression (all variables and controls, without the outcome variable) that estimated whether a person remits or not:
zinb new_amount12 ib1.pop_lngorigins c.pca_generaltrst1 ib6.religion controls, inflate(ib1.pop_lngorigins c.pca_generaltrst1 ib6.religion other controls)
Unfortunately, the inflate equation does not give me the same or even similar results as the logit regression that I had already run. Why is that? Can I still use the coefficients that I get for the amount remitted?
Please note that this is a master's thesis and that it does not have to be perfect (I would like it to be, but I am pretty much new to these models, so I think it is normal that it will not be perfect right away). Thank you so much in advance for your help!