Dear All,
I would be very grateful if anyone could help me. Maybe other people have the same issue.
I have cross-sectional micro-data with 2,000 observations. I want to look at the effect of a categorical variable (three categories) on a continuous variable. I have strong reasons to think that this categorical variable is endogenous.
So I have:
Y = B0 + B1*CAT_2 + B2*CAT_3 + exogenous variables
It could be argued that the categorical variable is ordered, in the sense that there is a latent variable increasing over the categories (although I could also make the case that the variable is not ordered). So I need an ordered or multinomial first stage. I have two instruments (I could stretch it to three, although my third isn’t as plausible).
So my question is how to estimate the system to infer causality. If I just had one binary endogenous regressor I could use treatreg, or biprobit or just estimate the whole thing as LPMs.
Could I just bootstrap the following?
. oprobit CAT Z1 Z2 X
. predict PCAT_1 PCAT_2 PCAT_3, p
. regress Y PCAT_2 PCAT_3 X
or run the following series of LPMs:
. ivregress 2sls Y X (CAT_2 CAT_3 = Z1 Z2)
but I don’t think ivregress allows for the fact that the categories are mutually exclusive and ordered.
I have looked at cmp, but I am not sure how to set it up.
I have run:
. cmp (Y = X CAT_2 CAT_3) (CAT_2 = X Z1 Z2) (CAT_3 = X Z1 Z2), ind($cmp_cont $cmp_probit $cmp_probit)
but again I don’t think this allows for the fact that the categories are mutually exclusive and ordered.
So I think I need an ordered (or multinomial) first stage, but I don’t know how to specify it.
Something like this?
. cmp (Y = X ?) (CAT = X Z1 Z2), ind($cmp_cont $cmp_oprobit)
What should I put in the first equation? How would I interpret it?
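To make the bootstrap idea above concrete, here is a minimal sketch of what I have in mind. The program name twostep, the number of replications and the seed are my own placeholders, and X stands for the list of exogenous variables. I am not sure this two-step approach gives a consistent estimator; the sketch only shows how both stages would be resampled together.

```stata
* Hypothetical wrapper program (the name twostep is a placeholder)
capture program drop twostep
program define twostep
    * First stage: ordered probit of the endogenous category on instruments and controls
    oprobit CAT Z1 Z2 X
    * Predicted probabilities for each of the three categories
    predict double PCAT_1 PCAT_2 PCAT_3, pr
    * Second stage: predicted probabilities replace the category dummies
    regress Y PCAT_2 PCAT_3 X
    * Drop the predictions so the program can run again on the next bootstrap sample
    drop PCAT_1 PCAT_2 PCAT_3
end

* Resample both stages together; reps() and seed() values are arbitrary
bootstrap _b, reps(500) seed(12345): twostep
```

The point of wrapping both commands in one program is that the first-stage estimation error is carried into the second-stage standard errors, which a plain regress on the predicted probabilities would ignore.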
Any help would be greatly appreciated. It's my first time posting here so apologies if I have not been clear enough.
Best wishes,
Vincent