  • modeling problem

    Dear users,

    I have been a PhD student in Belgium since 2023. I am new here and still need to learn how things work, but I guess this is the right place for this post.

    Since I started, my knowledge of econometrics has increased somewhat, which is very useful for my research, but I do not claim to be an expert, so I apologize for any mistakes. I am writing because I have difficulties modeling a framework that I have in mind. I want to assess the differences in the drivers of two linked binary decisions: decision 1 is made (0/1) and, only if decision 1 equals 1, decision 2 is made (0/1). The aim is to compare the units that decide 0 with those that decide 1, at both steps.

    I wanted to rely on panel data, but the issue is that fixed-effects non-linear models identify effects from switchers only. I therefore cannot compare never-switchers (always 0) with switchers (which rarely switch), and this holds at both steps.

    To account for the link between the two binary decisions, I have thought of using two-step models. The first is the heckprobit model, which is a probit with sample selection (a bivariate probit with selection), if I am not wrong. This model treats the first decision as a selection process, which matches the sequential nature of the framework. However, if I use panel data with fixed effects, only the switchers contribute to the likelihood. For the outcome equation this is not a problem, because selection into the sample is corrected by the first-step equation. But the first step, which corrects the selection in the second step, is itself subject to selection, since only switchers contribute to its estimation. Finding a good instrument (an exclusion restriction) is also not an easy task.

    Simpler bivariate models, which do not assume selection but still allow correlated errors, are subject to the same drawbacks. Hurdle models such as the Cragg model are designed for a count or continuous non-negative second-stage outcome, not for a binary one. The second decision could be recast in that form, but then there is a selection problem, because the second step uses only the units that passed the first decision. The same problem arises if I employ a rare-event model: as far as I know, no joint estimator for multiple equations exists, so the two binary decisions would have to be regressed separately. Separate regressions would leave out an important part of the process, namely that the two decisions are linked and that one depends on the other. I also considered duration models, but they do not estimate quite what I want, since they evaluate the timing of the event rather than its determinants.

    Using the Mundlak device would partially reintroduce the comparison with the never-switchers through the unit-averaged covariates. Nevertheless, not all models allow random effects, and when they do, estimation sometimes fails to converge (due to the low rate of 1s). The results would also reflect correlation rather than causation, but I guess that is not an important issue here.
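
    To fix ideas, the probit model with sample selection discussed above can be sketched as follows. The notation is my own assumption and not taken from a specific source: $z_{it}$ and $x_{it}$ are hypothetical covariate vectors for country $i$ in year $t$, and the instrument would be a variable included in $z_{it}$ but excluded from $x_{it}$.

```latex
\begin{align*}
d_{1,it} &= \mathbf{1}\left[ z_{it}'\gamma + u_{1,it} > 0 \right]
  && \text{(decision 1, selection)} \\
d_{2,it} &= \mathbf{1}\left[ x_{it}'\beta + u_{2,it} > 0 \right]
  && \text{(decision 2, observed only if } d_{1,it} = 1\text{)} \\
(u_{1,it}, u_{2,it}) &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)
\end{align*}

\begin{align*}
\ln L = {} & \sum_{d_{1,it}=0} \ln \Phi\left(-z_{it}'\gamma\right)
  + \sum_{d_{1,it}=1,\, d_{2,it}=1}
    \ln \Phi_2\left(x_{it}'\beta,\; z_{it}'\gamma;\; \rho\right) \\
& + \sum_{d_{1,it}=1,\, d_{2,it}=0}
    \ln \Phi_2\left(-x_{it}'\beta,\; z_{it}'\gamma;\; -\rho\right)
\end{align*}
```

    Here $\Phi$ is the standard normal CDF and $\Phi_2$ the bivariate standard normal CDF with correlation given by its last argument. As far as I understand, in this pooled form the observations with $d_{1,it} = 0$ still enter the likelihood through the first sum. A Mundlak-type version would add the country means $\bar{z}_i$ and $\bar{x}_i$ to the two indices.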
    I am now starting to think of relying on between variation, and hence on cross-sectional data, instead of staying with panel data, although the risk is a drastic reduction of the sample. The units of observation are countries, which are not numerous.
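
    To make the cross-sectional idea concrete, here is a minimal Python sketch of how the panel could be collapsed to one row per country. Everything in it is an assumption for illustration: the toy data, the variable names (`d1`, `d2`, `x`) and the "ever decided 1" definition of the collapsed outcomes are hypothetical, not my actual data or a definitive choice.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical country-year panel: (country, year, d1, d2, x).
# d2 is None whenever d1 == 0, because decision 2 only exists after decision 1.
panel = [
    ("A", 2000, 0, None, 1.0),
    ("A", 2001, 0, None, 3.0),
    ("B", 2000, 0, None, 2.0),
    ("B", 2001, 1, 0, 4.0),
    ("C", 2000, 1, 0, 5.0),
    ("C", 2001, 1, 1, 7.0),
]


def collapse_to_cross_section(rows):
    """Collapse a panel to one row per country (between variation only)."""
    by_country = defaultdict(list)
    for country, _year, d1, d2, x in rows:
        by_country[country].append((d1, d2, x))

    cross_section = {}
    for country, obs in by_country.items():
        # 1 if the country ever took decision 1, else 0 (a never-switcher).
        ever_d1 = int(any(d1 == 1 for d1, _, _ in obs))
        # Decision 2 is defined only for countries that ever took decision 1.
        ever_d2 = (
            int(any(d1 == 1 and d2 == 1 for d1, d2, _ in obs)) if ever_d1 else None
        )
        cross_section[country] = {
            "ever_d1": ever_d1,
            "ever_d2": ever_d2,
            # Country mean of the covariate (the same average a Mundlak term uses).
            "x_mean": mean(x for _, _, x in obs),
        }
    return cross_section


cross_section = collapse_to_cross_section(panel)
for country, row in sorted(cross_section.items()):
    print(country, row)
```

    In this toy example country A is a never-switcher that stays in the sample for the first decision, while only B and C are available for the second decision, which illustrates how quickly the sample shrinks.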

    To summarize, here are my questions:
    - Do you know of any model that fits this framework using panel data?
    - Given that the never-switchers (always 0) are non-informative, what is the point of using panel data for binary choice models, if it leads to selection bias because we regress only on a sample of switchers?
    - Is using the between variation in the context of countries genuinely problematic?

    Thank you in advance for your help.

    Best regards,
    Ammar suliman

  • #2
    Maybe biprobit in the cross-section.
