  • #76
    Dear Joao,

    Thanks for your reply. I'm still unsure why Poisson wouldn't be a good idea for my model. In another post of yours, I saw that "If you do not have many negative observations and these are of small magnitude, then ppml may still be fine". I think that still describes my case. The only problem is that, when I run the regression without requiring migration flows (i.e., stocks t - stocks t-1) to be positive, Stata gives an error saying that migration flows "must be greater than or equal to zero".

    Your feedback is greatly appreciated. Regards,

    Ainhoa


    • #77
      If the mean of y is negative for some countries, that looks like more than just a few small negative numbers...

      Best wishes,

      Joao


      • #78
        As noted before, this is the case for the specific country I'm interested in, and looking more broadly into all the countries of the sample, "only" 20% of my sample contains negative migration flow values.

        Ainhoa


        • #79
          I would say 20% is a lot, but it all depends on the rest of the data.

          Best wishes,

          Joao


          • #80
            Yeah... In any case, I'm still unsure about the error I get for my dependent variable, which Stata says needs to be non-negative for the regression to run. Apologies for so many questions; I highly appreciate your help.

            Ainhoa


            • #81
              Ainhoa: You simply should not be using Poisson regression when your dependent variable can take on negative values. The mean function has an exponential form, and so would never predict a negative value. It's logically impossible for the exponential model to be correct.

              Moreover, the quasi-log likelihood is generally not going to behave well with negative values. Why are you insisting on using an inappropriate method? You're better off with a linear model that can predict positive and negative net migration flows.

              If you want to use the exponential model estimated by Poisson quasi-MLE, you could use the ratio of migration flows, provided you never divide by zero.


              • #82
                Dear Jeff Wooldridge,

                I agree that Poisson is unlikely to be useful in this case, but I do not think I agree with your initial sentence:

                "You simply should not be using Poisson regression when your dependent variable can take on negative values. The mean function has an exponential form, and so would never predict a negative value. It's logically impossible for the exponential model to be correct."

                If this were correct, logit and probit would not be appropriate for binary data because their mean function can never take the values 0 or 1. Likewise, Poisson regression may be valid even when the dependent variable takes negative values. To illustrate this, consider the following code, where I generate data with an exponential mean function and a dependent variable that can take negative values; I then use Poisson regression (implemented using the gmm command) to estimate the parameters:

                Code:
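                cls
                clear all
                capture program drop ppml_gmm
                
                * Poisson pseudo-ML estimated by GMM: the moment condition only uses
                * E(y|x) = exp(xb + b0), so y is not required to be non-negative
                * (the vce() option is parsed but not passed on to gmm)
                program define ppml_gmm
                version 14
                syntax varlist(numeric min=1) [if] [in] , [ vce(string)]
                marksample touse
                tempname y _rhs
                * Split varlist into the dependent variable and the regressors
                gettoken y _rhs: varlist
                * Starting value for the constant: log of the sample mean of y
                su `y' if `touse', meanonly
                local mu=log(r(mean))  
                gmm (`y' - exp({xb: `_rhs'} + {b0})) if `touse' , ///
                instruments(`_rhs') from (b0 `mu') onestep  ///
                deriv(/xb = -1*exp({xb:} + {b0})) deriv(/b0 = -1*exp({xb:} + {b0}))
                end
                
                * Simulated data: exponential conditional mean plus normal noise
                set obs 1000
                set seed 123456
                g x=rnormal()
                g y=exp(1+x)+rnormal()
                
                * Estimate, then summarize y to check its range
                ppml_gmm y x
                su y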
                Would you agree that the relevant condition for standard Poisson regression to be valid is that the mean function has an exponential form, and that there is no restriction on the range of the dependent variable?
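
The condition in question can be written out as a short sketch. The notation below ($x_i$, $\beta$, $\hat{\beta}$) is assumed for illustration and is not taken from the thread; the regressor vector $x_i$ is taken to include a constant. The Poisson pseudo-ML estimator solves a set of first-order conditions that involve only the residual from the exponential mean, so the corresponding population moments hold whenever the conditional mean is correctly specified, whatever the support of $y$:

```latex
% First-order conditions solved by the Poisson pseudo-ML estimator
\sum_{i=1}^{n} x_i \left( y_i - \exp(x_i'\hat{\beta}) \right) = 0

% If E(y_i | x_i) = exp(x_i' beta), iterated expectations give
E\left[ x_i \left( y_i - \exp(x_i'\beta) \right) \right]
  = E\left[ x_i \left( E(y_i \mid x_i) - \exp(x_i'\beta) \right) \right]
  = 0
```

These are the same moments that the gmm call in the code above appears to use, with the regressors and a constant as instruments. Nothing in them requires $y_i \ge 0$.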

                Best wishes,

                Joao


                • #83
                  Joao: As a technical point, I agree. As a practical point, I think it's a bad idea to use Poisson regression with an exponential mean function when outcomes can be negative. And it seems weird to push the virtues of Poisson regression for nonnegative outcomes relative to a linear model and then not admit that the linear model is likely a better choice when the outcome can take on both signs.

                  Of course you can generate data where the estimates are still consistent. But how realistic is that? Consider the migration example. What if Ainhoa reversed the order of the countries in the definition of the migration flows? Now 80% would be negative values. Is Poisson regression still a good idea? I doubt that it would even converge in that case because you might actually be finding a minimum rather than a maximum of the quasi-likelihood function. And yet, what has changed about the problem? Do the estimates of the parameters change in a predictable way when you replace y by -y? We know what will happen with a linear model.

                  And with an unobserved effect, it's especially weird with multiplicative heterogeneity. With a linear model, we know exactly what will happen if we replace y with -y.
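
A brief sketch of the sign-reversal argument, in assumed matrix notation ($X$ for the regressor matrix, $\hat{\beta}(\cdot)$ for the OLS estimator as a function of the outcome vector; neither symbol appears in the thread). With a linear model, replacing $y$ by $-y$ simply flips the sign of every coefficient, while an exponential mean cannot describe both $y$ and $-y$:

```latex
% Linear model: OLS is linear in y, so the estimates just change sign
\hat{\beta}(y) = (X'X)^{-1} X'y
\quad\Longrightarrow\quad
\hat{\beta}(-y) = (X'X)^{-1} X'(-y) = -\hat{\beta}(y)

% Exponential mean: if E(y | x) = exp(x beta) > 0, then
E(-y \mid x) = -\exp(x\beta) < 0 ,
\qquad\text{whereas}\qquad
\exp(x\gamma) > 0 \ \text{ for every } \gamma
```

So if the exponential mean is correct for $y$, no parameter vector $\gamma$ makes it correct for $-y$, which is one way to read the concern about the arbitrary ordering of country pairs.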

                  I don't like the analogy with logit/probit. We know in that case we're modeling a conditional probability, which needs to be between zero and one. And, we can use the estimated probabilities to obtain a sensible prediction rule, such as predict a zero if phat < 0.5. Prediction is certainly not everything, but combined with the other problems mentioned above I don't know why one would use E(y|x) = exp(x*b) when y can be negative.


                  • #84
                    Dear Jeff Wooldridge,

                    I am glad we agree on the main point. I also agree that in this case Poisson regression does not appear to be appropriate, and that was exactly what I was saying above (see, e.g., #75).

                    More generally, I also agree that in practice using Poisson regression when the outcome can be negative is unlikely to be a good idea. More precisely, I am not aware of any example where the data can take negative values and Poisson regression is a sensible approach.

                    The analogy with the logit/probit is possibly a question of taste, but even in standard count data models with a mass point at zero the mean cannot be zero. So, arguing that it is logically impossible for a model to be correct when the mean cannot take some values in the support of the dependent variable looks like going a bit too far.

                    Best wishes and thanks for always providing food for thought,

                    Joao


                    • #85
                      Joao: I agree on the point about the mean and the support of y. In fact, the first example I teach in probability is to roll a fair, 6-sided die and note the expected value is 3.5. Thanks for emphasizing that. So it’s something else bothering me that I didn’t articulate well. I guess with net migration and trade flows one would expect large magnitudes of both signs, and it’s arbitrary which it is for a given pair.


                      • #86
                        Joao: I agree on the point about the mean and the support of y. In fact, the first example I teach in probability is to roll a fair, 6-sided die and note the expected value is 3.5. I guess I find it weird if the mean function cannot span the upper and lower bounds of the support no matter the parameter values. We know for a count variable we can make the mean as close to zero as we want. If y can take on -10 then we can’t get close to that with an exponential mean function. Maybe I’m limited by my imagination. With net migration and trade flows, one would expect large magnitudes of both signs, and it’s arbitrary which it is for a given pair.


                        • #87
                          Dear Jeff,

                          Modern trade models always work with exports and these are non-negative; I believe that standard migration models also generally work with gross flows which should be a standard count. In both of these cases Poisson is ideal.

                          The migration study discussed early in this thread deals with net migration flows and in that case we agree that Poisson is not a good idea. A similar situation occurs with FDI because I understand that the data available is for net flows and that can be negative as well. Again, in that case Poisson is unlikely to work.

                          In short, although the fact that y has negative values does not imply that Poisson regression is inadequate, I am not aware of any example where y has negative values and Poisson is a sensible approach. Maybe one of the readers in this forum can provide an example?

                          Best wishes and thank you for contributing to this forum,

                          Joao


                          • #88
                            Dear Joao and Jeff,

                            Thanks a million for your feedback. I've changed the specification of my dependent variable in order to cover your concerns, which make a lot of sense, especially considering that the country I'm particularly interested in contains a number of negative flows with one of its main migration partners.

                            Kind regards,

                            Ainhoa


                            • #89
                              Hello, I am a student preparing a paper that uses the gravity model to explain my country's exports to 31 partner countries over 16 years. The data contain zero values, and I think the PPML estimator is adequate for this situation.
                              I have read the articles on this topic as well as the comments on Statalist. They were very instructive, but unfortunately when I apply the commands for panel data nothing works. In addition, the time-invariant variables, which are very important for my study, are dropped from the model. I read that we should not use country fixed effects because they absorb time-invariant effects.
                              I have a single exporter and several importers. Do I use importer fixed effects only, or exporter fixed effects too?
                              Which should I use, robust or cluster, and what are the options?
                              If someone can help me set up the commands, I would be grateful. Otherwise, I would appreciate documentation on applying the PPML estimator to panel data.
                              Thank you for your help.
                              Cordially.


                              • #90
                                Hello.

                                I am a Master's student currently researching the impact of colonial penetration on Spanish trade via the trade-immigration link, distinguishing at the HS-6 level so as to classify goods on the basis of their trade elasticities. As a brief context, my data includes 2002-2019 yearly data and around 81,000 observations (inclusive of zero traded flows). Thus far, I have utilized -xtreg- with time dummies as follows:

                                Code:
                                xtreg LImports LDistcap Contig WTO_both GATT_both Lrgdp Com_lang Low_Col Int_Col High_Col Lmignocol i.Year, robust cluster(LDistcap)
                                Given the known issues of biased estimates in OLS, would it make sense to additionally run a -ppml- or -xtpoisson- model?

                                Thank you in advance,

                                BM.




