  • PPML,panel data

    Dear Statalist users,

    I'm preparing my master's thesis, which aims to assess the impact of several factors on international patent collaborations, using a panel dataset of 14 countries over 23 years. I would like to estimate the model with the ppml estimator, but I do not know how to combine ppml with panel data.

    I have run the following code:

    gen code=0
    replace code=1 if country=="CA"
    replace code=2 if country=="CN"
    replace code=3 if country=="DE"
    replace code=4 if country=="GB"
    replace code=5 if country=="HK"
    replace code=6 if country=="IL"
    replace code=7 if country=="IN"
    replace code=8 if country=="JP"
    replace code=9 if country=="KR"
    replace code=10 if country=="MY"
    replace code=11 if country=="NL"
    replace code=12 if country=="SG"
    replace code=13 if country=="TH"
    replace code=14 if country=="US"
    xtset code year

    (1) ppml dep.var. indep.var. year dummies. country dummies.
    (2) ppml dep.var. indep.var. year dummies.
    (3) ppml dep.var. indep.var.

    I'm wondering whether (1), (2), and (3) are correct, or whether there is other code I should use.

    Thank you all in advance.
    Best regards,
    Jason Hsu
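
    The manual gen/replace recoding above can also be done in one step. A minimal sketch, assuming country is a string variable holding the two-letter codes listed above and that no variable named code exists yet; -encode- numbers the categories in alphabetical order, which here reproduces the same 1 to 14 coding:

    ```stata
    * Build a labelled numeric panel identifier from the string variable
    * (run this instead of the gen/replace lines, not after them)
    encode country, generate(code)
    * Declare the panel structure: country identifier and time variable
    xtset code year
    ```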

  • #2
    Dear Jason,

    First of all, thanks for using -ppml-

    What you are doing is correct, but in the first case you can speed up the estimation by using
    Code:
    xtset country
    xtpoisson dep.var. indep.var. year dummies, fe
    All the best,

    Joao


    • #3
      Dear João,

      Thank you very much for the reply

      I have another question.

      I have used panel data and run the following code:


      xtset country year
      (1) ppml dep.var. indep.var
      (2) xtpoisson dep.var indep.var

      The results of regressions (1) and (2) are not the same.

      How do I account for panel data with -ppml-?

      Thank you!
      All the best,
      Jason Hsu.


      • #4
        Dear Jason,

        The commands that should produce the same results are as follows:
        Code:
        xi:ppml dep.var. indep.var i.country
        and
        Code:
        xtset country year
        xtpoisson dep.var indep.var
        All the best,

        Joao


        • #5
          Dear João,

          Thank you for your time and attention.

          I have run the following two commands:

          Code:
          xi:ppml dep.var. indep.var i.country

          and

          Code:
          xtset country year
          xtpoisson dep.var indep.var



          I encountered some problems.

          My variables were dropped and omitted.

          How do I deal with it?

          My best regards
          Jason Hsu.

          • #6
            Jason: Probably some of your variables don't vary over time for any country, and so Stata decides arbitrarily to drop some variables -- in this case, some of the year dummies. This is why you should use xtpoisson with the fe option. Then you will know for sure that some variables have no time variation because they will be dropped. xtpoisson uses a kind of within transformation rather than estimating coefficients on dummies.

            Also, you should use a robust variance matrix:

            xtpoisson y x1 ... xK, fe vce(robust)

            With small N (N = 14) this estimator is difficult to justify, but it's better than assuming the Poisson distributional assumption holds and that you don't have serial correlation. Why do you insist on using ppml when the xtpoisson command now does what you want?

            As a final comment, I wouldn't believe most results unless you include a full set of year dummies:

            xtpoisson y x1 ... xK i.year, fe vce(robust)

            JW
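
            One way to see in advance which regressors will be dropped under fixed effects is to inspect their within-country variation. A minimal sketch, assuming the panel has already been declared with -xtset- and that y, x1 and x2 are placeholder variable names: a within standard deviation of zero in the -xtsum- output means the variable does not vary over time and cannot be identified alongside country fixed effects.

            ```stata
            * Decompose each regressor's variation into between and within components
            xtsum x1 x2
            * Fixed-effects Poisson with a full set of year dummies and robust standard errors
            xtpoisson y x1 x2 i.year, fe vce(robust)
            ```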


            • #7
              Sorry, Jason, in my second post I forgot the -fe- option in the -xtpoisson- command. If you do this, -ppml- and -xtpoisson- will give you exactly the same estimates (but not the same standard errors, more on this below).

              About the dropped variables, if you include the country dummies, or country fixed effects, variables that vary only by country (such as distance) will be dropped.

              Finally, on Jeff's question about why it may be interesting to use -ppml- instead of -xtpoisson-, the following example may help to clarify the usefulness of -ppml-:

              Code:
              use http://privatewww.essex.ac.uk/~jmcss/mock
              xi: ppml y x z i.w
              xtset w
              xtpoisson y x z, fe
              This example illustrates three points:

              a) The estimates of the coefficient on x are the same; this is as expected;

              b) -xtpoisson- does not recognize that the coefficient on z is not identified and tries to estimate it; -ppml- correctly drops that regressor and the observations with z==1;

              c) As far as I understand, -xtpoisson- with robust standard errors always clusters by the variable defining the panel, which may not make sense (as is the case here). That is, -xtpoisson- does not allow you to compute the usual robust standard errors, which are the default in -ppml-.

              So, although it is computationally more expensive, there are cases where -ppml- is preferable to -xtpoisson-, even in situations where in principle the results should be the same.

              All the best,

              Joao
              Last edited by Joao Santos Silva; 11 Jul 2015, 11:53. Reason: Included example and extended discussion.


              • #8
                To: Jeff

                Thank you very much for the reply.

                I will try your suggestions.

                As for why I insist on using ppml: I am wondering whether the results of -ppml- and -xtpoisson- are the same with panel data.


                To: João

                Again, thank you for your time and attention.

                The example helps me to understand the usefulness of -ppml-.

                Best regards,
                Jason Hsu.


                • #9
                  PS: if someone tries to replicate the example in #7, please note that the data is now available here: http://personal.lse.ac.uk/tenreyro/mock.dta
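
                  For convenience, the example from #7 can be rerun against the new location as sketched below. This assumes the file at the new address is the same mock dataset and that -ppml- has been installed (it is a user-written command available from SSC):

                  ```stata
                  * ppml is user-written; install once with: ssc install ppml
                  * Load the mock data from its new location
                  use http://personal.lse.ac.uk/tenreyro/mock.dta, clear
                  * Poisson PML with explicit dummies for w
                  xi: ppml y x z i.w
                  * Fixed-effects Poisson with w as the panel variable
                  xtset w
                  xtpoisson y x z, fe
                  ```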


                  • #10
                    Hello everyone,
                    I used xi:ppml dep V inde V i.country i.year for my analysis, but Stata reports "varlist required". Why is that?


                    • #11
                      Dear Dewmi,

                      Please show us exactly what you typed.

                      Joao


                      • #12
                        Hello Joao,
                        Thank you very much for your kind responses in this forum. I have learnt a lot from this topic http://www.statalist.org/forums/foru...pml-panel-data. However, I still have some questions about my own data. I hope to receive an answer from you as well as from other Stata experts here.
                        I analyze the determinants of bilateral FDI between 40 countries over 11 years. FDI flows from A to B differ from flows from B to A, so I have 40 x 39 = 1560 pairs. My data have a lot of zeros, and thus I want to use the PPML estimator. Here is what I did:
                        * panel data:
                        xtset pair year
                        * ppml estimation
                        xi: ppml FDI indvars i.year i.pair, cluster(dist) // (1) This code does not work because the calculation exceeds the matsize limit of my Stata IC version. But it works if I drop the i.pair dummies:
                        xi: ppml FDI indvars i.year, cluster(dist) // (2)
                        My questions are as follows:
                        1. Should I include the dummies for pairs of countries in the model? The dummy for AB is different from the one for BA.
                        2. If I want to include the dummies for pairs and my Stata cannot run (1), is it true that the following command does the same?
                        xtpoisson realstock $list2 gdpdif year*, fe // but in this case, the cluster variable is not distance but pair ? (3)
                        3. I am confused by the robust option. If I add 'robust' to equation (3), all variables of interest become statistically insignificant. Meanwhile, with ppml, where robust standard errors are the default (is this why the robust option is not allowed with ppml?), most of my independent variables are significant, as I want (but of course this is ppml without dummies for pairs). Could you please give me some advice on whether or not to use the robust option here?
                        4. Among (1), (2), and (3), which one do you recommend for my estimation? I ran the RESET test for all of them; the p-value for (2) is 0.001, and for (3) with robust it is 0.07. What should I do in this situation?
                        Thank you very much for your time and I hope to hear from you soon!
                        Best regards,
                        Ann




                        Last edited by Ann Ng; 09 Apr 2017, 19:07.


                        • #13
                          Dear Joao,

                          How can I run the RESET test in a panel Poisson pseudo-maximum likelihood model? The estat ovtest command does not run.

                          Best regards


                          • #14
                            Dear Helen Makrin,

                            I believe we describe how to do it on our webpage.

                            Best wishes,

                            Joao
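
                            For readers who want a starting point, the usual RESET construction adds the square of the fitted linear index to the model and tests its significance. The lines below are a sketch of that general idea, not a quotation from the webpage; y, x1, x2, country and year are placeholder names, and the fixed effects are entered as explicit dummies so that the fitted index includes them:

                            ```stata
                            * Estimate the model with country and year dummies
                            xi: ppml y x1 x2 i.country i.year
                            * Fitted linear index and its square
                            predict double fit, xb
                            generate double fit2 = fit^2
                            * Re-estimate with the squared index added, then test its significance
                            xi: ppml y x1 x2 i.country i.year fit2
                            test fit2 = 0
                            ```

                            A small p-value for fit2 suggests that the functional form is misspecified.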


                            • #15
                              Dear Ann Ng,

                              1 - Whether or not you need to include the pair fixed effects is a modeling question and depends on what you want to do, so only you can answer that question.

                              2 - Indeed, if you want to include the pair fixed effects you can use -xtpoisson- and cluster by pair.

                              3 - You always need to cluster by pair (or distance). As you say, by default -ppml- reports robust standard errors but these are not clustered by pair, so you need to explicitly use the clustering option.

                              4 - (1) and (3) should give you the same results; the choice between these and (2) is really up to you.

                              Best wishes,

                              Joao
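
                              Point 2 can be sketched as follows. This is an illustration only: FDI and indvars are the placeholder names from post #12 (indvars stands for the actual list of regressors), and it assumes a Stata version in which -xtpoisson, fe- accepts vce(robust):

                              ```stata
                              * Declare the panel: directed country pair and year
                              xtset pair year
                              * Pair fixed effects are conditioned out, so no i.pair dummies are needed
                              * and the matsize limit is not an issue; with the fe estimator,
                              * vce(robust) clusters on the panel variable (pair)
                              xtpoisson FDI indvars i.year, fe vce(robust)
                              ```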
