  • Programming issue with ppml when using iv and fixed effects

    Dear Sir/ Madam,
    I have tried everything I could under the sun and would really appreciate any help with this.

    I am using a gravity model to estimate the impact of international student stock in Australia on bilateral imports/ exports. I use panel data from 2002-2016 with the 50 countries who have the largest international student stock in Australia. I do this by sector (60 sectors), so I run 120 separate regressions.

    Using Stata 15.1 and the ivppml command by Joao Santos Silva, I get an error saying that I don't have enough instruments. However, I do have enough instruments: the command appears to drop the collinear country dummies from my instrument list before dropping them from my regressor list.

    I have a loop of 60 sectors that I am running, so it would be infeasible to manually remove collinear country-dummies for each sector loop, as for each regression different country dummies are collinear and excluded.

    Code for one example sector:
    Code:
    foreach y in a1213dairyprodpro {
        eststo: ivppml `y'e6 ln_intlse3 ln_mige3 ln_nexc ln_pope6 ln_rpercap ///
            year_2003 year_2004 year_2005 year_2006 year_2007 year_2008 year_2009 ///
            year_2010 year_2011 year_2012 year_2013 year_2014 year_2015 year_2016 ///
            rank_2 rank_3 rank_4 rank_5 rank_6 rank_7 rank_8 rank_9 rank_10 rank_11 ///
            rank_12 rank_13 rank_14 rank_15 rank_16 rank_18 rank_19 rank_20 rank_21 ///
            rank_22 rank_23 rank_24 rank_25 rank_26 rank_27 rank_28 rank_29 rank_30 ///
            rank_31 rank_32 rank_33 rank_34 rank_35 rank_36 rank_37 rank_38 rank_39 ///
            rank_40 rank_41 rank_42 rank_43 rank_44 rank_45 rank_46 rank_47 rank_48 ///
            rank_49 rank_50, ///
            inst(ln_isnawee3 ln_mige3 ln_nexc ln_pope6 ln_rpercap ///
            year_2003 year_2004 year_2005 year_2006 year_2007 year_2008 year_2009 ///
            year_2010 year_2011 year_2012 year_2013 year_2014 year_2015 year_2016 ///
            rank_2 rank_3 rank_4 rank_5 rank_6 rank_7 rank_8 rank_9 rank_10 rank_11 ///
            rank_12 rank_13 rank_14 rank_15 rank_16 rank_18 rank_19 rank_20 rank_21 ///
            rank_22 rank_23 rank_24 rank_25 rank_26 rank_27 rank_28 rank_29 rank_30 ///
            rank_31 rank_32 rank_33 rank_34 rank_35 rank_36 rank_37 rank_38 rank_39 ///
            rank_40 rank_41 rank_42 rank_43 rank_44 rank_45 rank_46 rank_47 rank_48 ///
            rank_49 rank_50)
    }
    
    Dropped instruments:  rank_23 rank_28 rank_30 rank_36 rank_39
    model is not identified; there are more parameters, 68, than instruments, 63
    r(481);
    Note that I have simplified my code to only include one sector (a1213dairyprodpro), instead of all 60 sectors.

    a1213dairyprodproe6- Dairy exports from Australia to country j (I use e6 to mean scaled down by 1 000 000)
    ln_intlse3- log higher ed international student stock from country j
    ln_mige3- log migrant stock from country j
    ln_nexc - log exchange rate Aus/ country j
    ln_pope6- log population in country j
    ln_rpercap- log real gdp per capita in country j
    rank- the numbers I assigned to different country string variables, denoting the "rank" in no. of international students in Aus


    When I manually dropped instruments for one sector, I got regression results, but at the end I got an error message:

    Code:
    Instruments for equation 1: ln_isnawee3 ln_mige3 ln_nexc ln_pope6 ln_rpercapgdp year_2003
        year_2004 year_2005 year_2006 year_2007 year_2008 year_2009 year_2010 year_2011
        year_2012 year_2013 year_2014 year_2015 year_2016 rank_2 rank_4 rank_6 rank_7 rank_8
        rank_9 rank_10 rank_13 rank_14 rank_15 rank_16 rank_18 rank_19 rank_20 rank_22 rank_24
        rank_25 rank_26 rank_27 rank_31 rank_32 rank_34 rank_37 rank_40 rank_42 rank_43 rank_44
        rank_45 rank_46 rank_47 rank_48 rank_49 rank_50 _cons
    equation 1 not found
    r(303);
    [But again this is not feasible to do 120 times]
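
    One possible way to avoid editing the lists by hand is sketched below. It assumes that the dummies being dropped are those that are collinear in the subsample with positive trade (for example, a country that never imports the good), and it uses Stata's _rmcoll to build one cleaned dummy list that is then passed to both the regressor list and inst(). The toy data and the names country_id, x and y are hypothetical, and this may not reproduce ivppml's internal check exactly.

    ```stata
    * Toy data (hypothetical): 4 countries x 5 years; country 3 never trades
    clear
    set seed 12345
    set obs 20
    gen country_id = ceil(_n/5)
    bysort country_id: gen year = 2001 + _n
    gen x = rnormal()
    gen y = cond(country_id == 3, 0, rpoisson(exp(1 + 0.5*x)))
    tabulate country_id, generate(rank_)

    * Keep only the dummies that are not collinear where y > 0;
    * forcedrop removes the collinear ones from the returned list
    _rmcoll rank_2 rank_3 rank_4 if y > 0, forcedrop
    local dums `r(varlist)'
    display "`dums'"

    * The same cleaned list would then go in both places, e.g. (not run here):
    * ivppml y x_endog x `dums', inst(z x `dums')
    ```

    Inside the sector loop, the _rmcoll line would be run once per dependent variable, so each regression gets its own list. This only addresses the programming issue; it says nothing about whether the resulting estimates are valid.
    
    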

    I have also tried:
    • gmm (error: flat or discontinuous region encountered)
    • ivpoisson gmm- (error: Hessian is not positive semidefinite)
    Would greatly appreciate any help!

    Thanks!!
    Alice
    [Note, apologies my first name is Alice, but Alice Li was not available, so had to use Alicee]

  • #2
    Dear Alice,

    I am afraid I have bad news for you:

    - These IV commands are not valid for models with fixed effects;
    - Your sample is very biased; if you were running a wage regression, would you be happy with a sample containing only the highest earners?

    Best wishes,

    Joao



    • #3
      Hi Joao,


      Thanks so much for replying! Could I please ask you two follow up questions?



      1. Would there be any way of estimating the equation with both iv and fixed effects with any command at all (ideally poisson)?

      I note that ivppml runs when I only use the iv and year dummies (but not when I add country dummies).

      I would like to include both because in 2016 the number of international students was equivalent to only about 2% of Australia's population (and that's after a boom in international student numbers), so I would need to remove a lot of "noise".





      I am currently using log-log OLS with fixed effects and robust s.e., but this is not ideal. With robust s.e., I don't even get significance on variables like per capita GDP and population for many sectors, and for international students and permanent migrants I get (for some sectors) insignificant coefficients with unexpected signs. With non-robust s.e., I get significant coefficients with the wrong signs for some sectors.
      Code:
      foreach y in a1213dairyprodpro{
          eststo: xtivreg ln_`y'e6 ln_mige3 ln_nexc ln_pope6 ln_rpercap year_2003 year_2004 year_2005 year_2006 year_2007 year_2008 year_2009 year_2010 year_2011 year_2012 year_2013 year_2014 year_2015 year_2016 (ln_intlse3=ln_intlstnawe), fe vce(robust)
          }


      2. Do you have any suggestions on what I could do? Would really appreciate it!!



      As for taking the 50 countries with the largest stock of international students (as of 2016):

      I do this because the stock of international students needs to reach a certain point for there to be an effect. Australian merchandise trade figures do not include purchases <$1000 (for imports) and <$2000 (for exports), so (in the case of imports) there can only be an effect when retailers or something larger-scale decides that there is enough demand to start selling these items. So I do believe I am capturing the entire market.



      Thanks so much,

      Alice



      • #4
        Hi Joao Santos Silva,
        Sorry, my mistake: ivppml does not run when I use only the IV and year dummies. ivpoisson gmm does, but only for some of the 60 regressions (the rest again give the error "Hessian is not positive semidefinite").
        I was wondering whether this is a Stata programming limitation or something inherent in the PPML estimator, and why that would be?

        Thanks so much,
        Alice



        • #5
          Dear Alice,

          I am not aware of any command that would allow you to obtain consistent estimates of a gravity equation if it includes fixed effects and estimation is done by IV.

          I am still not convinced that your sample will lead to meaningful results, but only you know exactly whether it serves your purpose.

          The difference between ivppml (which is not a supported command) and ivpoisson is that ivppml checks for the existence of the estimates and ivpoisson does not. So, by not dropping some of the instruments, ivpoisson may be leading you to unreliable results.

          I do not know why you are doing this but depending on your objectives it may be better for you to simplify the problem, for example assuming away the endogeneity problem.

          Best wishes,

          Joao



          • #6
            Dear Joao Santos Silva ,

            Thanks so much for your help. Sorry, may I ask a few follow-up questions again?

            (1)
            Why would I not be able to obtain consistent estimates of a gravity equation if it includes all of:
            1. iv
            2. year dummies
            3. country dummies

            Is there by any chance a journal article or the like explaining why they cannot be used simultaneously?

            (2)
            To clarify what you mean by assuming away the endogeneity problem, do you mean:
            a. Only using IV, no year dummies or country dummies, and instead just use the standard gravity controls (distance, common language etc.); or
            b. Only use the year and country dummies, and not the IV; or
            c. some other combination of iv, year dummies, country dummies and standard gravity controls?

            (and that I should continue using PPML estimator rather than anything else?)

            Note that I have a very strong IV (international student stock in North America and Western Europe)



            My objective is to understand how international student stocks affect bilateral imports/exports. I live in Melbourne, where around 70% of CBD residents are not citizens, and there are Asian food shops everywhere around the CBD. This is owing to the large presence of international students studying at universities in the CBD. (There's a similar phenomenon in Sydney as well.) Milk powder products are often missing from store shelves as Chinese students ship these in bulk to China (although the latter would not be captured in my results, as these shipments are under $2000).

            Thanks so much for everything!
            Alice



            • #7
              Hi Joao Santos Silva,
              (From the above, I am also assuming that by fixed effects you mean country dummies and year dummies, and that IV cannot be used with either.)
              Thanks so much,
              Alice



              • #8
                Dear Alice,

                On 1), models with fixed effects generally suffer from the Incidental Parameters Problem (IPP) and cannot be estimated consistently. There are only a very small number of models that are immune to this problem, including the linear model and Poisson regression. I assume that Poisson regression with IV suffers from the IPP and therefore is not consistent.
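
                As a brief illustration of why plain Poisson regression escapes the problem, here is the standard argument in sketch form, assuming a conditional mean $E[y_{it} \mid x_{i}, \alpha_i] = \exp(\alpha_i + x_{it}'\beta)$ with strictly exogenous regressors (the notation is introduced here for illustration only). The fixed effects can be solved out of the first-order conditions, so the estimating equation for $\beta$ no longer involves them:

                ```latex
                % First-order condition for the country effect \alpha_i
                \sum_t \left( y_{it} - \exp(\alpha_i + x_{it}'\beta) \right) = 0
                \quad\Longrightarrow\quad
                \exp\!\big(\hat\alpha_i(\beta)\big) = \frac{\sum_t y_{it}}{\sum_t \exp(x_{it}'\beta)}

                % Substituting into the first-order condition for \beta
                \sum_i \sum_t x_{it} \left( y_{it}
                  - \frac{\exp(x_{it}'\beta)}{\sum_s \exp(x_{is}'\beta)} \sum_s y_{is} \right) = 0
                ```

                The second equation has expectation zero at the true $\beta$ whatever the values of $\alpha_i$, which is why the number of fixed effects does not matter in this case. Whether a similar cancellation holds for the IV moment conditions is exactly the open point raised above.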

                On 2) you can either do a or b. This would be fine for something like an undergrad final project, but not if you want to advise policy on this.
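
                For concreteness, option (b) could be sketched with official Stata commands as follows. The simulated data and the names country_id, year, ln_students and exports are hypothetical stand-ins for the real variables; the sketch ignores endogeneity entirely, as option (b) implies.

                ```stata
                * Simulated panel (hypothetical): 50 countries, 2002-2016
                clear
                set seed 2024
                set obs 50
                gen country_id = _n
                gen alpha = rnormal()
                expand 15
                bysort country_id: gen year = 2001 + _n
                gen ln_students = 0.5*alpha + rnormal()
                gen exports = rpoisson(exp(1 + alpha + 0.3*ln_students))

                * Poisson with country fixed effects and year dummies, no IV
                xtset country_id year
                xtpoisson exports ln_students i.year, fe vce(robust)
                ```

                Running poisson with i.country_id and i.year should give the same slope estimates, since the Poisson fixed-effects estimator coincides with the dummy-variable estimator.
                
                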

                Best wishes,

                Joao



                • #9
                  Dear Joao Santos Silva,
                  Thanks heaps and sorry, one last clarification.
                  Although I'm not advising policy, I would like to write a thesis of that quality.
                  If you were advising policy, what would you suggest I do and why?
                  Thanks so much!
                  Alice



                  • #10
                    Dear Alice, I am afraid I cannot help you much; if I were tasked with that I would have to think long and hard about it and may not even be able to find a solution. Best wishes, Joao




                      • #12
                        Dear Joao,
                        Thanks so much for all your help! Really grateful for this!
                        Thanks,
                        Alice

