
  • International Trade_Gravity Regression

    Hello Everyone,

    I am running an empirical analysis of how the accession of two countries, Bulgaria and Romania, to the EU affected their imports from and exports to countries that already had FTAs with the EU, using the CEPII gravity dataset.
    Could you please check whether my code below is appropriate? I am afraid it captures all countries rather than the specific relationship described above.
    1. What should my next steps be to ensure the validity of my results? Should I drop all irrelevant observations and keep only the exports of Bulgaria and Romania to these countries?
    2. What other regressions should I run, and what other econometric analyses could I use?

    Code below :
    *load dataset*
    use "C:\Users\themi\Desktop\master theis\DATA\Gravity_V202202.dta"
    *generate log variable*
    keep if year>= 1990
    gen lgdp_o=log(gdp_o)
    gen lgdp_d=log(gdp_d)
    gen lpop_o=log(pop_o)
    gen lpop_d=log(pop_d)
    gen ldistcap=log(distcap)
    gen ln_tradeflow_baci=log(tradeflow_baci)
    gen ltradeflow_comtrade_o=log( tradeflow_comtrade_o)
    gen ltradeflow_comtrade_d=log(tradeflow_comtrade_d)
    gen lgdgcap_o=log( gdpcap_ppp_o)
    gen lgdgcap_d=log( gdpcap_ppp_d)
    rename ltradeflow_comtrade_o exportsBulgaria_Romania
    label variable exportsBulgaria_Romania "Exports of Bulgaria and Romania"
    *create pair*
    gen exporter=(country_id_o == "BGR" | country_id_o == "ROU" )
    gen importer=(country_id_d == "DZA" | country_id_d == "AND" | country_id_d == "AZE" | country_id_d == "CHL" | country_id_d == "EGY" | country_id_d == "FRO" | country_id_d== "JOR" | country_id_d == "LBN" | country_id_d == "MEX" | country_id_d == "MAR" | country_id_d == "NOR" | country_id_d == "PSE" | country_id_d == "SMR" | country_id_d == "TUN" | country_id_d == "TUR")
    *create FE*
    egen exp_time=group(exporter year)
    tabulate exp_time, generate(EXPORTER_TIME_FE)
    egen imp_time=group(importer year)
    tabulate imp_time, generate(IMPORTER_TIME_FE)
    egen countrypair_id=group(exporter importer)
    tabulate countrypair_id, generate(PAIR_FE)
    *run PPML regressions*
    ppml exportsBulgaria_Romania rta lgdp_o lgdp_d lpop_o lpop_d ldistcap lgdgcap_o lgdgcap_d EXPORTER_TIME_FE* IMPORTER_TIME_FE* PAIR_FE*

  • #2
    Dear Themis Nedeltchev,

    Although I wrote ppml, I recommend that you use ppmlhdfe to deal with the fixed effects; make sure you check the examples in the help file. You will see that some of your variables will drop because they are collinear with the fixed effects.

    Best wishes,

    Joao



    • #3
      Dear Professor Silva,

      Thank you for the suggestion. I ran the regression with ppmlhdfe. My only concern is whether I should include only the pair FE,
      ppmlhdfe exportsBulgaria_Romania eu_o wto_d rta wto_o lgdp_o lgdp_d lpop_o lpop_d ldistcap lgdgcap_o lgdgcap_d, a(PAIR_FE*)
      or all three sets of FE,
      ppmlhdfe exportsBulgaria_Romania eu_o wto_d rta wto_o lgdp_o lgdp_d lpop_o lpop_d ldistcap lgdgcap_o lgdgcap_d, a(EXPORTER_TIME_FE* IMPORTER_TIME_FE* PAIR_FE*)
      The only issue with the second specification is that the EU dummy is omitted, which is not particularly helpful for my analysis.
      Best regards,
      Themis



      • #4
        Dear Themis Nedeltchev,

        I suggest that you create the fixed effects inside the absorb option as in the help file. For example, you can do a(exporter#year importer#year exporter#importer).
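
A minimal sketch of that suggestion, assuming the CEPII variables from the first post (country_id_o, country_id_d, year, rta, tradeflow_comtrade_d) are in memory. The names exp_id, imp_id and pair_id are hypothetical, and clustering by pair is an assumption rather than part of the advice above. Interactions written with # need numeric variables, so the string country codes are converted first:

```stata
* Numeric identifiers for the string country codes (hypothetical names)
egen exp_id  = group(country_id_o)
egen imp_id  = group(country_id_d)
egen pair_id = group(country_id_o country_id_d)

* Fixed effects are created inside absorb(); no dummies are generated by hand.
* The dependent variable is the trade flow in levels, not its log.
ppmlhdfe tradeflow_comtrade_d rta, absorb(exp_id#year imp_id#year exp_id#imp_id) vce(cluster pair_id)
```

With exporter-year and importer-year effects absorbed, regressors such as lgdp_o or lpop_d vary only at those levels and are dropped as collinear; with pair effects, ldistcap is dropped as well.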

        As for the EU dummy, I do not know what you want to do, but typically it is equal to 1 if both partners are members (and you need observations before and after accession).
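
As a rough sketch of that definition, assuming the CEPII membership indicators eu_o and eu_d are 0/1 variables in the data (eu_o appears earlier in the thread; eu_d is assumed to be its destination counterpart), with eu_both as a hypothetical name:

```stata
* Equals 1 only when origin and destination are both EU members in that year
gen byte eu_both = (eu_o == 1 & eu_d == 1) if !missing(eu_o, eu_d)

* Check that the dummy varies over time for the two exporters of interest,
* i.e. that there are observations before and after accession
tabulate year eu_both if inlist(country_id_o, "BGR", "ROU")
```

Because eu_both varies by pair and year, it is not automatically collinear with exporter-year or importer-year effects the way eu_o alone is.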

        Best wishes,

        Joao



        • #5
          Thank you for your prompt response. I ran it and it works well.
          I am wondering which measure of exports you would advise using in my panel data analysis, as CEPII offers:
          1. tradeflow_comtrade_o: Trade flows as reported by the origin, in 1000 current USD. Source: Comtrade, bilateral.
          2. tradeflow_comtrade_d: Trade flows as reported by the destination, in 1000 current USD. Source: Comtrade, bilateral.
          3. tradeflow_baci: Trade flow, in 1000 current USD. Source: BACI, bilateral.
          Thanks again for the help.
          Best regards.
          Themis



          • #6
            I would go for 2, but I am not an expert on that...



            • #7

              Make sure to also run the following code: keep if exporter == 1 & importer == 1. But since you're using a gravity model framework, consider including other relevant factors like fencing, peanut butter with pasta, or popcorn, which is heavily consumed by the Latvian population (just in case you have that country in your case).



              • #8
                Hello again. As I have 2 exporting countries and population is collinear with the FE, which causes it to be omitted, would you advise me to drop it or to center it?



                • #9
                  Just drop it; its effect is captured by the FE.



                  • #10
                    Thank you for the tip.
                    I would greatly appreciate your insights on the following issues:

                    1. Log Transformation of Tariff Rates:
                    In my econometric analysis, I am considering using a logarithmic transformation of tariff rates. There is some debate in the literature on whether the log transformation should be log(tariff) or log(1 + tariff). The latter approach is sometimes used to handle zero tariff rates and to normalize the distribution.
                    • Question: Should the log transformation of tariff rates be expressed as log(tariff) or log(1 + tariff)? What are the theoretical and empirical justifications for each approach?
                    2. Collinearity Between Non-Tariff Barriers and Fixed Effects:
                    In my model, I am also including non-tariff barriers (NTBs) along with fixed effects (FE) for countries and time periods. However, I am concerned about potential collinearity between NTBs and the fixed effects, which might undermine the robustness of my estimates.
                    • Question: What strategies would you recommend to address collinearity between non-tariff barriers and fixed effects in a panel data setting?



                    • #11
                      On 1, that is an empirical question and I do not think either of those approaches is satisfactory. I would prefer to just use log, replace the missings caused by log(0) by zeros, and also include in the model a dummy for those cases. The dummy will give the effect of changing the tariff from 0 to positive and log tariff will give you the effect of changes in positive tariffs.
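
The approach described for point 1 could be sketched as follows. The toy data and the names tariff, zero_tariff and ln_tariff are hypothetical and only there to make the example runnable; in practice the same three generate/replace lines would be applied to the real tariff variable:

```stata
* Toy data: "tariff" is a hypothetical tariff rate, with one zero and one missing
clear
input double tariff
0
2.5
10
.
end

* Dummy flagging the zero-tariff cases (stays missing where tariff is missing)
gen byte zero_tariff = (tariff == 0) if !missing(tariff)

* ln(0) is missing in Stata, so take the log for positive tariffs only
gen double ln_tariff = ln(tariff) if tariff > 0 & !missing(tariff)

* Replace the missings caused by log(0) by zeros
replace ln_tariff = 0 if tariff == 0

list
```

Both ln_tariff and zero_tariff would then enter the regression together, so that zero-tariff observations are kept in the sample instead of being dropped.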

                      On 2, I would not worry too much about that because you will probably have to live with it anyway.



                      • #12
                        Thank you so much, that works better. I have one last question: if I have a dataset with exports and imports by industry, should I run
                        ppmlhdfe trade_e new_FTA_EU l_avragetariff, a(id year industry)
                        or first create egen _d = group(country_id_o country_id_d industry) and then run
                        ppmlhdfe trade_e new_FTA_EU l_avragetariff, a(id year)?



                        • #13
                          I am not sure if I understand your notation/data, but a structural model would include origin##year##industry and destination##year##industry fixed effects. That is, the fixed effects should vary by year and industry, which is not the same as including year and industry fixed effects.
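
A sketch of that structure, using the variable names from the previous post (trade_e, new_FTA_EU, l_avragetariff) and assuming that country_id_o, country_id_d, year and industry identify each observation. The group variables oyi and dyi are hypothetical names, and building them with egen instead of interaction operators inside absorb() is only a convenience here:

```stata
* One group per origin-year-industry cell and per destination-year-industry cell
egen oyi = group(country_id_o year industry)
egen dyi = group(country_id_d year industry)

* The fixed effects vary jointly by country, year and industry
ppmlhdfe trade_e new_FTA_EU l_avragetariff, absorb(oyi dyi)
```

Any regressor that varies only by origin, year and industry (or only by destination, year and industry) is absorbed by these effects and will be dropped.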



                          • #14
                            I understand. The issue is that I have only 2 exporting countries, so the FTA dummy is collinear with the origin##year##industry and destination##year##industry fixed effects and is omitted when I apply them.



                            • #15
                              Dear Statalisters,

                              I have a question about the ppmlhdfe command. I am currently writing my thesis on OECD FDI inflows (and exchange rate volatility), combining OECD FDI data with CEPII gravity data, and I would like to know how to create time-invariant country and country-pair fixed effects, as well as time fixed effects. The variables in my model are the volatility of the real effective exchange rate (VREER), the real effective exchange rate (REER), and the GDP and GDP per capita of the origin and destination countries. The countries are identified by ISO numeric codes for origin and destination. Here is my example code:

                              Code:
                              egen pair_id = group(isonum_o isonum_d)
                              Code:
                              xtset pair_id year
                              Code:
                              ppmlhdfe FDIflow lnvreer lnreer lngdp_o lngdp_d lngdpcap_o lngdpcap_d, absorb(isonum_o isonum_d pair_id year) cluster(pair_id)
                              [Image: Fixed Effects Model.png]

                              When I run this code, the sign of the REER coefficient does not conform to what the literature suggests (it should be negative). However, when I remove the fixed effects and include time-invariant characteristics such as colonial dependence, common language and contiguity, the regression results look more similar to the literature:

                              Code:
                              ppmlhdfe FDIflow lnvreer lnreer lngdp_o lngdp_d lngdpcap_o lngdpcap_d lndist contig comlang_off col_dep_ever, cluster(pair_id)
                              [Image: Classic Model.png]


                              My R-squared also decreases significantly. Would you have any advice or suggestions? Perhaps I have coded my fixed effects incorrectly. Finally, as some of my FDI flows are negative (representing disinvestment), I am following the Baier (2020) approach of converting "negative flow values to zero and exclude missing values, as explained in Welfens/Baier (2019)" (BREXIT and Foreign Direct Investment: Key Issues and New Empirical Findings, p.58), as PPML cannot deal with negative values. Is this the correct approach?
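
The conversion described above could be sketched as follows, using the variable names from this post; FDIflow_nn is a hypothetical name. This only illustrates the mechanics and does not settle whether the approach is appropriate:

```stata
* Copy the flow, then set negative values (disinvestment) to zero
gen double FDIflow_nn = FDIflow
replace FDIflow_nn = 0 if FDIflow < 0

* Missing values stay missing (a missing value is never < 0 in Stata)
* and are excluded from the estimation sample automatically
ppmlhdfe FDIflow_nn lnvreer lnreer lngdp_o lngdp_d lngdpcap_o lngdpcap_d, absorb(pair_id year) vce(cluster pair_id)
```

The sketch absorbs only pair_id and year because the separate isonum_o and isonum_d effects are nested within the pair effects, so listing them in absorb() adds nothing.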

                              Any advice would be greatly appreciated!

                              Ronan

