  • Gravity model using panel data from CEPII

    Hi,

    This is my first time using STATA and I am having a hard time getting the hang of it. I want to estimate a gravity model using panel data from the CEPII website.
    My data looks like this:
    [Image: Capture.PNG]

    What I am wondering is the following: how do I organize my data so that I can tell STATA it is panel data and add fixed effects for time, exporter and importer? I want iso_o to be the panelvar and Year the timevar. Also, how can I test for heteroskedasticity?

    Thank you in advance.

    Kind regards,
    two incompetent bachelor students

  • #2
    You can't have iso_o as panel identifier because (iso_o, Year) pairs occur more than once.

    I don't know what's standard for gravity models in that respect but others should be able to say more.
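A quick way to see this on your own data is to ask Stata whether the candidate identifier is unique. The sketch below uses a small made-up extract (the country codes and years are invented for illustration, not taken from the CEPII file); only the variable names iso_o, iso_d and Year come from the thread.

```stata
* Toy bilateral data (invented): each exporter appears once per partner in each year
clear
input str3 iso_o str3 iso_d int Year
"DEU" "FRA" 2000
"DEU" "SWE" 2000
"FRA" "DEU" 2000
"FRA" "SWE" 2000
"DEU" "FRA" 2001
"DEU" "SWE" 2001
"FRA" "DEU" 2001
"FRA" "SWE" 2001
end

* Count how often each (iso_o, Year) combination occurs
duplicates report iso_o Year

* isid exits with an error when the variables do not uniquely identify observations;
* capture stores the return code instead of stopping the do-file
capture isid iso_o Year
scalar rc_exporter = _rc
display "isid iso_o Year -> return code " rc_exporter

* The exporter-importer-year triple is unique in this toy data, so this passes silently
isid iso_o iso_d Year
```

On data shaped like this, the first check fails because every exporter-year appears once per trade partner, while the exporter-importer-year triple identifies each row.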


    • #3
      Dear Johanna,

      From what I understand, you do not really need to define the data as being a panel; you would need to do it if you wanted to estimate a model with country-pair fixed effects.

      From your post, I guess that you just want to include origin, destination, and time dummies and for that there is no need to define a panel.

      Best wishes,

      Joao


      • #4
        Dear Joao,

        Thank you for your help and quick response.
        Is it correct to do the following regression, if I want to include fixed effects for time, origin and destination:

        qui tab iso_o, g(_imp)
        qui tab iso_d, g(_exp)
        qui tab Year, g(_year)
        reg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU _year* _exp* _imp*, robust

        Thank you in advance.

        Sincerely,
        Johanna


        • #5
          Johanna,
          As Nick said, you can't define a panel on iso_o with bilateral trade flows (you could if you restricted the data to a single exporter (iso_o) or a single importer (iso_d), although you would have to encode your iso variable first).

          As João noted, you could instead define country-pair (exporter-importer specific) fixed effects: set up a panel where the individual variable is the country pair and the time variable is the year. You could then run a gravity model on trade flows with this panel structure; just keep in mind that the fixed effects would not be country fixed effects, but exporter-importer fixed effects.
          To define such a panel, follow this procedure:
          Code:
          egen countrypair=group(iso_o iso_d)
          xtset countrypair Year
          
          xtreg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU, fe
          If you want to include a fixed effect for each country (but not the pair-specific fixed effects), there is a much easier way than your code in #4 (which is most likely wrong).

          Code:
          reg  lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU i.Year i.exp i.imp
          This assumes exp and imp are numerical codes for iso_o and iso_d.
          It includes dummy variables for every exporter, importer and year, which capture everything that is exporter-specific, importer-specific and year-specific, so they act like fixed effects without your having to specify a panel structure.
          However, the output table might be hard to read with that many variables...
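One way to keep the table readable is to estimate the model with the dummies and then display only the coefficients of interest. This is a minimal sketch on synthetic data: every number is randomly generated, only a subset of the regressors is used to keep it short, and the numeric codes exp and imp are created directly rather than from iso_o and iso_d.

```stata
* Synthetic bilateral panel: 4 countries, 12 directed pairs, 5 years (all values are fake)
clear
set seed 12345
set obs 4
gen exp = _n                          // numeric exporter code
expand 4
bysort exp: gen imp = _n              // numeric importer code
drop if exp == imp                    // no trade with oneself
expand 5
bysort exp imp: gen Year = 1999 + _n  // years 2000-2004
gen lgdp_o   = rnormal()
gen lgdp_d   = rnormal()
gen ldistcap = rnormal()
gen lflow    = lgdp_o + lgdp_d - ldistcap + rnormal()

* Exporter, importer and year dummies via factor-variable notation;
* quietly suppresses the long default output table
quietly regress lflow lgdp_o lgdp_d ldistcap i.Year i.exp i.imp, vce(robust)

* Display only the coefficients of interest
estimates table, keep(lgdp_o lgdp_d ldistcap) b(%9.3f) se(%9.3f)

* Joint test of the exporter dummies
testparm i.exp
```

The dummies are still part of the model; they are only left out of the displayed table.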

          Best,
          Charlie


          • #6
            Hi Charlie,

            Thank you for the help! It was really helpful.

            I think I want to include the two fixed effects from both countries. I have been trying to create numerical codes for iso_o and iso_d so I can use the second code, however I don't think I've done it right. Should I use the gen command or the encode command?

            Thanks in advance!

            All the best,
            Johanna


            • #7
              Use:
              Code:
              encode iso_o, gen(exp)
              encode iso_d, gen(imp)
              Here you would use egen ... = group() only to create a numerical identifier for country pairs (but you don't want to do that, right?).
              Even if, technically,
              Code:
              egen exp=group(iso_o)
              egen imp=group(iso_d)
              would give you the same results as the first code, it just makes less sense to use it here.
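A small check of that equivalence, as a sketch on made-up ISO codes (the four rows are invented, and the names exp_g and imp_g are only illustrative): both commands assign codes in sorted order of the strings, so the numbers coincide, while encode additionally attaches the ISO codes as value labels.

```stata
* Toy data with string country codes (invented rows)
clear
input str3 iso_o str3 iso_d
"SWE" "DEU"
"DEU" "FRA"
"FRA" "SWE"
"DEU" "SWE"
end

* encode: numeric codes 1, 2, 3, ... in alphabetical order, labelled with the strings
encode iso_o, gen(exp)
encode iso_d, gen(imp)

* egen group(): same numbering, but no value labels
egen exp_g = group(iso_o)
egen imp_g = group(iso_d)

* The underlying numbers agree
assert exp == exp_g
assert imp == imp_g

* nolabel shows the numeric codes behind the encode labels
list iso_o exp exp_g, nolabel
describe exp exp_g
```

Note that exp and imp are numbered separately, so the same number refers to the same country in both only if both variables contain the same set of countries.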


              • #8
                Hi Charlie,

                I don't want to group the countries because I want the country-specific effects and want exp (numeric) to be my panelvar. When I try to set this up, STATA says

                . xtset exp Year
                repeated time values within panel

                Is there any way to fix this?

                Thank you!


                • #9
                  Is this about right? I want to create dummies for the importer and year.

                  egen exp=group(iso_o)
                  egen imp=group(iso_d)
                  tabulate imp, generate(dum)
                  tabulate Year, generate(dum)


                  • #10
                    tabulate iso_o, gen(dum) should get you there in one. Note that you need a different prefix, not dum in both groups.


                    • #11
                      Nick gave you good advice on generating the dummy variables.

                      However, I don't see how that helps with the problem raised in #8: to set up a panel structure with exporter as panelvar and Year as timevar, you would have to keep only one observation per country-year, and so lose the information on the importer.
                      Otherwise you'll always get the same error message from Stata about repeated time values within panel, since for each exporter every single year is duplicated as many times as there are trade partners.

                      If you want to keep the three distinct fixed effects (exporter, importer, year), see the last code in #5, which requires neither a panel structure to be defined nor dummy variables to be created separately (the i. prefix creates them virtually for the regression).
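The error can be reproduced on a tiny synthetic dataset (three hypothetical countries, two years; nothing here comes from the CEPII file). The sketch captures the return code of the failing xtset and then declares the country pair as the panel unit, as in #5.

```stata
* Synthetic bilateral data: 3 countries, 6 directed pairs, 2 years
clear
set obs 3
gen exp = _n                          // numeric exporter code
expand 3
bysort exp: gen imp = _n              // numeric importer code
drop if exp == imp
expand 2
bysort exp imp: gen Year = 1999 + _n  // years 2000 and 2001

* Each exporter has one row per partner in every year, so this fails;
* capture keeps the do-file running and stores the return code
capture xtset exp Year
scalar rc_exporter = _rc
display "xtset exp Year -> return code " rc_exporter

* With the country pair as panel unit, each (pair, Year) occurs exactly once
egen countrypair = group(exp imp)
xtset countrypair Year
```

With the pair as panel unit, the fixed effects from xtreg, fe are exporter-importer effects, not separate exporter and importer effects.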

                      Best,
                      Charlie


                      • #12
                        Thank you!

                        My adviser says that we should have the exporter, that is iso_o, as panelvar and the created index as timevar, then estimate using xtreg with STATA's fixed effects and manually add dummies for importer and year (dropping the dummy for the first year and for one importer). Does this sound reasonable?

                        I have had a problem with my hausman test; it is showing:



                        Could you have a look at my code and see if something is wrong:
                        gen lgdp_o=log(gdp_o)
                        gen lgdp_d=log(gdp_d)
                        gen lpop_o=log(pop_o)
                        gen lpop_d=log(pop_d)
                        gen ldistcap=log(distcap)
                        gen lflow=log(flow)
                        regress lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU
                        vif
                        vce, corr
                        estat hettest
                        estat imtest, white
                        generate index = _n
                        tsset index
                        estat dwatson
                        egen exp=group(iso_o)
                        xtset exp
                        xtreg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU
                        xtreg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU, fe
                        estimate store fe
                        xtreg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU, re
                        estimate store re
                        hausman fe re
                        findit xtserial
                        net sj 3-2 st0039
                        net install st0039
                        xtset exp index
                        xtserial lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU
                        tabulate iso_d, gen(imp)
                        tabulate Year, gen(year)
                        drop year1
                        drop imp1
                        xtreg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU imp* year*, fe robust
                        xtreg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU, fe
                        estimate store fe
                        xtreg lflow lgdp_o lgdp_d lpop_o lpop_d ldistcap border comlang_off EU, re
                        estimate store re
                        hausman fe re

                        Thanks again! you guys are HEROES


                        • #13
                          Johanna,
                          did you try to attach a picture of your Hausman test? We (I?) don't see it.
                          In any case, please take a look at the FAQ, and in further posts please consider:
                          -not showing images of Stata output, but writing (or copy-pasting) exactly what Stata reports. Here we don't know what's wrong with your Hausman test.
                          -using code delimiters to post code.
                          -using dataex to post your data (your dataset is freely available and too large for dataex, so please post an extract of it), so that we can test the same code on the same data.
                          -writing Stata and not STATA.

                          That being said, and since I cannot comment on your Hausman test, I'll go back to general comments:
                          What kind of index do you want to use as the timevar in the panel? For the panel to work, it should at least be a combination of year and importer (using egen group(), for example). But I'm very skeptical about creating such an index, then using fixed effects on it (so, on my assumption, both exporter and importer/year fixed effects) and then manually adding importer and year dummy variables, especially by actually creating those variables in the dataset (your data is large enough without adding 200+ variables to 10000+ observations).

                          You should probably wait for advice from others, but if I were to choose, I'd pick the solution I gave you earlier rather than your alternative.

                          Best
                          Charlie
                          Ps: Yes, heroes, or Stata nerds, that's just a matter of point of view, as I often say....


                          • #14
                            Dear @Joao Santos Silva
                            Several years later I have a question regarding this topic. I am currently working on my master's thesis, in which I want to estimate several gravity equations. I had trouble setting up my panel data and therefore ended up finding this topic. But I am still not entirely sure what to do. Hopefully you, or someone else of course, can help me.

                            First, I want to use a pooled OLS equation, if only as a first benchmark. Second, an equation where I introduce time fixed effects and both importer and exporter country fixed effects. And finally, an equation where I use time fixed effects, time-varying importer and exporter fixed effects, and country-pair fixed effects.
                            I wanted to set up my panel data using xtset, at which point the error "repeated time values within panel" started to occur. Searching for an answer, I found this old topic. Now, reading your answer from several years earlier, I wonder whether I could simply use reg for my first two equations, including the correct variables etc., and only use xtset countrypair year for the last equation? Or how could I set up my panel for the first two equations so that I can use xtreg with fixed effects?

                            Hopefully my story is clear and you could help me somehow.

                            Best,
                            Joel



                            • #15
                              Dear Joel Jansema,

                              First of all, note that you should use PPML, not linear regressions. Having said that, I suggest you consider the commands reghdfe and ppmlhdfe that should make it easy to deal with all these issues.
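For readers who want to try this with built-in commands first: PPML amounts to Poisson regression on the flow in levels with robust standard errors, so a version with exporter, importer and year dummies can be sketched with poisson. Everything below is synthetic. The commented ppmlhdfe lines assume that community-contributed package (with its dependencies ftools and reghdfe) has been installed from SSC, and the variables EU and countrypair in them are hypothetical here; they are shown for orientation only and should be checked against the package's help file.

```stata
* Synthetic bilateral panel: 5 countries, 20 directed pairs, 6 years (all values are fake)
clear
set seed 2024
set obs 5
gen exp = _n                          // numeric exporter code
expand 5
bysort exp: gen imp = _n              // numeric importer code
drop if exp == imp
expand 6
bysort exp imp: gen Year = 1999 + _n  // years 2000-2005
gen lgdp_o   = rnormal()
gen lgdp_d   = rnormal()
gen ldistcap = rnormal()
* Trade flow in levels (not logs); count-valued only for convenience
gen flow = rpoisson(5 + 2*(lgdp_o > 0) + 2*(lgdp_d > 0))

* PPML with exporter, importer and year fixed effects entered as dummies
poisson flow lgdp_o lgdp_d ldistcap i.exp i.imp i.Year, vce(robust)

* Assumed equivalents with the community-contributed command (not run here):
*   ppmlhdfe flow lgdp_o lgdp_d ldistcap, absorb(exp imp Year) vce(robust)
*   ppmlhdfe flow EU, absorb(exp#Year imp#Year countrypair) vce(cluster countrypair)
```

In the last commented specification, the exporter-year and importer-year effects absorb regressors that vary only at those levels (the GDP and population terms), and the pair effects absorb time-invariant pair variables such as distance, so only variables that vary by pair and over time remain identified.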

                              Best wishes,

                              Joao
