  • PPMLHDFE absorb function

    Dear all,

    I am estimating a gravity equation with data at the country-pair, sector, and year level. In my regression on the traditional gravity variables (log of origin GDP, log of host GDP, log of the sum of GDPs, and log of distance), I add interactions between sector dummies (I have 25 sectors) and the gravity variables. I further add year, origin country*sector, and destination country*sector fixed effects. My question concerns the inclusion of the fixed effects via the absorb() option. Would it be correct to do it like this:

    Code:
    local gravity_sectorlevel lngdp_o lngdp_d lndistw lnsumgdp comcol col45 comlang_off lnsmp_dest naics2_1-naics2_24 lndistw_* lngdp_o_* lngdp_d_* lnsumgdp_*
    ppmlhdfe TotalassetsthUSD `gravity_sectorlevel', absorb(year country_origin_sector_encode country_dest_sector_encode) cluster(country_pair_encode)
    where the variable country_origin_sector_encode takes, e.g., the value "AGO52" if Angola invested in another country in sector 52, and the variable country_dest_sector_encode takes, e.g., the value "BRA52" if another country invested in Brazil in sector 52? With this, ppmlhdfe identified 1069 categories for the origin variable and 1071 categories for the destination variable. With codebook country_origin_sector_encode I get 1,620 unique values, and with codebook country_dest_sector_encode I get 1,585 unique values.

    Alternatively, I tried the following:

    Code:
    local gravity_sectorlevel lngdp_o lngdp_d lndistw lnsumgdp comcol col45 comlang_off lnsmp_dest naics2_1-naics2_24 lndistw_* lngdp_o_* lngdp_d_* lnsumgdp_*
    ppmlhdfe TotalassetsthUSD `gravity_sectorlevel', absorb(year iso3_o_encode#naics2 iso3_d_encode#naics2) cluster(country_pair_encode)
    which produces the same results. iso3_o is the ISO code for the origin country, iso3_d the ISO code for the destination country, and naics2 takes on the sector numbers.

    However, if I add the factor notation:

    Code:
    local gravity_sectorlevel lngdp_o lngdp_d lndistw lnsumgdp comcol col45 comlang_off lnsmp_dest naics2_1-naics2_24 lndistw_* //nicht
    ppmlhdfe TotalassetsthUSD `gravity_sectorlevel', absorb(year i.iso3_o_encode#c.naics2 i.iso3_d_encode#c.naics2) cluster(country_pair_encode)
    I get completely different coefficients, only 113 categories for i.iso3_o_encode#c.naics2 and 116 categories for i.iso3_d_encode#c.naics2, and missing values for all standard errors, confidence intervals, and p-values in the regression output.

    What would be the appropriate way to include my fixed effects in the ppmlhdfe command?

    Thank you in advance for any help!

    Best,
    Noemi

  • #2
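    In the last case, the c. prefix makes Stata treat naics2 as a continuous variable, so you are absorbing country-specific slopes on the sector code rather than country*sector fixed effects, which is not what you want to do. Please check the help file and also try to learn more about factor variables.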
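
    To illustrate the difference, here is a minimal sketch using the variable names from post #1. The regressor list is shortened for readability, and og_sector and dg_sector are hypothetical helper names that do not appear in the original code.

    ```stata
    * Sketch only: variable names come from post #1; og_sector and dg_sector are hypothetical.

    * (a) Categorical interaction: one intercept per origin*sector and destination*sector cell.
    ppmlhdfe TotalassetsthUSD lngdp_o lngdp_d lndistw lnsumgdp, ///
        absorb(year iso3_o_encode#naics2 iso3_d_encode#naics2) ///
        cluster(country_pair_encode)

    * (b) Same fixed effects, with the cell identifiers built by hand.
    egen long og_sector = group(iso3_o_encode naics2)
    egen long dg_sector = group(iso3_d_encode naics2)
    ppmlhdfe TotalassetsthUSD lngdp_o lngdp_d lndistw lnsumgdp, ///
        absorb(year og_sector dg_sector) ///
        cluster(country_pair_encode)

    * (c) NOT a fixed effect: c.naics2 treats the sector code as a number,
    *     so each country gets its own slope on naics2 instead of an intercept per cell.
    * absorb(year i.iso3_o_encode#c.naics2 i.iso3_d_encode#c.naics2)
    ```

    Specifications (a) and (b) should give the same estimates. In (c) there is one slope per country, which would be consistent with the 113 and 116 categories reported in post #1.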

    Best wishes,

    Joao



    • #3
      Dear Joao,

      thank you for the clarification, that was very helpful!

      Best
      Noemi



      • #4
        Dear all,
        I am working on bilateral interprovincial migration flows at the NUTS 3 level for a sample of European countries, and I am using ppmlhdfe (Correia et al., 2020). Along with standard variables explaining bilateral flows (per capita GDP, unemployment rate, and population, each at origin and destination), I also include origin#year and destination fixed effects to account for multilateral resistance to migration (Bertoli and Fernandez-Huertas Moraga, 2016; Ortega and Peri, 2013). As a post-estimation check, I test for cross-sectional dependence with xtcd2 and do find evidence of it.

        I have two questions:
        1) Is the presence of cross-sectional dependence a problem in this setting?
        2) Is there any way of dealing with residual cross-sectional dependence when using ppmlhdfe? In this regard, I found an unpublished paper by Desbordes and Eberhardt (2019) that uses an estimator they refer to as CCE-PPML. Unfortunately, it is not clear (at least to me) how such an estimator could be implemented.

        Best,
        Romano
