
  • Isabel Cour
    replied
    Dear Joao,

    First of all, thank you for your help. Second, I apologize if my message was not clear enough.
    Regarding your second comment, exporter and importer GDP were already rescaled: I was using the log of GDP in millions of dollars, so I did not understand why the WARNING message appeared when the coefficients are already small.
    I understand why these variables are dropped. Perhaps it happens in this case because I am using a small database with few countries, but the model specification is correct for this research question. You made a good point, and thank you very much again for the reminder. I will try to add other control variables.



  • Joao Santos Silva
    replied
    Dear Isabel,

    I am not sure I have understood all your questions, but here is my attempt to help:

    1 - The fact that you do not have zeros does not make it OK to use OLS in logs; indeed, the zeros are just a very minor problem. Therefore, I expect the OLS and PPML results to be very different and, of course, the PPML results are much more reliable.

    2 - You do not have to rescale the variables, but you can do it. For example, instead of using log of GDP in thousands of dollars, you can use log of GDP in millions of dollars. If the estimator converges, there is no need to worry about this.

    3 - If you only have one importer, distance and exporter GDP will be collinear with the exporter fixed effects and need to be dropped; the same happens if you use OLS. There may be other variables being dropped for the same reason, again just like in OLS. You need to think carefully about what you are doing because you risk interpreting coefficients that are meaningless.

    Best wishes,

    Joao
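
    A minimal sketch of points 2 and 3 above, using made-up data. It uses official Stata's poisson with vce(robust), which is the Poisson pseudo-maximum-likelihood estimator, so that it runs without extra installs; all variable names and numbers are hypothetical. It illustrates two things. First, moving from log GDP in thousands to log GDP in millions subtracts the constant ln(1000), which shifts only the intercept and leaves the slope unchanged. Second, with a single importer, distance is constant within each exporter and importer GDP is constant within each year, so the exporter and year dummies absorb them.

```stata
* Toy panel: 6 exporters, 1 importer, 5 years. All numbers are invented.
clear
set seed 12345
set obs 30
egen exporter = seq(), from(1) to(6) block(5)
egen year     = seq(), from(2001) to(2005)

* Exporter GDP varies by exporter and year (log of GDP in thousands of dollars)
generate lngdp_exp_thou = 32 + 0.5*exporter + 0.05*(year - 2001) + rnormal(0, 0.2)
* Same variable in millions of dollars: a shift by the constant ln(1000)
generate lngdp_exp = lngdp_exp_thou - ln(1000)

* With one importer, distance varies only by exporter, importer GDP only by year
generate lndist    = 8 + 0.3*sqrt(exporter)
generate lngdp_imp = 27 + 0.1*sqrt(year - 2000)

generate trade = exp(-15 + 0.8*lngdp_exp + rnormal(0, 0.3))

* Point 2: rescaling changes the constant, not the slope
quietly poisson trade lngdp_exp_thou, vce(robust)
scalar b_thou = _b[lngdp_exp_thou]
quietly poisson trade lngdp_exp, vce(robust)
scalar b_mill = _b[lngdp_exp]
display "slope, thousands: " b_thou "   slope, millions: " b_mill

* Point 3: with exporter and year dummies, Stata omits one term from each
* collinear group; whichever coefficient is still reported for lndist or
* lngdp_imp is not identified separately from the dummies
poisson trade lngdp_exp lngdp_imp lndist i.exporter i.year, vce(robust)
```

    The notes about omitted terms printed by the last command are the same kind of drops described in point 3.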



  • Isabel Cour
    replied
    Dear Joao,

    I am running a pooled OLS and, to check robustness, I am also using the PPML method. Total import and export flows do not contain any zeros, but trade at the sector level does. When I use the total import flow, I get a WARNING to rescale lngdp for two independent variables: lngdp_exporter (6 countries) and lngdp_importer (1 country). How can I rescale such small coefficients?
    I rescaled and ran PPML again. The results show that distance, one exporter dummy, one importer dummy, importer GDP, and two year dummies are dropped. Although it is a control variable, gdp_importer is part of the research question, as is distance. Why are they dropped?
    Here is a sample of the data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double lnimpo float(lnintus lngdp1 lngdp2 lndist) byte(exporter_1 exporter_2 exporter_3 exporter_4 exporter_5 exporter_6 importer_1)
     20.65069580078125  .5743148  26.37296 27.817717 9.857967 1 0 0 0 0 0 1
    20.970928192138672  .9706464  26.31685 27.917883 9.857967 1 0 0 0 0 0 1
    20.937944412231445  1.525122  25.34863 28.010767 9.857967 1 0 0 0 0 0 1
     21.72722816467285 1.8245493 25.587696  28.13175 9.857967 1 0 0 0 0 0 1
    21.903419494628906 1.9878744 25.934366  28.29461 9.857967 1 0 0 0 0 0 1
    end
    Any suggestions are very welcome. Thanks in advance. Kind regards



  • Joao Santos Silva
    replied
    Dear Felipe,

    First of all, forget the FGLS estimation because that is simply inadequate.

    About your model, I think you should use clustered standard errors. Also, your sample is rather small, but maybe you could try to include the usual "fixed effects".

    Best wishes,

    Joao
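
    A minimal sketch of the two suggestions above, clustered standard errors and the usual fixed effects, on made-up data. The variable names (origin, dest, pair, flow) are hypothetical, and official Stata's poisson is used so that the example runs without extra installs; with ppml, the analogous call would pass dummies created by tabulate, generate() together with the cluster() option, as in other posts in this thread. Clustering by origin-destination pair is one common choice and is an assumption here, not something stated in the post.

```stata
* Toy intra-regional panel: 10 origins x 10 destinations x 4 periods (invented data)
clear
set seed 2024
set obs 400
egen origin = seq(), from(1) to(10) block(40)
egen dest   = seq(), from(1) to(10) block(4)
egen period = seq(), from(1) to(4)
drop if origin == dest                      // no internal flows in this toy example

* Cluster identifier: one group per origin-destination pair
egen pair = group(origin dest)

* Hypothetical regressors in logs
generate lndist  = ln(50 + 100*abs(origin - dest))
generate lngdp_o = 10 + 0.2*origin + 0.02*origin*period
generate lngdp_d = 10 + 0.2*dest   + 0.02*dest*period

* Flows in levels (not logs), as PPML requires
generate flow = exp(1 + 0.7*lngdp_o + 0.7*lngdp_d - lndist + rnormal(0, 0.5))

* Poisson PML with origin and destination fixed effects and
* standard errors clustered by pair
poisson flow lngdp_o lngdp_d lndist i.origin i.dest, vce(cluster pair)
```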



  • JFelipe PinedaG
    replied
    Hi Joao, I wonder if you could help with some doubts I have about an intra-regional gravity model.

    I have a panel with 4 periods, and my dependent variable is total trade in kilograms.

    This is the Stata code and the results.

    ppml L_KL_TOTALES_deptos L_PIBtotal2016pr_origen L_PIBtotal2016pr_destino L_Distancia_geodésica L_remoteness_origen L_remoteness_destino frontera_pais_origen Zonas_francas_destino puert
    > o_marítimo_destino puerto_marítimo_origen d_frontera_depto Zonas_francas_origen

    note: checking the existence of the estimates
    WARNING: Zonas_francas_destino has very large values, consider rescaling or recentering
    WARNING: Zonas_francas_origen has very large values, consider rescaling or recentering

    Number of regressors excluded to ensure that the estimates exist: 0
    Number of observations excluded: 0

    note: starting ppml estimation
    note: L_KL_TOTALES_deptos has noninteger values

    Iteration 1: deviance = 400.8257
    Iteration 2: deviance = 400.2885
    Iteration 3: deviance = 400.2885
    Iteration 4: deviance = 400.2885

    Number of parameters: 12
    Number of observations: 2648
    Pseudo log-likelihood: -6217.08
    R-squared: .66772627
    Option strict is: off
    ------------------------------------------------------------------------------------------
                             |               Robust
         L_KL_TOTALES_deptos |      Coef.  Std. Err.      z     P>|z|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
     L_PIBtotal2016pr_origen |   .0666277   .0027448    24.27   0.000     .0612479    .0720074
    L_PIBtotal2016pr_destino |   .0717089   .0025906    27.68   0.000     .0666313    .0767865
       L_Distancia_geodésica |  -.0539089   .0044969   -11.99   0.000    -.0627226   -.0450952
         L_remoteness_origen |  -.0295255   .0104363    -2.83   0.005    -.0499803   -.0090707
        L_remoteness_destino |   .0401392   .0090682     4.43   0.000     .0223658    .0579126
        frontera_pais_origen |   .0257149   .0067418     3.81   0.000     .0125013    .0389285
       Zonas_francas_destino |   .0025479   .0004794     5.31   0.000     .0016083    .0034875
     puerto_marítimo_destino |   .0307702   .0074366     4.14   0.000     .0161947    .0453457
      puerto_marítimo_origen |   .0699515   .0074102     9.44   0.000     .0554277    .0844752
            d_frontera_depto |   .0413737   .0064782     6.39   0.000     .0286767    .0540707
        Zonas_francas_origen |   .0055721   .0004222    13.20   0.000     .0047445    .0063997
                       _cons |   1.539765   .1010227    15.24   0.000     1.341764    1.737766
    ------------------------------------------------------------------------------------------


    RESET TEST



    . predict u, xb

    . gen u2 = u^2

    . ppml L_KL_TOTALES_deptos L_PIBtotal2016pr_origen L_PIBtotal2016pr_destino L_Distancia_geodésica L_remoteness_origen L_remoteness_destino frontera_pais_origen Zonas_francas_destino puert
    > o_marítimo_destino puerto_marítimo_origen d_frontera_depto Zonas_francas_origen u2

    note: checking the existence of the estimates
    WARNING: Zonas_francas_destino has very large values, consider rescaling or recentering
    WARNING: Zonas_francas_origen has very large values, consider rescaling or recentering

    Number of regressors excluded to ensure that the estimates exist: 0
    Number of observations excluded: 0

    note: starting ppml estimation
    note: L_KL_TOTALES_deptos has noninteger values

    Iteration 1: deviance = 392.748
    Iteration 2: deviance = 391.5719
    Iteration 3: deviance = 391.5718
    Iteration 4: deviance = 391.5718

    Number of parameters: 13
    Number of observations: 2648
    Pseudo log-likelihood: -6212.7217
    R-squared: .67972867
    Option strict is: off
    ------------------------------------------------------------------------------------------
                             |               Robust
         L_KL_TOTALES_deptos |      Coef.  Std. Err.      z     P>|z|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
     L_PIBtotal2016pr_origen |   .2947494   .0355748     8.29   0.000      .225024    .3644748
    L_PIBtotal2016pr_destino |   .3167968   .0382707     8.28   0.000     .2417876    .3918059
       L_Distancia_geodésica |  -.2412124   .0296801    -8.13   0.000    -.2993844   -.1830404
         L_remoteness_origen |  -.1296204   .0187523    -6.91   0.000    -.1663742   -.0928666
        L_remoteness_destino |   .1754634   .0233051     7.53   0.000     .1297862    .2211406
        frontera_pais_origen |   .1134619   .0151337     7.50   0.000     .0838005    .1431233
       Zonas_francas_destino |   .0113557   .0013858     8.19   0.000     .0086395    .0140718
     puerto_marítimo_destino |   .1392252   .0177827     7.83   0.000     .1043718    .1740786
      puerto_marítimo_origen |   .3095705   .0379202     8.16   0.000     .2352483    .3838927
            d_frontera_depto |   .1859924   .0219579     8.47   0.000     .1429558     .229029
        Zonas_francas_origen |    .025091   .0029905     8.39   0.000     .0192297    .0309524
                          u2 |  -.6285101   .0954756    -6.58   0.000    -.8156388   -.4413813
                       _cons |   2.184017    .128346    17.02   0.000     1.932464    2.435571
    ------------------------------------------------------------------------------------------


    -----------------------------------------------------------------
    RESULTS OF FGLS ESTIMATOR


    Cross-sectional time-series FGLS regression

    Coefficients: generalized least squares
    Panels: heteroskedastic
    Correlation: no autocorrelation

    Estimated covariances      =       662          Number of obs     =      2,648
    Estimated autocorrelations =         0          Number of groups  =        662
    Estimated coefficients     =        12          Time periods      =          4
                                                    Wald chi2(11)     =  158041.37
    Log likelihood             =  -3046.99          Prob > chi2       =     0.0000

    ------------------------------------------------------------------------------------------
         L_KL_TOTALES_deptos |      Coef.  Std. Err.      z     P>|z|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
     L_PIBtotal2016pr_origen |   .9365632   .0061036   153.45   0.000     .9246004    .9485259
    L_PIBtotal2016pr_destino |   1.144922    .006901   165.91   0.000     1.131396    1.158448
       L_Distancia_geodésica |  -.9195397   .0123681   -74.35   0.000    -.9437806   -.8952987
         L_remoteness_origen |  -.5537694   .0335578   -16.50   0.000    -.6195415   -.4879974
        L_remoteness_destino |   1.235415   .0236238    52.30   0.000     1.189113    1.281716
        frontera_pais_origen |   .4298494   .0238932    17.99   0.000     .3830196    .4766793
       Zonas_francas_destino |   .0321848   .0012136    26.52   0.000     .0298061    .0345635
     puerto_marítimo_destino |   .2281944   .0180926    12.61   0.000     .1927335    .2636553
      puerto_marítimo_origen |   1.168645   .0224582    52.04   0.000     1.124627    1.212662
            d_frontera_depto |   .2722579   .0166337    16.37   0.000     .2396565    .3048593
        Zonas_francas_origen |   .0863585   .0006899   125.18   0.000     .0850063    .0877106
                       _cons |  -4.642567   .3106725   -14.94   0.000    -5.251474   -4.033661
    ------------------------------------------------------------------------------------------



    ---------------------------------------------------------------------------------

    I'm worried that the RESET test rejects the model. Should I use the FGLS estimator instead? What do you think about the performance of that estimator?

    Thank you very much.

    Felipe
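
    The RESET steps in the post above (predict the linear index, square it, re-estimate with the square added) can be completed with a Wald test on the added term. A minimal sketch on made-up data, assuming official Stata's poisson with robust standard errors in place of ppml; all variable names are hypothetical. In the posted output the added regressor u2 has z = -6.58, so the test rejects there.

```stata
* Invented data with a correctly specified exponential conditional mean
clear
set seed 42
set obs 500
generate x1 = rnormal()
generate x2 = runiform()
generate y  = exp(0.5 + 0.8*x1 - 0.4*x2) * exp(rnormal(0, 0.5))

* Step 1: estimate the model by Poisson PML
poisson y x1 x2, vce(robust)

* Step 2: fitted linear index and its square
predict double fit, xb
generate double fit2 = fit^2

* Step 3: re-estimate with the squared index as an extra regressor
poisson y x1 x2 fit2, vce(robust)

* Step 4: RESET is the test that the extra regressor has a zero coefficient
test fit2 = 0
display "RESET p-value: " r(p)
```

    A small p-value suggests that the conditional mean is misspecified; by itself it does not say which alternative specification or estimator to use.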




  • JJ vdB
    replied
    Dear Joao and Dias,

    Thank you for your helpful and quick advice, I appreciate it.

    I estimated some small subsets of my dataset using the command suggested by Dias (poi2hdfe) and the ppml command. The ppml command was faster, so I am now running regressions with different subsets (the subsets start with only 1 industry, and the last subset contains all 14 industries for intermediate input trade). I did not run regressions using the ppml_panel_sg command because I believe that with cross-sectional data I only need to include exporter and importer fixed effects.

    Kind regards,


    Joost.



  • Said Jafar
    replied
    Dear Joost,

    I would just like to add a small thing to Mr Joao's excellent advice.

    ppml_panel_sg does not allow a model with only importer and exporter fixed effects. The smallest set of fixed effects it supports is importer-time and exporter-time. These will absorb every variable that varies only at the importer-time or exporter-time level, including output.

    But you can use another command written for the same purpose, poi2hdfe, as mentioned on the Log of Gravity webpage. To install it, type ssc install poi2hdfe.

    Best,
    Dias



  • Joao Santos Silva
    replied
    Dear Joost,

    ppml will struggle to deal with such a massive number of dummies; you will need a very fast processor and a lot of memory to be able to do it, assuming that you do not go beyond Stata's limits. For these cases I suggest you try ppml_panel_sg (available from SSC), which should be much faster and also checks for the existence of the estimates. I recommend that you start with a small data set to make sure you get the same results with both commands.

    About the problem with the OLS results, I prefer not to comment on that because the results are not reliable anyway.

    Best wishes,

    Joao



  • JJ vdB
    replied
    Dear Mr Santos Silva,

    I have two questions about the PPML estimator and Stata. I make use of an input-output table with trade data for the European Union at NUTSII level (249 regions), 14 different industries, and 5 different final demand categories (about 14 million observations). My aim is to estimate the border effect for the whole dataset, for intermediate input trade, and for final goods trade. My preferred estimation method is PPML. Furthermore, as a robustness check I run OLS and GPML. For all estimations I include origin and destination fixed effects. The commands for the whole dataset are:

    tab(Exporting), gen(Exporting_)
    tab(Importing), gen(Importing_)

    reg lnTrade lnGDP_EX lnGDP_IM lnDistance_Head Home Exporting_* Importing_*, robust cluster(Distance_Head)
    reg lnTrade lnGO_EX lnGO_IM lnDistance_Head Home Exporting_* Importing_*, robust cluster(Distance_Head)
    ppml Trade lnGDP_EX lnGDP_IM lnDistance_Head Home Exporting_* Importing_*, robust cluster(lnDistance_Head)
    ppml Trade lnGO_EX lnGO_IM lnDistance_Head Home Exporting_* Importing_*, robust cluster(lnDistance_Head)
    glm Trade lnGDP_EX lnGDP_IM lnDistance_Head Home Exporting_* Importing_*, family(poisson) link(log) robust cluster(Distance_Head)
    glm Trade lnGO_EX lnGO_IM lnDistance_Head Home Exporting_* Importing_*, family(poisson) link(log) robust cluster(Distance_Head)

    When I run OLS with fixed effects I get the results within about one to two hours. However, when running the PPML estimation Stata only says "note: checking the existence of the estimates", and after many hours I haven't received any further output. When running PPML with a smaller subset (about 59 thousand observations) I also do not receive any output. Estimating PPML without fixed effects does provide me with results (after a few hours).

    My questions are: [1] Do you have any experience with estimating PPML (including origin and destination fixed effects) for such a large database, and can I expect any results within a reasonable amount of time? Or should I search for a computer with more computing power? [2] In most cases I receive results in line with earlier studies and as expected. However, when I run OLS with fixed effects for a subset of the database (intermediate input trade), the mass variable for the exporting region is insignificant. The exporting mass variable is measured as gross sales for the whole region. When I run the same estimation but measure the exporting mass variable as gross sales per region and industry, the coefficient is significant and in line with my expectations. Do you have any suggestion for an explanation? I randomly checked several observations in Stata and they are all fine and the same as prior to importing.

    Kind Regards,


    Joost.



  • Joao Santos Silva
    replied
    Dear Flora,

    I think that you can use something like
    Code:
    predict yhat if e(sample)==1
    Best wishes,

    Joao
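
    A minimal sketch of the suggestion above on made-up data, again assuming official Stata's poisson rather than ppml, with hypothetical variable names. Here observations drop out of estimation because the outcome is missing, which is only a stand-in for the exclusions ppml makes when it checks for the existence of the estimates. The sketch shows that predict without a restriction fills in a value for every observation whose regressors are nonmissing, whether or not that observation was used in estimation, and that e(sample) both restricts the prediction and flags the excluded observations.

```stata
* Invented data: 200 observations, 20 of which cannot be used in estimation
clear
set seed 7
set obs 200
generate x = rnormal()
generate y = rpoisson(exp(0.3 + 0.5*x))
replace y = . in 1/20                    // stand-in for excluded observations

poisson y x, vce(robust)

* Flag the estimation sample: 1 if used, 0 if excluded
generate byte used = e(sample)

* Unrestricted prediction: applies the fitted model wherever x is available
predict double yhat_all
* Restricted prediction: only observations that were actually used
predict double yhat_used if e(sample)

* Identify the excluded observations
count if !used
list y x yhat_all yhat_used if !used in 1/5
```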



  • Flora Panna Biro
    replied
    Dear Joao,

    I wrote to you a few months ago regarding research I was doing at the time, and I am now developing the paper. In short, I am studying the effect of governance indicators on inward FDI in Latin America, for 12 years, 18 target countries, and 29 source countries. I used a gravity model estimated by PPML with pair and year fixed effects; my results are consistent and the model passes the RESET test.

    The estimation excluded 2460 observations out of 6217. (I am aware of the problem of the non-existence of the ML estimates with PPML.) I have a couple of questions regarding my results, just to make sure that I understand the process. My problem is that when I used the command "predict yhat", I got predicted values for every observation, even for the ones that had previously been excluded. Does the command "predict yhat" apply the fitted model to calculate the predicted values?
    Is there any convenient tool (command) to identify the observations that have been excluded?

    Thank you!

    Regards,
    Flóra



  • Muhammad Moiz
    replied
    Thank you.

    Regards,
    Muhammad Moiz



  • Joao Santos Silva
    replied
    PPML does not work with negative numbers, but neither does OLS in logs because you cannot take logs of negative numbers. Sorry, I cannot help much here.

    Best wishes,

    Joao



  • Muhammad Moiz
    replied
    Dear Joao,

    Thank you. I am trying to run PPML but am having a problem because some of the FDI figures are negative. How can I solve this problem?

    Regards,
    Muhammad Moiz



  • Joao Santos Silva
    replied
    Dear Muhammad,

    The problem with the OLS estimation of the log model is not the zeros but the fact that the non-linear transformation generally leads to an inconsistent estimator. So, I would still recommend PPML, and it is as easy to use as OLS.

    Best wishes,

    Joao
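
    The argument above can be written out in a few lines. This is a sketch for a standard constant-elasticity model; the symbols y_i, x_i, \beta and \eta_i are introduced here for illustration and do not appear in the post.

```latex
\begin{align*}
y_i &= \exp(x_i\beta)\,\eta_i, & E[\eta_i \mid x_i] &= 1, \\
\ln y_i &= x_i\beta + \ln\eta_i, & E[\ln\eta_i \mid x_i] &\neq \ln E[\eta_i \mid x_i] = 0 \quad \text{in general}.
\end{align*}
```

    OLS on the log equation is consistent for \beta only if E[\ln\eta_i | x_i] does not depend on x_i. Because the log is a non-linear transformation, that expectation depends on the higher moments of \eta_i, so heteroskedasticity that varies with x_i makes it a function of the regressors. PPML relies only on the first line, E[y_i | x_i] = \exp(x_i\beta).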

