  • Ainhoa Oses
    replied
    It is a bilateral model. Country fixed effects are dummies for all the origin countries that migrate to the country I am analysing.

    The problem is that as soon as I add population variables, their coefficients are really high and make the model explosive. This happens mainly for the demographic variables that refer to the destination country I'm analysing. I suspect there might be collinearity issues. I need some insight into this since it's really important for my research... Thanks again.

    Best regards,

    Ainhoa
    Last edited by Ainhoa Oses; 05 Feb 2019, 09:52.

  • Joao Santos Silva
    replied
    Sorry, what exactly do you mean by country FE? Are these origin and destination FE or pair FE?

  • Ainhoa Oses
    replied
    Thanks a million for that, Joao. I've dropped the lag. Just an additional point related to the model becoming explosive. My X variables include country FE, GDP per capita of origin and destination, and population structures by age group at origin and destination. When I exclude the population structures, the model looks very reasonable, including the predictions. However, when I include them, the model becomes really unstable. Do you think dropping these population structures could be justified? In a way, GDP per capita is partly driven by population assumptions.

    Best wishes,

    Ainhoa

  • Joao Santos Silva
    replied
    Dear Ainhoa,

    If I understand it correctly, you are explaining the stock of migrants by the stock in the previous period. Because the stock of migrants is likely to vary slowly, you are essentially using something to explain itself. Also, I do not know what kind of fixed effects you are using, but these are likely to be very collinear with the lagged stock, and this may make the model very unstable.

    Best wishes,

    Joao

  • Ainhoa Oses
    replied
    Dear Joao,

    Could you give me some more insight on why you think the model is strange? If you could give me some advice on a specification that would make more sense, I would be really grateful.

    Cheers,

    Ainhoa

  • Joao Santos Silva
    replied
    Dear Ainhoa,

    I am afraid I have no suggestions, but I still think that your model is very strange and so I am not surprised by the strange results.

    Best wishes,

    Joao

  • Ainhoa Oses
    replied
    Hi Joao,

    Many thanks for all your help. Using the model I stated above (stocks of migrants as a function of their lag, plus other demographic/economic variables that I also included, and fixed effects), the predictions get really explosive in general. A small addition or removal of variables in the model results in nonsensical (i.e., unrealistically high) predictions. Would you have any rationale for this? I noticed that both the constant and the country dummies get very high coefficients compared to those for time-varying variables. I know the question is rather general, but there might be something obvious that I'm getting wrong.

    Thanks again,

    Ainhoa

  • Joao Santos Silva
    replied
    Dear Ainhoa,

    If you want to include the lag, it makes sense to log it. My question is whether it makes sense to include the lag; I guess the answer depends on the purpose of the model.

    Best wishes,

    Joao
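
A minimal Stata sketch of how the logged lag mentioned above could be built. It runs on simulated data, and the panel identifiers origin and year are hypothetical names that do not appear in the thread. It uses the built-in poisson command with vce(robust), which targets the same Poisson pseudo-maximum-likelihood estimator as ppml without requiring the SSC package.

```stata
* Simulated panel: 5 origin countries observed over 10 periods (assumed layout).
clear
set seed 2019
set obs 50
gen origin = ceil(_n/10)
bysort origin: gen year = 2000 + _n
gen Mig = exp(5 + 0.3*origin + 0.05*(year - 2000) + 0.1*rnormal())

* Declare the panel so the lag operator L. respects origin and time.
xtset origin year

* Log of the lagged stock; missing in each origin's first period.
gen ln_Mig_lag = ln(L.Mig)

* Dependent variable in levels, lag in logs, origin dummies as fixed effects.
poisson Mig ln_Mig_lag i.origin, vce(robust)
```

If Mig contains zeros, ln(L.Mig) is missing for the following period, so those observations drop out of the estimation sample.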

  • Ainhoa Oses
    replied
    Hi Joao,

    Just to confirm this specification is correct through ppml:

    Code:
    ppml Mig LOGMig(-1) DUM_COUNTRY*
    Where Mig is the stock of Migrants from different origins at a given country for different periods of time. This depends on their lag and FE are included (DUM_COUNTRY*).
    My only question is whether the lag of the dependent variable (LOGMig(-1)) should indeed be in LOGs or not.

    Many thanks,

    Ainhoa

  • Isabel Cour
    replied
    Dear Joao,

    First of all, thanks for your help. Second, I apologize if my message was not clear enough.
    Regarding your second comment, exporter and importer GDP were already rescaled; I did not understand why the WARNING message appears when the coefficients are already small. I was using the log of GDP in millions of dollars.
    I understand why these variables are dropped; perhaps in this case it happens because I am using a small database with few countries, but the model specification is correct for this research question. You made a good point, and thank you very much again for the reminder. I will try to add other control variables.

  • Joao Santos Silva
    replied
    Dear Isabel,

    I am not sure I have understood all your questions, but here is my attempt to help:

    1 - The fact that you do not have zeros does not make it OK to use OLS in logs; indeed, the zeros are just a very minor problem. Therefore, I expect the OLS and PPML results to be very different and, of course, the PPML results are much more reliable.

    2 - You do not have to rescale the variables, but you can do it. For example, instead of using log of GDP in thousands of dollars, you can use log of GDP in millions of dollars. If the estimator converges, there is no need to worry about this.

    3 - If you only have one importer, distance and exporter GDP will be collinear with the exporter fixed effects and need to be dropped; the same happens if you use OLS. There may be other variables being dropped for the same reason, again just like in OLS. You need to think carefully about what you are doing because you risk interpreting coefficients that are meaningless.

    Best wishes,

    Joao
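
The collinearity described in point 3 can be reproduced with a minimal Stata sketch on simulated data. The layout (6 exporters, 1 importer, 10 years) and all variable names are assumptions for illustration. Distance is fixed within each exporter because there is only one importer, and importer GDP is assumed to vary only by year, which makes it collinear with year dummies. The built-in poisson command is used in place of ppml.

```stata
* Simulated data: 6 exporters trading with a single importer over 10 years.
clear
set seed 12345
set obs 60
gen exporter = ceil(_n/10)
bysort exporter: gen year = 2000 + _n

* With one importer, distance takes one value per exporter.
gen lndist = 9 + 0.1*exporter
* Importer GDP changes over time only, not across exporters.
gen lngdp_importer = 28 + 0.03*(year - 2000)
gen imports = exp(10 + 0.2*exporter + 0.05*(year - 2000) + 0.2*rnormal())

* The exporter dummies reproduce distance exactly: R-squared should be 1 up to rounding.
regress lndist i.exporter

* Stata omits one variable from each collinear set and reports it in a note.
poisson imports lndist lngdp_importer i.exporter i.year, vce(robust)
```

Which variable Stata omits from each collinear set depends on the order of the regressors, so a coefficient that survives is not separately identified from the dummies it overlaps with.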

  • Isabel Cour
    replied
    Dear Joao,

    I am running a pooled OLS, and I want to check robustness using the PPML method. Total import and export flows do not contain any zeros, but trade at the sector level does. When I use total import flows, I get a WARNING to rescale lngdp. There are two GDP regressors: lngdp_exporter (6 countries) and lngdp_importer (1 country). How can I rescale such a small coefficient?
    I rescaled and then ran PPML again; the results show that distance, one exporter dummy, one importer dummy, gdp_importer, and two year dummies are dropped. Although it is a control variable, gdp_importer is part of the research question, as is distance. Why are they dropped?
    Here is a sample of the data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double lnimpo float(lnintus lngdp1 lngdp2 lndist) byte(exporter_1 exporter_2 exporter_3 exporter_4 exporter_5 exporter_6 importer_1)
     20.65069580078125  .5743148  26.37296 27.817717 9.857967 1 0 0 0 0 0 1
    20.970928192138672  .9706464  26.31685 27.917883 9.857967 1 0 0 0 0 0 1
    20.937944412231445  1.525122  25.34863 28.010767 9.857967 1 0 0 0 0 0 1
     21.72722816467285 1.8245493 25.587696  28.13175 9.857967 1 0 0 0 0 0 1
    21.903419494628906 1.9878744 25.934366  28.29461 9.857967 1 0 0 0 0 0 1
    end
    Any suggestions are very welcome. Thanks in advance. Kind regards

  • Joao Santos Silva
    replied
    Dear Felipe,

    First of all, forget the FGLS estimation because that is simply inadequate.

    About your model, I think you should use clustered standard errors. Also, your sample is rather small, but maybe you could try to include the usual "fixed effects".

    Best wishes,

    Joao
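
A minimal Stata sketch of the two suggestions above, clustered standard errors and the usual fixed effects, on simulated data. Reading "the usual fixed effects" as origin and destination dummies is an assumption, as are clustering by pair and every variable name used here (pair_id, origin, destination, period, ln_dist, trade). The built-in poisson command stands in for ppml.

```stata
* Simulated panel: 100 origin-destination pairs observed over 4 periods.
clear
set seed 42
set obs 400
gen pair_id = ceil(_n/4)
bysort pair_id: gen period = _n
gen origin = ceil(pair_id/10)
gen destination = mod(pair_id - 1, 10) + 1

* Pair-level distance and a trade flow in levels (not logs).
gen ln_dist = ln(100*(1 + abs(origin - destination)))
gen trade = exp(8 - 0.8*ln_dist + 0.1*origin - 0.1*destination + 0.3*rnormal())

* Origin, destination and period dummies; standard errors clustered by pair.
poisson trade ln_dist i.origin i.destination i.period, vce(cluster pair_id)

* With -ppml- installed (ssc install ppml), its cluster(pair_id) option plays the same role.
```

Any regressor that is constant within an origin or within a destination would be absorbed by the corresponding dummies in this setup.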

  • JFelipe PinedaG
    replied
    Hi Joao, I wonder if you could help with some doubts I have about an intra-regional gravity model.

    I have a panel with 4 periods, and my dependent variable is total kilograms traded.

    These are the Stata commands and results.

    ppml L_KL_TOTALES_deptos L_PIBtotal2016pr_origen L_PIBtotal2016pr_destino L_Distancia_geodésica L_remoteness_origen L_remoteness_destino frontera_pais_origen Zonas_francas_destino puerto_marítimo_destino puerto_marítimo_origen d_frontera_depto Zonas_francas_origen

    note: checking the existence of the estimates
    WARNING: Zonas_francas_destino has very large values, consider rescaling or recentering
    WARNING: Zonas_francas_origen has very large values, consider rescaling or recentering

    Number of regressors excluded to ensure that the estimates exist: 0
    Number of observations excluded: 0

    note: starting ppml estimation
    note: L_KL_TOTALES_deptos has noninteger values

    Iteration 1: deviance = 400.8257
    Iteration 2: deviance = 400.2885
    Iteration 3: deviance = 400.2885
    Iteration 4: deviance = 400.2885

    Number of parameters: 12
    Number of observations: 2648
    Pseudo log-likelihood: -6217.08
    R-squared: .66772627
    Option strict is: off
    ------------------------------------------------------------------------------------------
                             |                Robust
         L_KL_TOTALES_deptos |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
     L_PIBtotal2016pr_origen |   .0666277   .0027448    24.27   0.000     .0612479    .0720074
    L_PIBtotal2016pr_destino |   .0717089   .0025906    27.68   0.000     .0666313    .0767865
       L_Distancia_geodésica |  -.0539089   .0044969   -11.99   0.000    -.0627226   -.0450952
         L_remoteness_origen |  -.0295255   .0104363    -2.83   0.005    -.0499803   -.0090707
        L_remoteness_destino |   .0401392   .0090682     4.43   0.000     .0223658    .0579126
        frontera_pais_origen |   .0257149   .0067418     3.81   0.000     .0125013    .0389285
       Zonas_francas_destino |   .0025479   .0004794     5.31   0.000     .0016083    .0034875
     puerto_marítimo_destino |   .0307702   .0074366     4.14   0.000     .0161947    .0453457
      puerto_marítimo_origen |   .0699515   .0074102     9.44   0.000     .0554277    .0844752
            d_frontera_depto |   .0413737   .0064782     6.39   0.000     .0286767    .0540707
        Zonas_francas_origen |   .0055721   .0004222    13.20   0.000     .0047445    .0063997
                       _cons |   1.539765   .1010227    15.24   0.000     1.341764    1.737766
    ------------------------------------------------------------------------------------------


    RESET TEST



    . predict u, xb

    . gen u2 = u^2

    . ppml L_KL_TOTALES_deptos L_PIBtotal2016pr_origen L_PIBtotal2016pr_destino L_Distancia_geodésica L_remoteness_origen L_remoteness_destino frontera_pais_origen Zonas_francas_destino puerto_marítimo_destino puerto_marítimo_origen d_frontera_depto Zonas_francas_origen u2

    note: checking the existence of the estimates
    WARNING: Zonas_francas_destino has very large values, consider rescaling or recentering
    WARNING: Zonas_francas_origen has very large values, consider rescaling or recentering

    Number of regressors excluded to ensure that the estimates exist: 0
    Number of observations excluded: 0

    note: starting ppml estimation
    note: L_KL_TOTALES_deptos has noninteger values

    Iteration 1: deviance = 392.748
    Iteration 2: deviance = 391.5719
    Iteration 3: deviance = 391.5718
    Iteration 4: deviance = 391.5718

    Number of parameters: 13
    Number of observations: 2648
    Pseudo log-likelihood: -6212.7217
    R-squared: .67972867
    Option strict is: off
    ------------------------------------------------------------------------------------------
                             |                Robust
         L_KL_TOTALES_deptos |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
     L_PIBtotal2016pr_origen |   .2947494   .0355748     8.29   0.000      .225024    .3644748
    L_PIBtotal2016pr_destino |   .3167968   .0382707     8.28   0.000     .2417876    .3918059
       L_Distancia_geodésica |  -.2412124   .0296801    -8.13   0.000    -.2993844   -.1830404
         L_remoteness_origen |  -.1296204   .0187523    -6.91   0.000    -.1663742   -.0928666
        L_remoteness_destino |   .1754634   .0233051     7.53   0.000     .1297862    .2211406
        frontera_pais_origen |   .1134619   .0151337     7.50   0.000     .0838005    .1431233
       Zonas_francas_destino |   .0113557   .0013858     8.19   0.000     .0086395    .0140718
     puerto_marítimo_destino |   .1392252   .0177827     7.83   0.000     .1043718    .1740786
      puerto_marítimo_origen |   .3095705   .0379202     8.16   0.000     .2352483    .3838927
            d_frontera_depto |   .1859924   .0219579     8.47   0.000     .1429558     .229029
        Zonas_francas_origen |    .025091   .0029905     8.39   0.000     .0192297    .0309524
                          u2 |  -.6285101   .0954756    -6.58   0.000    -.8156388   -.4413813
                       _cons |   2.184017    .128346    17.02   0.000     1.932464    2.435571
    ------------------------------------------------------------------------------------------


    -----------------------------------------------------------------
    RESULTS OF FGLS ESTIMATOR


    Cross-sectional time-series FGLS regression

    Coefficients: generalized least squares
    Panels: heteroskedastic
    Correlation: no autocorrelation

    Estimated covariances      =       662          Number of obs      =     2,648
    Estimated autocorrelations =         0          Number of groups   =       662
    Estimated coefficients     =        12          Time periods       =         4
                                                    Wald chi2(11)      = 158041.37
    Log likelihood             =  -3046.99          Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------------------
         L_KL_TOTALES_deptos |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
    -------------------------+----------------------------------------------------------------
     L_PIBtotal2016pr_origen |   .9365632   .0061036   153.45   0.000     .9246004    .9485259
    L_PIBtotal2016pr_destino |   1.144922    .006901   165.91   0.000     1.131396    1.158448
       L_Distancia_geodésica |  -.9195397   .0123681   -74.35   0.000    -.9437806   -.8952987
         L_remoteness_origen |  -.5537694   .0335578   -16.50   0.000    -.6195415   -.4879974
        L_remoteness_destino |   1.235415   .0236238    52.30   0.000     1.189113    1.281716
        frontera_pais_origen |   .4298494   .0238932    17.99   0.000     .3830196    .4766793
       Zonas_francas_destino |   .0321848   .0012136    26.52   0.000     .0298061    .0345635
     puerto_marítimo_destino |   .2281944   .0180926    12.61   0.000     .1927335    .2636553
      puerto_marítimo_origen |   1.168645   .0224582    52.04   0.000     1.124627    1.212662
            d_frontera_depto |   .2722579   .0166337    16.37   0.000     .2396565    .3048593
        Zonas_francas_origen |   .0863585   .0006899   125.18   0.000     .0850063    .0877106
                       _cons |  -4.642567   .3106725   -14.94   0.000    -5.251474   -4.033661
    ------------------------------------------------------------------------------------------



    ---------------------------------------------------------------------------------

    I'm worried that the RESET test is being rejected. Should I use the FGLS estimator instead? What do you think about the performance of that estimator?

    Thank you very much.

    Felipe


  • JJ vdB
    replied
    Dear Joao and Dias,

    Thank you for your helpful and quick advice, I appreciate it.

    I estimated some small subsets of my dataset using the suggestion by Dias (poi2hdfe) and the ppml command. The ppml command was faster, so I am now running regressions with different subsets (the subsets start with only 1 industry, and the last subset contains all 14 industries for intermediate input trade). I did not run regressions using the ppml_panel_sg command because I believe that with cross-sectional data I only need to include exporter and importer fixed effects.

    Kind regards,


    Joost.
