Hi all,
I am a master's student trying to estimate a gravity model, but I think I am running into problems. I have a dataset with 1,663,200 observations on exports and imports, broken down by 120 products, for 22 regions in Italy trading with 35 countries over the period 1995-2012. I want to include origin, product, time and destination fixed effects.
For this I have created dummy variables for each of these using:
xi i.region*i.year, prefix(O*)
xi i.product*i.year, prefix(P*)
xi i.country*i.year, prefix(C*)
At the end of this process I have nearly 4,000 variables. Now I am trying to run ppml, first with separate fixed effects for region, product, year and country, and then with the combined fixed effects.
ppml <dep. var.> <14 indep. variables> <dummy variables for region, product, country and year>, cluster(<indicator of each region-country-product triple>)
The problem is that I started the first step yesterday and Stata is still iterating (currently at iteration 245). Is this normal for a dataset of this size, or is there a problem with how I have written the command or specified the model?
Thanks everybody
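
One possible way to shorten the run time, offered as a sketch rather than a definitive fix: instead of building thousands of dummies with xi, the fixed effects could be absorbed with the community-contributed ppmlhdfe command (available from SSC, together with its dependencies ftools and reghdfe). All variable names below (trade, x1-x14, region, product, country, year, link) are hypothetical placeholders, and x1-x14 assumes the 14 regressors sit next to each other in the dataset.

```stata
* One-time setup: community-contributed commands from SSC
ssc install ftools
ssc install reghdfe
ssc install ppmlhdfe

* Hypothetical variable names: trade (flow), x1-x14 (regressors),
* region, product, country, year (identifiers)
egen long link = group(region country product)   // one id per triple

* Step 1: separate region, product, country and year fixed effects
ppmlhdfe trade x1-x14, absorb(region product country year) vce(cluster link)

* Step 2: region-year, product-year and country-year fixed effects
ppmlhdfe trade x1-x14, absorb(region#year product#year country#year) vce(cluster link)
```

In the second step, any regressor that varies only by region-year, product-year or country-year is collinear with the absorbed effects and will be dropped.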

Gravity model with ppml command

Not ideal, but possibly adequate because I imagine that the rate is always very far from 1.

Thanks Joao. Just a fundamental question: can PPML be used when the dependent variable is a rate? I'm using it for a gravity model where the dependent variable is the flow of migrants divided by the lagged population of origin.
Thanks again.
Ainhoa
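
For concreteness, the rate specification described here might be set up roughly as follows. This is only a sketch: the names origin, year, flow, pop_origin, x1 and x2 are hypothetical, and it assumes a panel of origin countries observed yearly for a single destination.

```stata
* Hypothetical names: origin (country id), year, flow, pop_origin, x1 x2
xtset origin year

* Migrant flow divided by the lagged population of the origin country
gen double mig_rate = flow / L.pop_origin   // missing in each origin's first year

* PPML with the rate as dependent variable, clustering by origin
ppml mig_rate x1 x2, cluster(origin)
```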

If I understand you correctly, the country dummies are highly collinear with the country characteristics and with the lagged migration, so this is likely to result in a very unstable model. If your country variables do not vary over time they are perfectly collinear with the fixed effects and should be excluded; make sure you are not falling into that trap.
Best wishes,
Joao
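
A quick way to check for the trap described above is to see whether a country-level regressor changes at all within each country. The sketch below assumes hypothetical names: origin for the country identifier, year, and gdp_o for the regressor being checked.

```stata
* Hypothetical names: origin (country id), year, gdp_o (country-level regressor)
bysort origin (year): gen byte differs = gdp_o != gdp_o[1]
by origin: egen byte varies = max(differs)

* varies == 0 means the variable is constant within that origin,
* so it is perfectly collinear with the origin dummies
tabulate varies
drop differs
```

If varies is 0 for every country, the variable carries no information beyond the fixed effects and should be left out.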

My fear is that these fixed effects have very large coefficients compared to those of the time-varying variables in my model. This might be contributing to the explosive predictions. Would you have any suggestions on how to tackle this?
I really appreciate your help!

It is a bilateral model. Country fixed effects are dummies for all the origin countries that migrate to the country I am analysing.
The difficulty is that as soon as I add population variables, their coefficients become very large and make the model explosive. This happens in particular for the demographic variables that refer to the destination country I'm analysing. I suspect there might be collinearity issues. I need some insight on this since it's really important for my research. Thanks again.
Best regards,
Ainhoa
Last edited by OA Stata; 05 Feb 2019, 09:52.

Sorry, what exactly do you mean by country FE? Are these origin and destination FE or pair FE?

Thanks a million for that, Joao. I've dropped the lag. Just an additional point related to the model becoming explosive. My X variables include country FE, GDP per capita of origin and destination, and population structures by age group at origin and destination. When I exclude the population structure variables, the model looks very reasonable, including the predictions. However, when I include them, the model becomes really unstable. Do you think dropping these population structures could be justified? In a way, GDP per capita already partly reflects population assumptions.
Best wishes,
Ainhoa

Dear Ainhoa,
If I understand it correctly, you are explaining the stock of migrants by the stock in the previous period. Because the stock of migrants is likely to vary slowly, you are essentially using something to explain itself. Also, I do not know what kind of fixed effects you are using, but these are likely to be very collinear with the lagged stock, and this may make the model very unstable.
Best wishes,
Joao

Dear Joao,
Could you give me some more insight on why you think the model is strange? If you could give me some advice on a specification that would make more sense, I would be really grateful.
Cheers,
Ainhoa

Dear Ainhoa,
I am afraid I have no suggestions, but I still think that your model is very strange and so I am not surprised by the strange results.
Best wishes,
Joao

Hi Joao,
Many thanks for all your help. Using the model I stated above (stocks of migrants as a function of their lag, plus other demographic/economic variables that I also included, and fixed effects), the predictions become explosive in general. Adding or removing even a few variables results in nonsensical (i.e., unrealistically high) predictions. Would you have any explanation for this? I noticed that both the constant and the country dummies get very large coefficients compared to those of the time-varying variables. I know the question is rather general, but there might be something obvious that I'm getting wrong.
Thanks again,
Ainhoa

Dear Ainhoa,
If you want to include the lag, it makes sense to log it. My question is whether it makes sense to include the lag; I guess the answer depends on the purpose of the model.
Best wishes,
Joao

Hi Joao,
Just to confirm this specification is correct through ppml:
Code:
ppml Mig LOGMig(1) DUM_COUNTRY*
My only question is whether the lag of the dependent variable (LOGMig(1)) should indeed be in LOGs or not.
Many thanks,
Ainhoa
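
If LOGMig(1) is meant as the first lag of log(Mig), one way to build it explicitly in Stata is sketched below. The identifiers origin and year are hypothetical (the post does not name them), and logmig_lag is an illustrative variable name; Mig and DUM_COUNTRY* are taken from the command above.

```stata
* Hypothetical identifiers: origin (country id) and year
xtset origin year

* Log of last period's stock; missing where the lagged stock is 0 or unavailable
gen double logmig_lag = ln(L.Mig)

ppml Mig logmig_lag DUM_COUNTRY*
```

Observations with a zero lagged stock drop out of the estimation because ln(0) is missing, which is worth keeping in mind when reading the results.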