       Variable |   Missing       Total   Percent Missing
----------------+----------------------------------------
       exchange |   243,242     518,480             46.91
       colonial |       838     518,480              0.16
   rivalry_code |   113,166     518,480             21.83
       alliance |    38,283     518,480              7.38
      diplomacy |   445,518     518,480             85.93
 no. of borders |       718     518,480              0.14
     contiguity |   262,538     518,480             50.64
       religion |   408,850     518,480             78.86
       conflict |     3,220     518,480              0.62
      signatory |     1,940     518,480              0.37
       election |    89,573     518,480             17.28
         polity |    93,543     518,480             18.04
            GDP |    42,506     518,480              8.20
  dyad_conflict |       696     518,480              0.13
    asylum_rate |   483,615     518,480             93.28 !!!
----------------+----------------------------------------
The unit of analysis is the directed-dyad year (Country A-Country B Year and Country B-Country A Year).
The dataset covers every dyad of countries for each year from 2000 to 2013.
However, some of the independent variables stop early: exchange ends in 2009 and rivalry in 2010. Contiguity (whether the two countries in the dyad share a border) ends in 2006, so every later year is missing. That said, apart from a few nations, most borders did not change during this period, so there may be a way around it.
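One possible way around the contiguity cut-off is to carry each dyad's last observed value forward into the later years. The snippet below is only a minimal sketch of that idea in plain Python: the field names (`a`, `b`, `year`, `contiguity`), the country codes, and the values are made up for illustration, and it assumes borders really were stable after 2006, which would need checking for the few nations where they were not.

```python
def carry_forward(rows, var, last_year):
    """Fill missing `var` after `last_year` with the dyad's last observed value.

    rows: list of dicts with keys 'a', 'b', 'year' and `var` (None = missing).
    Returns new dicts; the input rows are left untouched.
    """
    last_seen = {}
    filled = []
    for row in sorted(rows, key=lambda r: (r["a"], r["b"], r["year"])):
        row = dict(row)  # copy so the original data stay as they were
        key = (row["a"], row["b"])
        if row[var] is not None:
            last_seen[key] = row[var]
        elif row["year"] > last_year and key in last_seen:
            row[var] = last_seen[key]
            row[var + "_imputed"] = True  # flag filled values for robustness checks
        filled.append(row)
    return filled


# Hypothetical directed-dyad rows; contiguity is observed through 2006 only.
rows = [
    {"a": "FRA", "b": "DEU", "year": 2005, "contiguity": 1},
    {"a": "FRA", "b": "DEU", "year": 2006, "contiguity": 1},
    {"a": "FRA", "b": "DEU", "year": 2007, "contiguity": None},
    {"a": "FRA", "b": "DEU", "year": 2008, "contiguity": None},
]
filled = carry_forward(rows, "contiguity", last_year=2006)
```

Keeping the `_imputed` flag makes it easy to rerun the models without the filled years and see whether the results change.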
The diplomacy and religion variables only have values every five years, and I'm not sure interpolation will work, except maybe for religion. Diplomacy is a dummy for whether a diplomat from a given country visited the other country; religion is recorded as percentages, and the dataset also has population numbers. Apart from the DV, not many variables have values all the way through 2013.
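For a continuous variable like the religion percentages, linear interpolation between the five-year observations might look like the sketch below. The numbers are invented, the function name is hypothetical, and the approach assumes the share changes smoothly between observations. It deliberately does not extrapolate past the last observed year, and it would not make sense for the diplomacy dummy.

```python
def interpolate_linear(observed):
    """Linearly interpolate a {year: value} series between observed years.

    Returns a value for every year from the first to the last observation.
    Years after the last observation are NOT filled (no extrapolation).
    """
    if not observed:
        return {}
    years = sorted(observed)
    result = {}
    for y0, y1 in zip(years, years[1:]):
        v0, v1 = observed[y0], observed[y1]
        for y in range(y0, y1):
            # Weight by how far year y sits between the two observations.
            result[y] = v0 + (v1 - v0) * (y - y0) / (y1 - y0)
    result[years[-1]] = observed[years[-1]]
    return result


# Made-up religion percentages for one country, observed every five years.
religion_pct = {2000: 40.0, 2005: 42.5, 2010: 45.0}
annual = interpolate_linear(religion_pct)
```

With these inputs, 2011 to 2013 stay missing, so the end of the panel would still need a separate decision (for example, carrying the 2010 value forward).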
As if that were not bad enough, the asylum rate is the DV.
The DV concerns the granting of asylum to migrants from a sending state; asylum is granted or denied in the host state. This explains the large percentage of missing values, since you do not have individuals from every single country applying for asylum at first instance in a given host state in a given year. Even so, the number of missing values is astronomically large. Because it is a rate, I cannot fill in the missing values with zeros (something other researchers have done with other migration variables when the unit of analysis is the directed-dyad year).
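To make the zero-filling problem concrete, here is a small sketch. It assumes the rate is computed as grants divided by first-instance decisions, which is my guess at the definition rather than something the data source states, and all counts and labels are hypothetical. A dyad-year with no decisions gives 0/0, which is undefined and is not the same thing as a rate of zero.

```python
def asylum_rate(granted, decided):
    """Assumed definition: grants / first-instance decisions.

    Returns None when there were no decisions, because 0/0 is undefined.
    """
    if not decided:
        return None
    return granted / decided


# Hypothetical dyad-years (origin -> host); counts are invented.
dyad_years = [
    {"origin": "A", "host": "B", "year": 2010, "granted": 30, "decided": 120},
    {"origin": "A", "host": "C", "year": 2010, "granted": 0, "decided": 15},
    {"origin": "A", "host": "D", "year": 2010, "granted": 0, "decided": 0},
]
for row in dyad_years:
    row["rate"] = asylum_rate(row["granted"], row["decided"])

# Dyad-years where a rate is actually defined.
defined = [r for r in dyad_years if r["rate"] is not None]
```

The A-C row is a genuine zero (15 decisions, none positive), while the A-D row has nothing to compute a rate from, so coding it as 0 would mix the two cases.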
I have seen researchers use asylum data, but only for particular regions or a handful of cases. Even more unsettling, this is for my thesis. Do I have to scrap this dependent variable entirely? I could *maybe* fill in the gaps by adding asylum applications that were appealed, but I would much rather keep it "clean" and include only first-instance applications. Right now, though, it looks like it has to be done away with...