I have a data set and want to identify overlapping observations. I define overlapping observations as those that refer to the same country and whose time ranges overlap. I have three variables to do the clustering: 'countryid', which assigns an ID to each country an observation can refer to (e.g., countryid = 1 for US data), and 'startyear' and 'endyear', which define the time span an observation refers to.
Now I want to create a new variable 'overlap' that takes the same value for all observations from the same country with overlapping time spans. E.g., if observation 1 is from the US for 1991-1998 and observation 2 is also from the US for 1996-2000, 'overlap' should have the same value for both observations. In contrast, if the second observation were from Italy, or its time span were 2001-2003, the two observations should get different values. In summary, I want a cluster variable that groups all observations from the same country with overlapping time spans.
Can anyone help me to implement this in Stata?
Many thanks!
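One possible way to do this, as a minimal sketch rather than a definitive solution: sort the spans within each country by start year, keep a running maximum of the end years seen so far, and start a new cluster whenever a span begins after that running maximum. The sketch assumes the end-year variable is named endyear, that years are inclusive (a span starting in 1998 overlaps one ending in 1998), and that chains of overlapping spans should share one cluster. The example data and the helper variables runmax and newcluster are made up for illustration.

```stata
* Hypothetical example data; variable names follow the question
* (assuming the end-year variable is called endyear).
clear
input countryid startyear endyear
1 1991 1998
1 1996 2000
1 2001 2003
2 1996 2000
end

* Within each country, order the spans by start year.
sort countryid startyear endyear

* Running maximum of endyear within country. replace works through the
* observations in order, so runmax[_n-1] is already the updated value.
by countryid: gen runmax = endyear
by countryid: replace runmax = max(runmax[_n-1], endyear) if _n > 1

* A new cluster starts at the first span of a country, or when a span
* starts after every earlier span in that country has ended.
by countryid: gen byte newcluster = (_n == 1) | (startyear > runmax[_n-1])

* The cumulative count of cluster starts gives the cluster identifier.
gen long overlap = sum(newcluster)

list countryid startyear endyear overlap, sepby(countryid)
```

With this example data, the two US spans 1991-1998 and 1996-2000 should share one value of overlap, while the US span 2001-2003 and the span for the second country each get their own. If adjacent years such as 2000 and 2001 should also count as overlapping, the comparison could be changed to startyear > runmax[_n-1] + 1.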