Hello,
Here is what I am trying to ask Stata to do:
Look at the variable "ID", and if the same value of this variable appears in any observation in "ID_initial", then look at "pre_creation_date" variable and drop the observation with higher "pre_creation_date".
For example:
ID is "500013" in the fourth observation, and it also appears in "ID_initial" in the second, third and fifth observations:
500002 500001 18129
500011 500013 17615
500012 500013 17615
500013 500015 19080 drop because 19080 > 17615
500015 500013 17615
Any advice?
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long ID double ID_initial float pre_creation_date
500002 500001 18129
500011 500013 17615
500012 500013 17615
500013 500015 19080
500015 500013 17615
500025 500027 18568
500028 500027 18568
500031 500033 17974
500033 500031 18385
500035 500036 18297
500036 500035 18228
500038 500036 18297
500062 500063 17615
500071 500074 18352
500076 500078 18172
500078 500076 18850
500079 500078 18172
500083 500084 17630
500084 500083 18071
500086 500087 18939
500087 500088 17632
500089 500090 18196
500091 500090 18196
500095 500094 18928
500096 500094 18928
500098 500099 18053
500099 500100 17755
500103 500102 17647
500112 500114 18161
500114 500112 18518
500116 500114 18161
500117 500118 17668
500120 500121 18246
500121 500124 18071
500128 500124 18071
500131 507599 18175
500145 500121 18246
500171 500172 18246
500173 500172 18246
500175 500176 18556
500203 500205 18646
500204 500205 18646
500217 500205 18646
500225 500227 18277
500228 500227 18277
500229 500227 18277
500234 500235 18109
500236 500235 18109
500237 500235 18109
500581 500578 19023
500582 500578 19023
500583 500586 18219
500586 500587 18219
500587 500586 18219
500589 500590 18959
500593 500594 18108
500594 500593 17884
500625 500624 17974
500628 500674 18032
500673 500674 18032
500675 500674 18032
500681 500578 19023
500683 500685 19060
500684 500685 19060
500685 500687 18870
500687 500685 19060
500688 500687 18870
500689 500687 18870
500701 500703 18382
500702 500703 18382
500704 500703 18382
500705 500703 18382
500720 500721 18876
500722 500721 18876
500723 500721 18876
500732 527002 18253
500733 527002 18253
500773 500772 18246
500777 500779 18165
500778 500779 18165
500780 500779 18165
500781 500779 18165
500809 500811 18927
500810 500811 18927
500812 500811 18927
500814 500815 18704
500816 500815 18704
500826 500875 17821
500856 500858 18234
500857 500858 18234
500859 500858 18234
500860 500861 19033
500862 500861 19033
500863 500861 19033
500870 500872 18382
500871 500872 18382
500873 500872 18382
500876 500875 17821
500877 500875 17821
500882 500885 18102
end
format %td pre_creation_date
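To make the rule concrete, here is a rough, untested sketch of one way it might be expressed, offered only as a starting point and not as a known solution. It assumes the example data above are in memory, and it covers only one reading of the rule: an observation is dropped when its "ID" appears in "ID_initial" elsewhere and its own "pre_creation_date" is later than the earliest date among those referencing observations. The names `min_date` and `ref` are made up for illustration.

```stata
* Sketch only: one interpretation of the rule, not a verified solution.
* min_date and the tempfile ref are hypothetical names.

* Build a lookup: earliest pre_creation_date for each ID_initial value
preserve
keep ID_initial pre_creation_date
collapse (min) min_date = pre_creation_date, by(ID_initial)
rename ID_initial ID
tempfile ref
save `ref'
restore

* Attach that earliest date to any observation whose ID is referenced
merge m:1 ID using `ref', keep(master match) nogenerate

* Drop the observation when its own date is later than the referencing date
drop if !missing(min_date) & pre_creation_date > min_date
drop min_date
```

In the example, the observation with ID 500013 (date 19080) would be matched to a `min_date` of 17615 and dropped. The opposite case, where the referencing observations carry the later date, is not handled by this sketch.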