I have data in the form shown in the example below.
I would like to create every possible personid-charityid pair in the data. The logic I envision establishes a start date, the minimum date for each personid, and an end date, the maximum date for each personid. Then, if a charityid opens within this window (start_date <= charity_open_date & end_date >= charity_open_date) and the person did not give to this charityid (there is no observation with gave == 1 for that pair), an entry would be created filling in personid, charityid, and gave == 0. All other variables would be missing (could be string or float missing).
In this example, personid == 1 donated to all charities, so there is no change.
personid == 2 could have donated to charityid == 3 and did not, so one observation would be added.
personid == 3 leaves the data before charityid == 3 opens, so there is no change to their records.
Code:
clear
input personid date charityid gave charity_open_date
1 18265 1 1 18263
1 18267 2 1 18263
1 18273 3 1 18272
2 18263 1 1 18263
2 18273 2 1 18263
3 18264 1 1 18263
3 18271 2 1 18263
end
The one observation added, for personid == 2 and charityid == 3, would be:
Code:
2 . 3 0 .
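To make the intended logic concrete, here is a minimal sketch of one way it might be written, assuming the example data above is in memory and the code is run as a single do-file (so the temporary files stay in scope). The names start_date, end_date, charities, and pairs are my own and are not in the data. It is only an illustration of the rule described above, not necessarily the best approach.

```stata
* One row per charity with its opening date
preserve
keep charityid charity_open_date
duplicates drop
tempfile charities
save `charities'
restore

* One row per person with the first and last observed date,
* crossed with every charity, keeping only charities that open in the window
preserve
collapse (min) start_date = date (max) end_date = date, by(personid)
cross using `charities'
keep if start_date <= charity_open_date & end_date >= charity_open_date
keep personid charityid
tempfile pairs
save `pairs'
restore

* Pairs that exist only in the eligible list become new rows with gave == 0;
* every other variable is left missing on those rows
merge m:1 personid charityid using `pairs'
replace gave = 0 if _merge == 2
drop _merge
sort personid charityid date
```

With the example data, the only eligible pair without a donation is personid == 2 and charityid == 3, so this would add the single observation shown above.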