Hello!
I'm working on a panel dataset that identifies individuals by the id of their family (variable "nquest") and their order number within the family (variable "nord"). For example, the father has nord equal to 1, the mother 2, and a child 3.
However, when an individual leaves the family and another one enters it, the two can be confused: the newcomer gets the same value of nord as the individual who left the dataset, so two different individuals end up under the same nquest and nord. The dataset has an additional variable, "nordp", that helps to spot this issue: it gives the individual's order number in the previous round of the survey (even if the individual did not complete the survey in that round). I attach the observations for one family to make the problem clearer.

In this part of the dataset I included some variables that I have not mentioned yet: "anno", the year in which the observation was taken; "eta", the age of the individual; and "id", the variable I created using group(nquest nord) with the goal of having a unique identifier for each individual.
As you can see, after 2006 the individual with nord==3 leaves the dataset, and in 2008 (the following wave of the panel) the individual who was 4th in the family enters with an order number of 3, because the other individual left. So, looking only at nquest and nord, we "merge" two individuals into one (indeed, they have the same id).
My goal is to create a unique identifier (variable id) that actually works, i.e. one that avoids this kind of confusion within the dataset. Does anybody have an idea of how to solve this issue?
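To make the linkage I have in mind concrete, here is a minimal sketch in plain Python (not in the software I am using), so please read it as pseudocode for the logic rather than as a solution. It assumes that nordp points to the nord the same person had in the immediately preceding wave of the same family, and that a missing nordp (None here) marks a new entrant; both are my assumptions about the coding. The function name `assign_person_ids` and the rows below are made up for illustration and are not the attached observations.

```python
from collections import defaultdict

def assign_person_ids(rows):
    """Chain individuals across waves: a record inherits the id of the
    previous-wave record in the same family whose nord equals its nordp."""
    # Group record positions by survey year (anno).
    by_wave = defaultdict(list)
    for i, r in enumerate(rows):
        by_wave[r["anno"]].append(i)

    ids = [None] * len(rows)
    prev = {}      # (nquest, nord) in the previous wave -> person id
    next_id = 1
    for anno in sorted(by_wave):
        current = {}
        for i in by_wave[anno]:
            r = rows[i]
            key_prev = (r["nquest"], r["nordp"])
            if r["nordp"] is not None and key_prev in prev:
                pid = prev[key_prev]   # same person as in the previous wave
            else:
                pid = next_id          # no link found: treat as a new person
                next_id += 1
            ids[i] = pid
            current[(r["nquest"], r["nord"])] = pid
        prev = current
    return ids

# Hypothetical family: the member with nord 3 leaves after 2006,
# and the member who was 4th appears in 2008 with nord 3 and nordp 4.
rows = [
    {"nquest": 1, "nord": 1, "nordp": 1, "anno": 2006},
    {"nquest": 1, "nord": 2, "nordp": 2, "anno": 2006},
    {"nquest": 1, "nord": 3, "nordp": 3, "anno": 2006},
    {"nquest": 1, "nord": 4, "nordp": 4, "anno": 2006},
    {"nquest": 1, "nord": 1, "nordp": 1, "anno": 2008},
    {"nquest": 1, "nord": 2, "nordp": 2, "anno": 2008},
    {"nquest": 1, "nord": 3, "nordp": 4, "anno": 2008},
]
ids = assign_person_ids(rows)
```

With these made-up rows, the 2008 record with nord 3 gets the id first given to the 2006 record with nord 4, instead of being merged with the person who left. The sketch does not handle families that skip a wave or individuals whose previous-wave record is absent; in those cases it simply starts a new id.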
If I have not been clear enough, I'm more than happy to clarify.
Thanks in advance!