Software:
OSX, Stata 13.1
Problem:
To run regressions on my data, I first need to generate a unique cohort identifier. My data are a panel covering the years 2007 to 2013, grades 3 through 12, and many districts. I want to build the identifier from three variables: year, grade, and district. I then hope to run regressions using year and cohort; the variables are described below.
Data Information and problem elaborated:
In my data, cohorts enter and exit the data set. A cohort exits when it finishes grade 12: the students in a district's grade 12 in one year are no longer in that district's data the following year (year 1, grade 12, district 1 is not the same group of people as year 2, grade 12, district 1). A cohort enters at the lowest grade, grade 3: the students in a district's grade 3 in one year move to grade 4 the next year, so grade 3 in year+1 is a new group of people and needs a new cohort identifier.
An example (how I want it to look):
Year: 2007, District: 1, Grade: 3, Cohort: 1
Year: 2007, District: 1, Grade: 4, Cohort: 2
Year: 2007, District: 1, Grade: 5, Cohort: 3
Year: 2007, District: 1, Grade: 6, Cohort: 4
Year: 2007, District: 1, Grade: 7, Cohort: 5
Year: 2007, District: 1, Grade: 8, Cohort: 6
Year: 2007, District: 1, Grade: 9, Cohort: 7
Year: 2007, District: 1, Grade: 10, Cohort: 8
Year: 2007, District: 1, Grade: 11, Cohort: 9
Year: 2007, District: 1, Grade: 12, Cohort: 10
Year: 2008, District: 1, Grade: 3, Cohort: 11
Year: 2008, District: 1, Grade: 4, Cohort: 1
Year: 2008, District: 1, Grade: 5, Cohort: 2
Year: 2008, District: 1, Grade: 6, Cohort: 3
Year: 2008, District: 1, Grade: 7, Cohort: 4
Year: 2008, District: 1, Grade: 8, Cohort: 5
Year: 2008, District: 1, Grade: 9, Cohort: 6
Year: 2008, District: 1, Grade: 10, Cohort: 7
Year: 2008, District: 1, Grade: 11, Cohort: 8
Year: 2008, District: 1, Grade: 12, Cohort: 9
Variables:
Year: the year in which an observation occurs; ranges from 2007 to 2013 (7 unique values, consecutive)
District: a numeric code identifying a specific district (the codes are not sequential, i.e. they do not run 1 2 3 4...; the number of districts differs from year to year)
Grade: the grade level of the students (grades 3 through 12, consecutive)
Cohort (the variable I want to create): a variable that uniquely identifies the same group of students over time, throughout the data set's time span
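To make the idea concrete, here is a minimal sketch of one approach I have been considering; I am not sure it is the right way to do it. It assumes every group advances exactly one grade per year (no retention, skipping, or movement between districts), so a cohort is pinned down by its district and the year it would have been in grade 3. The variable name entry_year and the toy data are made up for illustration; year, district, and grade are the variables described above. The cohort numbers it produces will not match the numbering in my example, but the same group should get the same number across years, which is all I need.

```stata
* Toy data: two years, one district, three grades (illustrative values only)
clear
input year district grade
2007 1 3
2007 1 4
2007 1 5
2008 1 3
2008 1 4
2008 1 5
end

* Hypothetical helper: the year this group would have been in grade 3,
* ASSUMING everyone advances exactly one grade per year
generate entry_year = year - (grade - 3)

* One integer id per district-by-entry-year combination
egen cohort = group(district entry_year)

* Inspect the result: grade 3 in 2007 and grade 4 in 2008 should share an id
sort district year grade
list year district grade cohort, sepby(year)

* Declare the panel structure (cohort is the unit, year is the time variable)
xtset cohort year
```

Because the id is built from district and entry_year together, non-sequential district codes and districts that appear in only some years should not matter, as far as I can tell.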
Any help is greatly appreciated and I am open to answering any questions I can.
Thank you!