Hello. I am working with some public school attendance data. The way it is organized after downloading is such that every combination of gender/race/year is a new column, with just one mere row. With 12 grades, 5 years, and a bunch of race/gender breakdowns, there ends up being over 1,000 columns. It is a mess!
I think my best move here would be to have each row be a year-grade (so 2017 Kindergarten, 2017 1st, 2017 2nd, 2018 Kindergarten, 2018 1st, and so on), and then each column would be the race/gender breakdown (Black male, White female, Hispanic female, etc.).
However, I'm having some trouble with reshaping, I think possibly because of the way the variables are named?
In the data below, k_ is for kindergarten, g1_ grade 1, g2_ grade 2, and so on. And if anyone has other suggestions for how to better deal with this headache, I am all ears! Thank you!
I think my best move here would be to have each row be a year-grade (so 2017 Kindergarten, 2017 1st, 2017 2nd, 2018 Kindergarten, 2018 1st, and so on), and then each column would be the race/gender breakdown (Black male, White female, Hispanic female, etc.).
However, I'm having some trouble with reshaping, I think possibly because of the way the variables are named?
In the data below, k_ is for kindergarten, g1_ grade 1, g2_ grade 2, and so on. And if anyone has other suggestions for how to better deal with this headache, I am all ears! Thank you!
Code:
* Example generated by -dataex-. For more info, type help dataex clear input int(k_black_female_2018_19 k_hispanic_female_2017_18 k_black_female_2016_17 k_white_female_2013_14 g1_white_male_2016_17 g1_white_male_2015_16 g1_white_male_2014_15 g1_white_male_2013_14 g1_white_female_2013_14 g2_white_female_2013_14 g2_black_female_2014_15) 1825 1916 1952 1832 1695 1792 1812 1902 1055 1043 1736 end
Comment