Hello Stata users,
I have run into a bit of a conundrum that I was hoping to get help with. I have a dataset with a number of variables that required transformations for normality. The issue is that I want to organize the output of these variables by decile, by which I mean 10-percentage-point bands (0-10%, 10-20%, and so on).
Specifically, my dataset has to do with passing a test. The data are organized by district, so all of the schools in a given district share the same values for my IVs (independent variables). For example, all of the schools in Stata district 1 have 33% African American students. I'm trying to capture the distribution of all of the schools, for each IV, within each band. So Stata district 1 (33%) and Stata district 2 (38% African American students) would both be placed into the grouping category for districts with 30-40% African American students.
I tried:
    egen AfAmer = cut(AfAmer), at(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100) label
but because the variable had been transformed, the output was incorrect (everything fell into 0-10%). I have also tried:
    egen AfAmer = cut(AfAmer), group(10)
and
    xtile q_AfAmer = AfAmer, nquantiles(10)
However, the tabulation from those two commands came out roughly equalized, which is not consistent with the tabulation of the raw data. In my raw dataset, the majority of schools have either few African American students (<20%) or a large concentration (>60%). In the tabulation of the transformed and cut data, most deciles contain a roughly equal number of school districts. How can I produce a distribution, by decile, that is more consistent with the distribution found in the raw data? I want all of the school districts within the 0-10% range of transformed values for African American students to be placed into the category for 0-10%.
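To make the contrast concrete, here is a small simulated sketch of what I am seeing versus what I am after. It does not use my real data: the variable names (AfAmer_raw, AfAmer_t, AfAmer_q, AfAmer_band), the bimodal simulated percentages, and the log transformation are all assumptions made up for illustration, and my actual transformation may differ.

```stata
* Simulated stand-in data; all names and the log transform are assumptions.
clear
set seed 2718
set obs 200

* Raw percentage: about half the districts below 20%, the rest above 60%
generate double AfAmer_raw = cond(runiform() < 0.5, 20*runiform(), 60 + 40*runiform())

* Hypothetical transformation for normality (the real one may differ)
generate double AfAmer_t = ln(AfAmer_raw + 1)

* Quantile grouping: each group gets about the same number of districts,
* whatever the shape of the data
xtile AfAmer_q = AfAmer_t, nquantiles(10)

* Fixed 10-point bands on the raw scale: 0 = 0-10%, 1 = 10-20%, ..., 9 = 90-100%
generate byte AfAmer_band = min(floor(AfAmer_raw/10), 9) if !missing(AfAmer_raw)

tabulate AfAmer_q
tabulate AfAmer_band
```

The first table is the "equalized" pattern I described; the second is the kind of grouping I want, with the districts piling up in the low and high bands. In this sketch I only get there by going back to the raw percentages, which is the part I am unsure how to handle with the transformed variables.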
Please excuse me if this is unclear; I will be happy to clarify. Thanks for your help! I am using Stata 15.1.
-Antonio