Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape many columns with only one row

    Hello. I am working with some public school attendance data. The way it is organized after downloading is such that every combination of gender/race/year is a new column, with just one mere row. With 12 grades, 5 years, and a bunch of race/gender breakdowns, there ends up being over 1,000 columns. It is a mess!

    I think my best move here would be to have each row be a year-grade (so 2017 Kindergarten, 2017 1st, 2017 2nd, 2018 Kindergarten, 2018 1st, and so on), and then each column would be the race/gender breakdown (Black male, White female, Hispanic female, etc.).

    However, I'm having some trouble with reshaping, I think possibly because of the way the variables are named?

    In the data below, k_ is for kindergarten, g1_ grade 1, g2_ grade 2, and so on. And if anyone has other suggestions for how to better deal with this headache, I am all ears! Thank you!



    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(k_black_female_2018_19 k_hispanic_female_2017_18 k_black_female_2016_17 k_white_female_2013_14 g1_white_male_2016_17 g1_white_male_2015_16 g1_white_male_2014_15 g1_white_male_2013_14 g1_white_female_2013_14 g2_white_female_2013_14 g2_black_female_2014_15)
    1825 1916 1952 1832 1695 1792 1812 1902 1055 1043 1736
    end

  • #2
    EDIT: Scratched this approach. See post #3!
    Last edited by Ken Chui; 26 Apr 2021, 11:13.

    Comment


    • #3
      This may help:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int(k_black_female_2018_19 k_hispanic_female_2017_18 k_black_female_2016_17 k_white_female_2013_14 g1_white_male_2016_17 g1_white_male_2015_16 g1_white_male_2014_15 g1_white_male_2013_14 g1_white_female_2013_14 g2_white_female_2013_14 g2_black_female_2014_15)
      1825 1916 1952 1832 1695 1792 1812 1902 1055 1043 1736
      end
      
      gen id = _n 
      
      reshape long k_ g1_ g2_, i(id) j(which) string 
      
      split which, parse(_) destring
      drop which4
      drop which
      rename (which?) (race gender year)
      
      list

      Comment


      • #4
        Code:
        split which, parse(_) destring
        ^ Another cool trick! Thanks, Nick Cox! I was still struggling with the parsing with regular expression...

        Comment


        • #5
          Yes, that is a great trick! Thanks, Nick Cox !

          Comment

          Working...
          X