Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape STATA

    I have a stata dataset looking something like this:

    codistat year age gend nat pop_tot

    where age goes from 0 to 100, gender is m and f.I would like to change the datset and have by codistat and year, pop_tot_0_m pop_tot_0_f pop_tot_1_m pop_tot_0_f etc..

    nat is not really relevant as of now.

    Any idea how to do it (I already tried reshape wide but it tells me the data is already wide)

  • #2
    Code:
    collapse (sum) pop_tot, by(codistat year age gend)
    will get you the information you want as a long data set, which you can easily use for almost every type of data management and analysis in Stata. You are asking to transform this to wide. That will create 200 different variables. There is almost nothing you can do with a data set like that in Stata, and even less that requires a wide layout for this kind of data. So think carefully about whether you really want to do this. But if you are sure that there is some use you can actually make use of the data in wide layout, here is how you get to that from here:
    Code:
    egen suffix = concat(age gend), punct("_")
    drop age gend
    rename pop_tot pop_tot_
    reshape wide pop_tot_, i(codistat) j(suffix) string
    Note: As no example data was provided, this code is untested and may contain errors. In the future, when asking for help with code, provide example data, and use the -dataex- command to do that. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment


    • #3
      Many Thanks, this the result of the extraction to give you an idea:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long codistat int year float age str1 gend str3 nat long pop_tot
      1001 1982 0 "m" "tot"  6
      1001 1983 0 "m" "tot"  8
      1001 1984 0 "m" "tot"  7
      1001 1985 0 "m" "tot"  5
      1001 1986 0 "m" "tot" 15
      1001 1987 0 "m" "tot"  7
      1001 1988 0 "m" "tot"  2
      1001 1989 0 "m" "tot"  9
      1001 1990 0 "m" "tot" 14
      1001 1991 0 "m" "tot"  7
      1001 1982 1 "m" "tot" 10
      1001 1983 1 "m" "tot"  8
      1001 1984 1 "m" "tot"  8
      1001 1985 1 "m" "tot"  8
      1001 1986 1 "m" "tot"  7
      1001 1987 1 "m" "tot" 13
      1001 1988 1 "m" "tot"  7
      1001 1989 1 "m" "tot"  2
      1001 1990 1 "m" "tot" 12
      1001 1991 1 "m" "tot" 13
      1001 1982 2 "m" "tot"  9
      1001 1983 2 "m" "tot" 11
      1001 1984 2 "m" "tot"  8
      1001 1985 2 "m" "tot"  7
      1001 1986 2 "m" "tot" 10
      1001 1987 2 "m" "tot"  8
      1001 1988 2 "m" "tot" 12
      1001 1989 2 "m" "tot"  7
      1001 1990 2 "m" "tot"  2
      1001 1991 2 "m" "tot" 12
      1001 1982 3 "m" "tot" 11
      1001 1983 3 "m" "tot" 10
      1001 1984 3 "m" "tot" 11
      1001 1985 3 "m" "tot"  8
      1001 1986 3 "m" "tot"  9
      1001 1987 3 "m" "tot" 11
      1001 1988 3 "m" "tot"  9
      1001 1989 3 "m" "tot" 12
      1001 1990 3 "m" "tot"  8
      1001 1991 3 "m" "tot"  1
      1001 1982 4 "m" "tot"  9
      1001 1983 4 "m" "tot"  9
      1001 1984 4 "m" "tot" 11
      1001 1985 4 "m" "tot" 12
      1001 1986 4 "m" "tot"  8
      1001 1987 4 "m" "tot"  8
      1001 1988 4 "m" "tot" 12
      1001 1989 4 "m" "tot"  9
      1001 1990 4 "m" "tot" 11
      1001 1991 4 "m" "tot"  9
      1001 1982 5 "m" "tot" 11
      1001 1983 5 "m" "tot" 11
      1001 1984 5 "m" "tot" 10
      1001 1985 5 "m" "tot" 16
      1001 1986 5 "m" "tot" 11
      1001 1987 5 "m" "tot"  9
      1001 1988 5 "m" "tot"  8
      1001 1989 5 "m" "tot" 11
      1001 1990 5 "m" "tot"  9
      1001 1991 5 "m" "tot" 11
      1001 1982 6 "m" "tot"  6
      1001 1983 6 "m" "tot" 12
      1001 1984 6 "m" "tot"  9
      1001 1985 6 "m" "tot" 10
      1001 1986 6 "m" "tot" 18
      1001 1987 6 "m" "tot" 11
      1001 1988 6 "m" "tot"  9
      1001 1989 6 "m" "tot"  7
      1001 1990 6 "m" "tot"  9
      1001 1991 6 "m" "tot" 10
      1001 1982 7 "m" "tot"  6
      1001 1983 7 "m" "tot"  5
      1001 1984 7 "m" "tot" 11
      1001 1985 7 "m" "tot" 10
      1001 1986 7 "m" "tot" 10
      1001 1987 7 "m" "tot" 16
      1001 1988 7 "m" "tot" 11
      1001 1989 7 "m" "tot"  8
      1001 1990 7 "m" "tot"  8
      1001 1991 7 "m" "tot" 11
      1001 1982 8 "m" "tot" 18
      1001 1983 8 "m" "tot"  7
      1001 1984 8 "m" "tot"  5
      1001 1985 8 "m" "tot" 11
      1001 1986 8 "m" "tot" 10
      1001 1987 8 "m" "tot" 10
      1001 1988 8 "m" "tot" 17
      1001 1989 8 "m" "tot" 12
      1001 1990 8 "m" "tot"  9
      1001 1991 8 "m" "tot"  7
      1001 1982 9 "m" "tot" 18
      1001 1983 9 "m" "tot" 16
      1001 1984 9 "m" "tot"  7
      1001 1985 9 "m" "tot"  3
      1001 1986 9 "m" "tot" 11
      1001 1987 9 "m" "tot" 10
      1001 1988 9 "m" "tot" 10
      1001 1989 9 "m" "tot" 16
      1001 1990 9 "m" "tot" 12
      1001 1991 9 "m" "tot"  8
      end

      Comment


      • #4
        I tried to implement the following command:

        collapse (sum) pop_tot, by(codistat year age gend)
        egen suffix = concat(age gend), punct("_")
        gen id=_n
        drop age gend
        rename pop_tot pop_tot_
        reshape wide pop_tot_, i(id codistat year) j(suffix) string
        drop id


        but the result was then something like this:

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input long codistat int year double(pop_tot_0_m pop_tot_1_m pop_tot_2_m pop_tot_3_m pop_tot_4_m pop_tot_5_m pop_tot_6_m pop_tot_7_m pop_tot_8_m pop_tot_9_m)
        1001 1982  6  .  .  .  .  .  .  .  .  .
        1001 1982  . 10  .  .  .  .  .  .  .  .
        1001 1982  .  .  9  .  .  .  .  .  .  .
        1001 1982  .  .  . 11  .  .  .  .  .  .
        1001 1982  .  .  .  .  9  .  .  .  .  .
        1001 1982  .  .  .  .  . 11  .  .  .  .
        1001 1982  .  .  .  .  .  .  6  .  .  .
        1001 1982  .  .  .  .  .  .  .  6  .  .
        1001 1982  .  .  .  .  .  .  .  . 18  .
        1001 1982  .  .  .  .  .  .  .  .  . 18
        1001 1983  8  .  .  .  .  .  .  .  .  .
        1001 1983  .  8  .  .  .  .  .  .  .  .
        1001 1983  .  . 11  .  .  .  .  .  .  .
        1001 1983  .  .  . 10  .  .  .  .  .  .
        1001 1983  .  .  .  .  9  .  .  .  .  .
        1001 1983  .  .  .  .  . 11  .  .  .  .
        1001 1983  .  .  .  .  .  . 12  .  .  .
        1001 1983  .  .  .  .  .  .  .  5  .  .
        1001 1983  .  .  .  .  .  .  .  .  7  .
        1001 1983  .  .  .  .  .  .  .  .  . 16
        1001 1984  7  .  .  .  .  .  .  .  .  .
        1001 1984  .  8  .  .  .  .  .  .  .  .
        1001 1984  .  .  8  .  .  .  .  .  .  .
        1001 1984  .  .  . 11  .  .  .  .  .  .
        1001 1984  .  .  .  . 11  .  .  .  .  .
        1001 1984  .  .  .  .  . 10  .  .  .  .
        1001 1984  .  .  .  .  .  .  9  .  .  .
        1001 1984  .  .  .  .  .  .  . 11  .  .
        1001 1984  .  .  .  .  .  .  .  .  5  .
        1001 1984  .  .  .  .  .  .  .  .  .  7
        1001 1985  5  .  .  .  .  .  .  .  .  .
        1001 1985  .  8  .  .  .  .  .  .  .  .
        1001 1985  .  .  7  .  .  .  .  .  .  .
        1001 1985  .  .  .  8  .  .  .  .  .  .
        1001 1985  .  .  .  . 12  .  .  .  .  .
        1001 1985  .  .  .  .  . 16  .  .  .  .
        1001 1985  .  .  .  .  .  . 10  .  .  .
        1001 1985  .  .  .  .  .  .  . 10  .  .
        1001 1985  .  .  .  .  .  .  .  . 11  .
        1001 1985  .  .  .  .  .  .  .  .  .  3
        1001 1986 15  .  .  .  .  .  .  .  .  .
        1001 1986  .  7  .  .  .  .  .  .  .  .
        1001 1986  .  . 10  .  .  .  .  .  .  .
        1001 1986  .  .  .  9  .  .  .  .  .  .
        1001 1986  .  .  .  .  8  .  .  .  .  .
        1001 1986  .  .  .  .  . 11  .  .  .  .
        1001 1986  .  .  .  .  .  . 18  .  .  .
        1001 1986  .  .  .  .  .  .  . 10  .  .
        1001 1986  .  .  .  .  .  .  .  . 10  .
        1001 1986  .  .  .  .  .  .  .  .  . 11
        1001 1987  7  .  .  .  .  .  .  .  .  .
        1001 1987  . 13  .  .  .  .  .  .  .  .
        1001 1987  .  .  8  .  .  .  .  .  .  .
        1001 1987  .  .  . 11  .  .  .  .  .  .
        1001 1987  .  .  .  .  8  .  .  .  .  .
        1001 1987  .  .  .  .  .  9  .  .  .  .
        1001 1987  .  .  .  .  .  . 11  .  .  .
        1001 1987  .  .  .  .  .  .  . 16  .  .
        1001 1987  .  .  .  .  .  .  .  . 10  .
        1001 1987  .  .  .  .  .  .  .  .  . 10
        1001 1988  2  .  .  .  .  .  .  .  .  .
        1001 1988  .  7  .  .  .  .  .  .  .  .
        1001 1988  .  . 12  .  .  .  .  .  .  .
        1001 1988  .  .  .  9  .  .  .  .  .  .
        1001 1988  .  .  .  . 12  .  .  .  .  .
        1001 1988  .  .  .  .  .  8  .  .  .  .
        1001 1988  .  .  .  .  .  .  9  .  .  .
        1001 1988  .  .  .  .  .  .  . 11  .  .
        1001 1988  .  .  .  .  .  .  .  . 17  .
        1001 1988  .  .  .  .  .  .  .  .  . 10
        1001 1989  9  .  .  .  .  .  .  .  .  .
        1001 1989  .  2  .  .  .  .  .  .  .  .
        1001 1989  .  .  7  .  .  .  .  .  .  .
        1001 1989  .  .  . 12  .  .  .  .  .  .
        1001 1989  .  .  .  .  9  .  .  .  .  .
        1001 1989  .  .  .  .  . 11  .  .  .  .
        1001 1989  .  .  .  .  .  .  7  .  .  .
        1001 1989  .  .  .  .  .  .  .  8  .  .
        1001 1989  .  .  .  .  .  .  .  . 12  .
        1001 1989  .  .  .  .  .  .  .  .  . 16
        1001 1990 14  .  .  .  .  .  .  .  .  .
        1001 1990  . 12  .  .  .  .  .  .  .  .
        1001 1990  .  .  2  .  .  .  .  .  .  .
        1001 1990  .  .  .  8  .  .  .  .  .  .
        1001 1990  .  .  .  . 11  .  .  .  .  .
        1001 1990  .  .  .  .  .  9  .  .  .  .
        1001 1990  .  .  .  .  .  .  9  .  .  .
        1001 1990  .  .  .  .  .  .  .  8  .  .
        1001 1990  .  .  .  .  .  .  .  .  9  .
        1001 1990  .  .  .  .  .  .  .  .  . 12
        1001 1991  7  .  .  .  .  .  .  .  .  .
        1001 1991  . 13  .  .  .  .  .  .  .  .
        1001 1991  .  . 12  .  .  .  .  .  .  .
        1001 1991  .  .  .  1  .  .  .  .  .  .
        1001 1991  .  .  .  .  9  .  .  .  .  .
        1001 1991  .  .  .  .  . 11  .  .  .  .
        1001 1991  .  .  .  .  .  . 10  .  .  .
        1001 1991  .  .  .  .  .  .  . 11  .  .
        1001 1991  .  .  .  .  .  .  .  .  7  .
        1001 1991  .  .  .  .  .  .  .  .  .  8
        end

        Comment


        • #5
          Remove the -gen id = _n- command, the mention of id in the -i()- option of the -reshape- command, and the -drop id- command. The id variable is what's messing things up, and it serves no purpose in any case.

          Comment


          • #6
            But otherwise stata tells me that:

            . reshape wide pop_tot_, i(codistat) j(suffix) string
            (j = 0_m 1_m 2_m 3_m 4_m 5_m 6_m 7_m 8_m 9_m)
            values of variable suffix not unique within codistat
            Your data are currently long. You are performing a reshape wide. You specified i(codistat) and j(suffix). There
            are observations within i(codistat) with the same value of j(suffix). In the long data, variables i() and j()
            together must uniquely identify the observations.

            Comment


            • #7
              Sorry, the -reshape- command should be
              Code:
              reshape wide pop_tot_, i(codistat year) j(suffix) string

              Comment


              • #8
                Many Thanks!

                Comment

                Working...
                X