Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Wide to long with monthly data

    Hi guys,

    i have a data set where one row represents one person in the survey, i.e. wide format. Now i want to change it to long format. My data set at the moment looks like this
    id mljan2005 mlfeb2005 emjan2005 emfeb2005 educ
    1234 1 1 0 0 10
    478 0 1 1 0 10

    and i would like to have it like this
    id year month ml em educ
    1234 2005 jan 1 0 10
    1234 2005 feb 1 0 10
    478 2005 jan 0 1 10
    478 2005 feb 1 0 10

    ml and em refer to different labour market states while educ is time-constant education. I can cope with the reshape command when it comes to a situation where only a year variable is indicated like inc2005 inc2006 inc2007 and so on but i fail when it comes to this month-year combination.

    Anyone got an idea?

    Thanks
    Tim

  • #2
    You have two stubs that you want to reshape long: "ml" and "em". After the reshape for those stubs, separate the month-year variable into "month" and "year" components. I found this post by Nick Cox helpful:
    http://www.stata.com/support/faqs/data-management/problems-with-reshape/
    Last edited by Josh Clothiaux; 03 May 2017, 08:26.

    Comment


    • #3
      Code:
      clear 
      input id    mljan2005    mlfeb2005    emjan2005    emfeb2005    educ
      1234    1    1    0    0    10
      478    0    1    1    0    10
      end 
      
      reshape long ml em, i(id) j(smdate) string 
      gen mdate = monthly(smdate, "MY") 
      format mdate %tm 
      
      list 
      
           +------------------------------------------+
           |   id    smdate   ml   em   educ    mdate |
           |------------------------------------------|
        1. |  478   feb2005    1    0     10   2005m2 |
        2. |  478   jan2005    0    1     10   2005m1 |
        3. | 1234   feb2005    1    0     10   2005m2 |
        4. | 1234   jan2005    1    0     10   2005m1 |
           +------------------------------------------+
      I didn't give you quite what you asked. You're better off for most purposes with a single numeric monthly date variable. If need be, you can separate out later.

      Comment


      • #4
        To get what you want you can do:

        Code:
        reshape long ml em, i(id) j(string_date)
        gen monthly_date = monthly(string_date, "MY")
        format monthly_date %tm
        gen year = yofd(dofm(monthly_date))
        gen month = mofd(dofm(monthly_date))
        This code will also create a single variable, monthly_date, in Stata internal format, that expresses both the month and year. Moving forward with your analyses, you will find that this single monthly_date variable will come in very handy, and the separate month and year variables you are requesting will probably be of minimal value.

        Note: Because you did not use -dataex- to post your data example, I did not attempt to import your data to Stata. Consequently this code is untested and may contain typos or other errors. Please install (-ssc install dataex-) and learn to use (-help dataex-) the -dataex- command and use it whenever you post a data example going forward. More generally, please read the FAQ, and especially FAQ #12 for excellent advice on the best way to show data, code, and Stata output on the forum.

        Added: Crossed with Nick's post, which proposes the same solution, and reinforces my advice that the separate month and year variables are not particularly helpful by not creating them!

        Comment


        • #5
          Hi guys,

          thanks for the answer. Nick's solution worked well for the transformation.
          Last edited by Tim Hartmann; 04 May 2017, 07:44.

          Comment


          • #6
            Hi guys,

            I am facing a similar problem. Wherein I have monthly labor data for US counties, from Jan1990 - Dec2019 in the form

            Fips Jan1990 Feb 1990 ............. Dec 2019

            1001 xxx xxx ........................xxx
            1003 xxx xxx ........................xxx
            1005 xxx xxx .......................xxx
            . . . .
            . . . .
            . . . .
            56045 xxx xxx .................... xxx

            I use the following transformation:

            rename Jan* m1Jan*

            rename Feb* m2Feb*

            rename Mar* m3Mar*

            rename Apr* m4Apr*

            rename May* m5May*

            rename Jun* m6Jun*

            rename Jul* m7Jul*

            rename Aug* m8Aug*

            rename Sep* m9Sep*

            rename Oct* m10Oct*

            rename Nov* m11Nov*

            rename Dec* m12Dec*

            ************************************************** ***********************
            gen id =_n, before(fips)

            reshape long m1 m2 m3 m4 m5 m6 m7 m8 m9 m10 m11 m12, i(id) j(Month) string

            But my command runs into a loop saying "No data found". Please do advise what am I doing wrong.
            Last edited by Anupam Ghosh; 16 Sep 2023, 23:14.

            Comment


            • #7
              A correction on the error message. The error that I get is "variable <varname> not found"

              Comment


              • #8
                I haven't tried to follow your code, as it seems that something a little simpler should do what you want. The example here is a little more concrete than yours in the absence of a use of dataex.

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input int(Fips Jan1990) long(Feb1990 Dec2019)
                1001 123    456    789
                1003 321    654    987
                1005  42 314159 271828 
                end
                
                rename (Jan1990-Dec2019) (whatever=)
                reshape long whatever, i(Fips) j(MDate) string 
                gen mdate = monthly(MDate, "MY")
                format mdate %tm 
                sort Fips mdate 
                list, sepby(Fips)
                
                     +-------------------------------------+
                     | Fips     MDate   whatever     mdate |
                     |-------------------------------------|
                  1. | 1001   Jan1990        123    1990m1 |
                  2. | 1001   Feb1990        456    1990m2 |
                  3. | 1001   Dec2019        789   2019m12 |
                     |-------------------------------------|
                  4. | 1003   Jan1990        321    1990m1 |
                  5. | 1003   Feb1990        654    1990m2 |
                  6. | 1003   Dec2019        987   2019m12 |
                     |-------------------------------------|
                  7. | 1005   Jan1990         42    1990m1 |
                  8. | 1005   Feb1990     314159    1990m2 |
                  9. | 1005   Dec2019     271828   2019m12 |
                     +-------------------------------------+

                Comment


                • #9
                  Thanks, Nick! This worked great.

                  Comment

                  Working...
                  X