  • Reshape with many variables

    Dear All,

    I have a very large dataset (approx. 29,000 variables) which looks like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long pid byte(istrtdatd91 lkmovy91) int doby91 byte(opfamn92 hsowr1493 hhch1294 hsowr2_bh95 opfamn96 tujbpl97)
    10002251  9 -8 1899  .  .  .  .  .  .
    10004491 10 -8 1963  1  .  .  .  .  .
    10004521 10 -8 1965  3 -8  .  .  .  .
    10007857 14  4 1933  3 -8  .  .  .  .
    10014578 16 -8 1937  4  0  .  1  5  .
    10014608 23 -8 1934  4  0  .  1  5  .
    10016813 23 19 1955  5 -7  2  .  . -8
    10016848 23 -7 1959  3 -7  2  .  . -8
    10016872  .  .    .  .  .  .  .  .  .
    10017933 13  1 1942  3  0  2  2  3 -8
    10017968 14  8 1945  5  0  2  2  .  .
    end
    Each variable name ends in a two-digit year suffix, ranging from 91 (1991) to 16 (2016), and pid is the identifier.

    I would like to reshape the dataset to long format. With so many variables, typing all the names by hand is not feasible, so, following other Statalist threads, I tried these commands:

    Code:
    ds pid, not
    local stubs `r(varlist)'
    forvalues i = 91/16 {
        local stubs: subinstr local stubs "`i'" "", all
    }
    local stubs: list uniq stubs
    reshape long `stubs', i(pid) j(year)
    but I get the following error message:

    characteristic contents too long
    The maximum value of the contents is 67,784.
    r(1004);

    My plan was to clean the dataset after the reshape, as that would be much easier. Do you have any suggestions?

    Thank you very much in advance for your help.

    Kind regards

    Giovanni Angioni
    Last edited by Giovanni Angioni; 18 Oct 2019, 11:08.

  • #2

    I can't see forvalues liking 91/16, because it's not an increasing sequence, so the loop body never runs. In any case, my guess is that you need to loop over 00 01 ... 09 10 among other values. Here is my guess.

    Code:
    local stubs  
    
    foreach y of num 91/99 0/16 {
        local Y : di %02.0f `y'
        unab this : *`Y'
        local this : subinstr local this "`Y'" "", all
        local stubs : list stubs | this
    }
      
    reshape long `stubs', i(pid) j(year)
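
    The two points made above can be checked in isolation. This is only a small illustrative sketch, not part of the original suggestion: the first loop shows that a forvalues range whose start exceeds its end is empty under the default step of +1, and the second part shows how the %02.0f display format pads a single-digit year to two digits.

    Code:
    * With the default step of +1, a range whose start exceeds its end
    * is empty, so the body of this loop is never executed
    forvalues i = 91/16 {
        display "i = `i'"
    }

    * Two-digit suffixes need a leading zero: 5 must become 05
    local Y : display %02.0f 5
    display "`Y'"

    Because the loop in the first post never runs, the local stubs still holds every full variable name when reshape is called, which is consistent with the "characteristic contents too long" error.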



    • #3
      Dear Nick,

      Thank you very much for your help! Much appreciated.

      Using the commands you suggested, I realized that several variable names have digits immediately before the year suffix, so the reshape produced several values of j (year) outside the 1991-2016 range.
      I therefore renamed all the variables, adding an underscore, and changed the range to 1991-2016 instead of 91-16 to solve the issue you highlighted.

      Finally, after modifying the first line of the foreach loop to use the new 1991-2016 range, I was able to complete the reshape.

      Many thanks again!

      Best wishes,
      Giovanni
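
      The exact commands used for the renaming are not given in the thread. A minimal sketch of one way it might be done follows, under these assumptions: every variable ending in a two-digit value from 91 to 99 or 00 to 16 really is a year-suffixed variable, the new suffix has the form _YYYY, and the new names stay within Stata's 32-character limit. The local names yy, yyyy, vars and stub are hypothetical.

      Code:
      * Assumption: the last two digits of each matched name are the year
      local stubs
      foreach y of numlist 91/99 0/16 {
          * Two-digit suffix with leading zero, and the matching four-digit year
          local yy : display %02.0f `y'
          local yyyy = cond(`y' >= 91, 1900 + `y', 2000 + `y')
          * Skip years for which no variable exists
          capture unab vars : *`yy'
          if _rc continue
          foreach v of local vars {
              * Strip only the final two characters, so digits earlier
              * in the name (e.g. the 14 in hsowr1493) are left alone
              local stub = substr("`v'", 1, strlen("`v'") - 2)
              rename `v' `stub'_`yyyy'
              local stubs `stubs' `stub'_
          }
      }
      local stubs : list uniq stubs
      reshape long `stubs', i(pid) j(year)

      With the example data, hsowr1493 would become hsowr14_1993 and its stub hsowr14_, so year takes the value 1993 rather than a value built from the stray digits. The stubs keep their trailing underscore so that the remaining suffix is purely numeric, and the long-format variables can be renamed afterwards if the underscore is unwanted.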
