Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • reshape with 4000 variables?

    Dear All,
    suppose you have data of the following format:

    Code:
    name   return_36525  return_36526    ...     return_42369
    banka        12             15       ...          48
    bankb        NA             15       ...          20
    ...
    I would like to reshape as panel. I am using:

    Code:
    reshape long return_, i(name) j(exceldatetime)

    This works, but Stata only recognizes the first 250 "variables" (which are really dates). How can I have it reshape all 42369-36526 variables into dates?

    Thanks in advance!

    Best,

  • #2
    Jannic,

    What version of Stata are you using? And exactly what error message are you getting? I'm not aware of an inherent limitation in reshape like this.

    Regards,
    Joe

    Comment


    • #3
      We can better help you if we know not only the command you tried but also what Stata told you that indicated that there was a problem. Can you post the output from the reshape command? Or better still, the results from the following three commands.
      Code:
      describe, short
      reshape long return_, i(name) j(exceldatetime)
      describe, short

      Comment


      • #4
        One more question, while we're at it: in what sense are the variables return_36525-return_42369 "dates"? In Stata, these numbers represent dates 1/1/2060 to 1/1/2076, which i doubt is what you want.

        Incidentally, I generated a data set similar to what you described and the reshape ran just fine. I am using Stata 14/MP.

        Comment


        • #5
          Hey!
          There is no error message that is the thing! But it simply uses the first 250 "variables" and then just stops.

          The dates are in excel datetime, but can be converted later on rather easily I suppose.

          Wait let me run the code again, Ill copy the output in second.

          I am using Stata 12 on a Mac.

          Comment


          • #6
            so here the output and I attached a screenshot of the data browser. You can see how it cuts off after 36881 which is consistent with the output in the console, but it is unfortunately not what I want.


            Code:
            . reshape long return_, i(isin) j(exceldatetime)
            (note: j = 36525 36528 36529 36530 36531 36532 36535 36536 36537 36538 36539 36542
            >  36543 36544 36545 36546 36549 36550 36551 36552 36553 36556 36557 36558 36559 3
            > 6560 36563 36564 36565 36566 36567 36570 36571 36572 36573 36574 36577 36578 365
            > 79 36580 36581 36584 36585 36586 36587 36588 36591 36592 36593 36594 36595 36598
            >  36599 36600 36601 36602 36605 36606 36607 36608 36609 36612 36613 36614 36615 3
            > 6616 36619 36620 36621 36622 36623 36626 36627 36628 36629 36630 36633 36634 366
            > 35 36636 36637 36640 36641 36642 36643 36644 36647 36648 36649 36650 36651 36654
            >  36655 36656 36657 36658 36661 36662 36663 36664 36665 36668 36669 36670 36671 3
            > 6672 36675 36676 36677 36678 36679 36682 36683 36684 36685 36686 36689 36690 366
            > 91 36692 36693 36696 36697 36698 36699 36700 36703 36704 36705 36706 36707 36710
            >  36711 36712 36713 36714 36717 36718 36719 36720 36721 36724 36725 36726 36727 3
            > 6728 36731 36732 36733 36734 36735 36738 36739 36740 36741 36742 36745 36746 367
            > 47 36748 36749 36752 36753 36754 36755 36756 36759 36760 36761 36762 36763 36766
            >  36767 36768 36769 36770 36773 36774 36775 36776 36777 36780 36781 36782 36783 3
            > 6784 36787 36788 36789 36790 36791 36794 36795 36796 36797 36798 36801 36802 368
            > 03 36804 36805 36808 36809 36810 36811 36812 36815 36816 36817 36818 36819 36822
            >  36823 36824 36825 36826 36829 36830 36831 36832 36833 36836 36837 36838 36839 3
            > 6840 36843 36844 36845 36846 36847 36850 36851 36852 36853 36854 36857 36858 368
            > 59 36860 36861 36864 36865 36866 36867 36868 36871 36872 36873 36874 36875 36878
            >  36879 36880 36881)
            
            Data                               wide   ->   long
            -----------------------------------------------------------------------------
            Number of obs.                       44   ->   11220
            Number of variables                 256   ->       3
            j variable (255 values)                   ->   exceldatetime
            xij variables:
            return_36525 return_36528 ... return_36881->   return_
            -----------------------------------------------------------------------------
            
            
            
            Click image for larger version

Name:	Screen Shot 2016-06-09 at 4.37.00 PM.png
Views:	1
Size:	163.0 KB
ID:	1344643

            Comment


            • #7
              wait my bad! The problem is that Stata only imports the first 256 columns of my excel file! I don't know why, but it works now that I imported as cvs! Thanks guys!

              Comment


              • #8
                I again recommend running describe, short before the reshape as I suggested at post #3 above, and sharing those results with us. It is not without possibility that there are only 256 variables in your dataset and your problem is something that happened before the reshape.
                Last edited by William Lisowski; 09 Jun 2016, 09:12. Reason: This crossed with the previous post.

                Comment


                • #9
                  Well, I tried it and came up with a somewhat different problem:

                  Code:
                  . set more off
                  
                  . clear*
                  
                  . 
                  . set maxvar 7000
                  
                  
                  . set obs 25
                  number of observations (_N) was 0, now 25
                  
                  . gen name = ""
                  (25 missing values generated)
                  
                  . forvalues i = 1/25 {
                    2.         local letter: word `i' of `c(alpha)'
                    3.         replace name = "bank`letter'" in `i'
                    4. }
                  variable name was str1 now str5
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  (1 real change made)
                  
                  .         
                  . set seed 1234
                  
                  . forvalues i = 36525/42369 {
                    2.         gen return_`i' = runiform()
                    3. }
                  
                  . 
                  . reshape long return_, i(name) j(exceldatetime)
                  (note: j = 36525 36526 36527 36528 36529 36530 36531 36532 36533 36534 36535 36536 36537 36538 36539 36540 36541 36542 36543 36544 36545 36546 3654
                  > 7 36548 36549 36550 36551 36552 36553 36554 36555 36556 36557 36558 36559 36560 36561 36562 36563 36564 36565 36566 36567 36568 36569 36570 36571
                  >  36572 36573 36574 36575 36576 36577 36578 36579 36580 36581 36582 36583 36584 36585 36586 36587 36588 36589 36590 36591 36592 36593 36594 36595 
                  > 36596 36597 36598 36599 36600 36601 36602 36603 36604 36605 36606 36607 36608 36609 36610 36611 36612 36613 36614 36615 36616 36617 36618 36619 3
                  > 6620 36621 36622 36623 36624 36625 36626 36627 36628 36629 36630 36631 36632 36633 36634 36635 36636 36637 36638 36639 36640 36641 36642 36643 36
                  > 644 36645 36646 36647 36648 36649 36650 36651 36652 36653 36654 36655 36656 36657 36658 36659 36660 36661 36662 36663 36664 36665 36666 36667 366
                  > 68 36669 36670 36671 36672 36673 36674 36675 36676 36677 36678 36679 36680 36681 36682 36683 36684 36685 36686 36687 36688 36689 36690 36691 3669
                  ... [output shortened to fit within limits of a Forum post]
                  > 2275 42276 42277 42278 42279 42280 42281 42282 42283 42284 42285 42286 42287 42288 42289 42290 42291 42292 42293 42294 42295 42296 42297 42298 42
                  > 299 42300 42301 42302 42303 42304 42305 42306 42307 42308 42309 42310 42311 42312 42313 42314 42315 42316 42317 42318 42319 42320 42321 42322 423
                  > 23 42324 42325 42326 42327 42328 42329 42330 42331 42332 42333 42334 42335 42336 42337 42338 42339 42340 42341 42342 42343 42344 42345 42346 4234
                  > 7 42348 42349 42350 42351 42352 42353 42354 42355 42356 42357 42358 42359 42360 42361 42362 42363 42364 42365 42366 42367 42368 42369)
                  
                  Data                               wide   ->   long
                  -----------------------------------------------------------------------------
                  Number of obs.                       25   ->  146125
                  Number of variables                5846   ->       3
                  j variable (5845 values)                  ->   exceldatetime
                  xij variables:
                  return_36525 return_36526 ... return_42369->   return_
                  characteristic contents too long
                      The maximum value of the contents is 67,784.
                  r(1004);
                  
                  end of do-file
                  
                  r(1004);
                  However, despite the error message, the -reshape- actually ran more or less successfully [data not shown]. By "more or less" I mean that the data are now correctly arrayed in long layout, but some of the _dta characteristics that -reshape- normally sets so it can "remember" what it last did, are not defined.

                  I'm running Stata 14.1 MP, 64 bit on a Windows 7 machine.

                  Comment


                  • #10
                    Clyde Schechter & @William Lesovski: The problem was that the first 256 variables we encoded as strings (because some cells were "NA" which ideally Stata should have translated into "."). I converted the input file in a way that "NA" became ".". from there on it worked just fine.

                    I guess that is the sort of problem where the described command suggested by William would have help me catch the error quicker!

                    Thanks again!

                    Comment

                    Working...
                    X