  • reshape error

    Hello,

    To preface this, I am relatively new to Stata, but I have been reading the documentation. I have a data set of 122 variables and 7,139 observations, and I am attempting to reshape my panel data using
    Code:
    reshape wide * , i(Year) j(UID)
    Running this returns
    Code:
    variable UID contains all missing values
    Many forum replies about this error suggest adding the string option, but that returns
    Code:
    reshape wide * , i(Year) j(UID) string
    variable UID is numeric
    r(109);
    One post that seemed relatively similar to mine stated that a fix would be to rename the variables somehow...? (I don't really understand any of this.)
    https://www.statalist.org/forums/for...-to-long-error

    Either way, I would appreciate any help. Thanks!



    For reference, here is a dataex listing of a sample of my data, including my i(Year) and j(UID) variables and 2 data variables.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int Year long(UID UGDS) byte HasRelaffil_Flag
    2012 168528 1766 1
    2014 168528 1570 1
    2016 168528 1621 1
    2008 168528 1469 1
    2010 168528 1652 1
    2018 100724 3903 0
    2004 138716 3174 0
    2006 138716 3474 0
    2008 138716 3711 0
    2010 138716 4154 0
    2012 138716 3755 0
    2014 138716 3261 0
    2016 138716 2576 0
    2018 168528 1781 1
    2004 100654 5071 0
    2006 100654 4958 0
    2008 100654 4290 0
    2010 100654 4930 0
    2012 100654 4081 0
    2014 100654 4206 0
    2016 100654 4616 0
    2018 100654 4990 0
    2004 100724 4624 0
    2006 100724 4584 0
    2008 100724 4730 0
    2010 100724 4882 0
    2012 100724 5123 0
    2014 100724 4811 0
    2016 100724 4704 0
    2018 138716 5545 0
    end

  • #2
    I'm not entirely sure why, but if you put in the actual variable names instead of the asterisk, it appears to work fine (though I have no idea why you want to reshape wide, which is usually a bad idea in Stata):
    Code:
    reshape wide UGDS HasRelaffil_Flag , i(Year) j(UID)
    Actually, I do have a guess: the asterisk matches every variable, including UID, so Stata is getting confused and thinks it is also supposed to reshape the UID variable, which, of course, makes no sense.
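
    With 122 variables, typing every name is impractical. One possible way to get the same effect as the command above is to build the list with ds and its not option. This is only a sketch: it assumes Year and UID are the only identifier variables and that each Year/UID pair occurs once, as in the posted sample. The macro name stubs is just an illustrative choice.

    ```stata
    * Assumption: Year and UID are the only identifiers; everything else is data.
    * List every variable except Year and UID, and store the list in a local macro.
    ds Year UID, not
    local stubs `r(varlist)'

    * Reshape only those variables, so UID is used for j() and not as a stub.
    reshape wide `stubs', i(Year) j(UID)
    ```

    Keep in mind that this creates one new variable per original variable per distinct UID value, so the wide data set can become very large.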


    • #3
      Originally posted by Rich Goldstein
      I'm not entirely sure why, but if you put in the actual variable names instead of the asterisk, it appears to work fine (though I have no idea why you want to reshape wide, which is usually a bad idea in Stata):
      Code:
      reshape wide UGDS HasRelaffil_Flag , i(Year) j(UID)
      Actually, I do have a guess: Stata is getting confused and thinks it is also supposed to reshape the UID variable, which, of course, makes no sense.

      Ahhh, that makes a lot of sense. Thank you so much!

      Also, I hear a few people mentioning that reshape wide is a bad idea in Stata, and I suspect this is because the variables then need to be named explicitly, which is a pain. That being said, I'm not fluent in Stata notation and am probably going to end up using extremely inefficient for loops to reference variable names by an index. Advice on that, aside from by / bysort, would be appreciated as well. Thanks!


      • #4
        I'm not sure I understand, but the reason wide is generally not a good idea in Stata is that most estimation commands want the data to be in long format. In addition, many data management tasks are much easier when the data are in long form.
        Last edited by Rich Goldstein; 03 Aug 2020, 14:45.
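
        As a rough illustration of estimation commands working directly on the long layout, the sketch below uses the variables from the posted sample. The delta(2) setting is an assumption based on the sample, where years are two apart, and the model itself is purely illustrative, not a recommended specification.

        ```stata
        * Declare the panel: UID identifies the units, Year is the time variable.
        * Assumption: observations are spaced two years apart, as in the sample.
        xtset UID Year, delta(2)

        * Summarize the participation pattern of the panel.
        xtdescribe

        * Illustrative only: fixed-effects regression of UGDS on year indicators.
        xtreg UGDS i.Year, fe
        ```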


        • #5
          The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data, as Rich (certainly an experienced user!) has pointed out.

          You should try to achieve what you need with the data organized as it currently is, and seek the help of Statalist in doing so. The sort of problems you will encounter trying to use your reshaped data will almost certainly be solved by reshaping the data back to a long layout. It is much easier, for example, to compare the second observation to the first, the third to the second, and so on, than it is to compare the second variable to the first, the third to the second, etc.

          And if you do use a wide layout, and pose a problem to Statalist, don't be surprised if the answer offered involves reshaping your data into a long layout as the first step.
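
          To make the observation-to-observation comparison concrete, here is a minimal sketch in the long layout, using the variable names from the posted sample. The new variable names (UGDS_diff and the _prev suffix) are hypothetical.

          ```stata
          * Within each UID, sorted by Year, subtract the previous observation's value.
          * The first observation of each UID has no predecessor, so it gets missing.
          bysort UID (Year): generate UGDS_diff = UGDS - UGDS[_n-1]

          * The same pattern for several variables at once: loop over the names
          * instead of referring to variables by a numeric index.
          foreach v of varlist UGDS HasRelaffil_Flag {
              bysort UID (Year): generate `v'_prev = `v'[_n-1]
          }
          ```

          In a wide layout, the same comparison would need one statement per pair of UGDS variables.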
