Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape WHOLE dataset from long to wide

    Hello,
    I have a lot of variables to transpose from long to wide. Is there a way to transpose the whole dataset instead of picking specific variables? I have my ID and my repeated event time, but the variables I need reshaping are in the hundreds.

    reshape wide X, id(id) j(event_time)

    Thanks!

  • #2
    There is almost certainly some reasonably straightforward way to do it. But the details will depend on how your variables are named. Please use the -dataex- command and post an example of your data. You cannot give an example with hundreds of variables, but please show a representative example of perhaps half a dozen short series of variables indexed by the event time.

    If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.


    Comment


    • #3
      I'm inferring from your example that the many variables you want to reshape long come in series of 3, each member of the series ending in var, early, or late.

      You also need to have an id variable that identifies each observation uniquely. Your first post suggested that you have one, but your example does not. The code below creates one--skip that line if you already have it.

      Code:
      gen long id = _n
      
      ds *early
      local stubs `r(varlist)'
      local stubs: subinstr local stubs "early" "", all
      reshape long `stubs', i(id) j(event_time) string

      Comment


      • #4
        Did you try to run the code in post #4?

        Reshape can do exactly what you ask for, but the precise code depends on your data.
        The suggestion by Clyde works if your variables come in groups of, for example:
        Code:
        first60_decelvar first60_decelearly first60_decellate first61_decelvar first61_decelearly first61_decellate
        And then it does not matter how many groups there are.

        If there is a different logic to your variable names, you need to tell us more.
        Provide a dataex example with at least 2 'groups' of variables, or something that will make it clear if there are any patterns in your variable names.

        Comment


        • #5
          Still some questions remain.
          With your new data example, I'm guessing there is not much of a clear pattern in variable names.
          In your fist post, you say you have id and a repeated event time.

          So I am guessing you have two variables, id and event_time, plus some other variables.
          In that case, you probably do not want:
          Code:
          reshape long variable_stubs, i(id) j(event_time) string
          But rather:
          Code:
          reshape long variable_stubs, i(id event_time) j(some_varname)string
          With below example:
          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float(id event_time) byte(e2_cat e3_decel) float first60_totarea byte first60_recurrentprolong
          1 2001  2  1      . 0
          1 2002  2  1      . 0
          1 2003  2  1      . 0
          1 2004  2  1      . 0
          1 2005  2  1      . 0
          2 2001  2  1      . 0
          2 2002  2  1      . 0
          2 2003  2  1      . 0
          2 2004  2  1      . 0
          2 2005  2  1      . 0
          3 2001  2  1      . 0
          3 2002  2  1      . 0
          3 2003  2  1      . 0
          3 2004  2  1      . 0
          3 2005  2  1      . 0
          4 2001 99 99 6258.5 0
          4 2002 99 99 6258.5 0
          4 2003 99 99 6258.5 0
          4 2004 99 99 6258.5 0
          4 2005 99 99 6258.5 0
          end
          
          * Approximate code to reshape
          ren * X*
          ren Xid id
          ren Xevent_time event_time
          reshape long X, i(id event_time) j(var_name) string
          ren X value
          And again, the more details you give us about you data, the easier it is to answer you question. If I did not understood correctly, please explain better.

          Comment


          • #6
            Ah, you want to reshape wide, not long.
            Then I have to say I doubt that is going to be a useful exercise. You say you currently have circa hundreds of variables. Lets say it's 100 exactly. If you then have 24 values of event_time, you will end up with 2400 variables.
            That's probably not what you want?

            Comment


            • #7
              Starting from the data example as in post #8:
              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float(id event_time) byte(e2_cat e3_decel) float first60_totarea byte first60_recurrentprolong
              1 2001  2  1      . 0
              1 2002  2  1      . 0
              1 2003  2  1      . 0
              1 2004  2  1      . 0
              1 2005  2  1      . 0
              2 2001  2  1      . 0
              2 2002  2  1      . 0
              2 2003  2  1      . 0
              2 2004  2  1      . 0
              2 2005  2  1      . 0
              3 2001  2  1      . 0
              3 2002  2  1      . 0
              3 2003  2  1      . 0
              3 2004  2  1      . 0
              3 2005  2  1      . 0
              4 2001 99 99 6258.5 0
              4 2002 99 99 6258.5 0
              4 2003 99 99 6258.5 0
              4 2004 99 99 6258.5 0
              4 2005 99 99 6258.5 0
              end
              
              tempfile merged
              
              preserve
              keep id event_time e2_cat
              reshape wide e2_cat, i(id) j(event_time)
              save `merged'
              restore
              
              
              foreach item of varlist e3_decel-first60_recurrentprolong{
              preserve
              keep id event_time `item'
              reshape wide `item', i(id) j(event_time)
              merge 1:1 id using `merged', nogen
              save `merged', replace
              restore
              }
              use `merged', clear
              Run that example, with this example dataset, and see if this is really what you want.

              Comment


              • #8
                You can't make it the prefix because all Stata variable names must begin with a letter or underscore (_) character; initial digits are not allowed.

                If you would like to have the number appear before the name, but perhaps preceded and followed by an underscore the -reshape- command would look something like this:

                Code:
                reshape wide _@_`item', i(id) j(event_time)
                As an aside, I will just remind you of Jorrit Gossens' good advice in #10 that doing this -reshape- to wide may not be a good idea. Most Stata commands work best, and some work only at all, with long data. You may well find this wide layout to be a hindrance in your subsequent work.

                Comment


                • #9
                  Please dont delete your messages from future questions.
                  This forum is not only a way to get help on your specific question, its also a reference for others with similar questions.

                  Comment

                  Working...
                  X