Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape data from long to wide

    Hi.

    I have little experience with #reshape and am stuck with the following example:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double Deal_No int Deal_Announced_FY str51 Acq_Name long Salary float FY_Report
    2093107020 2010 "Monster Worldwide" 1000 2010
    2093107020 2010 "Monster Worldwide" 1000 2009
    2093107020 2010 "Monster Worldwide" 1000 2008
    2093107020 2010 "Monster Worldwide"  723 2007
    2100857020 2010 "Perrigo"            919 2010
    2100857020 2010 "Perrigo"            875 2009
    2100857020 2010 "Perrigo"            775 2008
    2107585020 2013 "AT&T"              1633 2013
    2107585020 2013 "AT&T"              1550 2012
    2107585020 2013 "AT&T"              1550 2011
    2107585020 2013 "AT&T"              1533 2010
    2144165020 2010 "Noble Energy"       557 2010
    2144165020 2010 "Noble Energy"       564 2009
    2145252020 2010 "Hillenbrand"          .    .
    2145453020 2010 "Berry Petroleum"      .    .
    2145536020 2010 "Boeing"            1930 2010
    2145536020 2010 "Boeing"            1930 2009
    2145536020 2010 "Boeing"            1915 2008
    2145536020 2010 "Boeing"            1800 2007
    end
    format Deal_No %10.0g
    The Variable FY_Report reports the Salary for the CEO of the Company (Acq_Name) up to 3 years before the Year in Deal_Announced_FY.
    I would like to reshape the data in a way so that four new variables are created whereas each variable contains the data of the salary in a respective year (relative to the year in Deal_Announced_FY).
    For the first case, Monster Worldwide, there would be a variable Salaray_0 that is equal to the salary in 2010, so 1000; a variable Salaray_1 that is equal to the salary in 2009, so 1000, a variable Salaray_2 that is equal to the salary in 2008, so 1000 and a variable Salaray_3 that is equal to the salary in 2007, so 723.


    The general syntax to go from long to wide would be:
    Code:
    reshape wide stub, i(i) j(j)
    
    I tried
    reshape long FY_, i(Deal_No) j(Sal)
    I actually thought that I have long data but Stata keeps telling me my data is already wide and at this point I am not even sure if reshape is the right command to achieve what I am looking for.

    Any help is much appreciated
    Kind regards



  • #2
    1. if you are going from long to wide then you have a typo in the command you tried as that is the command for going from wide to long; you could, instead, try "reshape wide ..."

    2. however that will not work because you have missing values in your "j()" variable; if might be possible to finesse this by forming a new j variable bu I don't have a good enough understanding of your data to suggest how to do this

    Comment


    • #3
      Perhaps something like this: (I address the problem in point 2 of #2, i.e. you have missing values in FY_Report, by just dropping those observations; you may have a better way of dealing with this issue)

      Code:
      gen byte FY_gap = Deal_Announced_FY - FY_Report
      drop if missing(FY_Report)
      drop FY_Report
      reshape wide Salary, i(Deal_No Deal_Announced_FY Acq_Name) j(FY_gap)
      which produces:
      Code:
      . format %10.0f Deal_No
      . li , noobs
        +-----------------------------------------------------------------------------------+
        |    Deal_No   Deal_A~Y            Acq_Name   Salary0   Salary1   Salary2   Salary3 |
        |-----------------------------------------------------------------------------------|
        | 2093107020       2010   Monster Worldwide      1000      1000      1000       723 |
        | 2100857020       2010             Perrigo       919       875       775         . |
        | 2107585020       2013                AT&T      1633      1550      1550      1533 |
        | 2144165020       2010        Noble Energy       557       564         .         . |
        | 2145536020       2010              Boeing      1930      1930      1915      1800 |
        +-----------------------------------------------------------------------------------+
      Last edited by Hemanshu Kumar; 14 Dec 2022, 07:59.

      Comment


      • #4
        Thank you.
        This works just fine when applied to the sample data. Unfortunately, in the full dataset I receive the following error:
        values of variable FY_gap not unique within Deal_No Deal_Announced_FY Acq_Name
        Your data are currently long. You are performing a reshape wide. You specified
        i(Deal_No Deal_Announced_FY Acq_Name) and j(FY_gap). There are observations within
        i(Deal_No Deal_Announced_FY Acq_Name) with the same value of j(FY_gap). In the long
        data, variables i() and j() together must uniquely identify the observations.
        I will dive deeper into the data a try to figure out how to solve this.

        Comment

        Working...
        X