Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to dissect multiple variables in one column to separate columns

    Hello good Sir/Madam,

    I am currently working with a list of corporations and their CashFlowTypes, and I wish to see the effect of a certain type of cash flows on a dependent variable. For now, each CashFlowType makes up a new row per corporation, with an adjacent column stating the Amount of that CashFlowType, so say there are 6 types of cash flow, I have 6 rows per corporation. What I would want is 1 row per corporation with 6 different columns, each describing their respective cash flow variable.

    Is this possible? If so, with what code could this be accomplished?

    I included a picture so you may better understand what I am trying to bring across with my limited programming knowledge. The string variable is in Dutch if you were wondering.

    My sincerest thanks in advance!

    Jan Paul
    Click image for larger version

Name:	Stata help.PNG
Views:	1
Size:	37.3 KB
ID:	1508048

  • #2
    You need -reshape wide-, but you will need a variable to hold a corporation identifier (otherwise, how do we know to which corporation each row belongs?). I assume it's called IdCorp.

    You must first encode the CashFlowType variable, as its values are going to be part of the names of the new columns created.

    Here is a possibility:

    Code:
    gen type=1 if CashFlowType=="Sectorspecifieke heffing onafhankelijk van resultaat"
    replace type=2 if CashFlowType=="Huren Maatschappelijk onroerend goed"
    replace type=3 if CashFlowType=="Erfpacht kasstroom"
    replace type=4 if CashFlowType=="Vennootschapsbelasting"
    ...
    
    reshape wide Amount, i(IdCorp) j(type)
    This will create variable Amount1, Amount2... You may then change variable labels, and maybe also variable names.

    Of course you may use other values for the 'type' variable, and you may use strings as well. I you use string, keep them short, without spaces, and add the option 'string' to reshape wide:

    Code:
    reshape wide Amount, i(IdCorp) j(type) string
    Last edited by Jean-Claude Arbaut; 17 Jul 2019, 05:13.

    Comment


    • #3
      Thank you sir Arbaut. It did not completely work but I think I'm nearly there. I get the error message stating that there are multiples of the same type per corporation id (called AW_Instellingsnummer in my Dutch data). I believe this is because I have two years of information of the same corporation, such that each CashFlowType (called KasstroomPost in my Dutch data) may exists twice for each corporation id.

      How should I handle this?

      pic of error:
      Click image for larger version

Name:	Stata help 2.PNG
Views:	1
Size:	19.2 KB
ID:	1508066


      and a pic of the data:
      Click image for larger version

Name:	Stata help 3.PNG
Views:	1
Size:	62.8 KB
ID:	1508067

      Comment


      • #4
        Then use both the corporation identifier and the year as index variables:

        Code:
        reshape wide Amount, i(IdCorp Year) j(type)

        Comment


        • #5
          Originally posted by Jean-Claude Arbaut View Post
          Then use both the corporation identifier and the year as index variables:

          Code:
          reshape wide Amount, i(IdCorp Year) j(type)
          Thanks a bunch! You're a legend.

          Comment

          Working...
          X