  • Reshape long to wide: new variable for each observation

    I have a municipality-specific dataset that looks like this:
    label2 | abs | pc | id
    Totale generale delle Entrate | 23936646 | 37874.44 | 1
    Entrate correnti di natura tributaria, contributiva e perequativa | 290295.81 | 459.32883 | 2
    Imposte,tasse e proventi assimilati | 290295.81 | 459.3288 | 3
    Compartecipazioni di tributi | 0 | 0 | 4
    I created the id variable to help with the reshaping process. We are trying to reshape this dataset into something like this:
    Totale generale delle Entrate_abs | Totale generale delle Entrate_pc | Entrate correnti di natura tributaria, contributiva e perequativa_abs | Entrate correnti di natura tributaria, contributiva e perequativa_pc | Imposte,tasse e proventi assimilati_abs | Imposte,tasse e proventi assimilati_pc | Compartecipazioni di tributi_abs | Compartecipazioni di tributi_pc
    23936646 | 37874.44 | 290295.81 | 459.32883 | 290295.81 | 459.3288 | 0 | 0
    Code:
            reshape wide pc abs, i(id) j(id)
    The code does not work.

    Is there a way to tell Stata that we want a new variable for each observation, and that it should shorten the variable names and remove the white space between words wherever required? Thank you so much!

    We have 7000+ similar CSV files titled "municipalityname-data.csv," so doing it manually is not possible. Ideally, we would reshape all these files, generate a variable for the municipality name in each file, and then append all the files. Not sure if it's the most efficient way to do this.
    Last edited by Pepa Malik; 05 Dec 2022, 13:27. Reason: reshape

  • #2
    The layout you want as your final result is not possible in Stata, because the variable names you are trying to create violate several of Stata's naming rules. For example, "Entrate correnti di natura tributaria, contributiva e perequativa_abs" is not allowed because it contains spaces and a comma, and it is also too long. A Stata variable name may contain only letters, digits, and underscores (_), may not begin with a digit, and may be at most 32 characters long. The closest you can come to what you are asking for is something like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str66 label2 double abs float pc byte id
    "Totale generale delle Entrate "                                      23936646 37874.44 1
    "Entrate correnti di natura tributaria, contributiva e perequativa " 290295.81 459.3288 2
    "Imposte,tasse e proventi assimilati "                               290295.81 459.3288 3
    "Compartecipazioni di tributi "                                              0        0 4
    end
    
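    * Make each label a legal name stub, short enough that "_abs" still fits in 32 characters
    replace label2 = substr(strtoname(label2), 1, 28)
    * Prefix the measures with "_" so the wide names read <label>_abs and <label>_pc
    rename (abs pc) _=
    reshape wide @_abs @_pc, i(id) j(label2) string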
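    A note on the result: because id takes a different value in every observation, the reshape above keeps one row per id, and each row is filled only in its own pair of columns. If a single row per municipality is wanted, as in the layout sketched in #1, one option is to reshape on a constant identifier instead. The following is a minimal sketch under that assumption, using the same example data; the variable name muni is hypothetical and not part of the original data.

    ```stata
    clear
    input str66 label2 double abs float pc byte id
    "Totale generale delle Entrate "                                      23936646 37874.44 1
    "Entrate correnti di natura tributaria, contributiva e perequativa " 290295.81 459.3288 2
    "Imposte,tasse e proventi assimilati "                               290295.81 459.3288 3
    "Compartecipazioni di tributi "                                              0        0 4
    end

    * Legal name stubs, truncated so that "_abs" still fits within 32 characters
    replace label2 = substr(strtoname(label2), 1, 28)
    * Truncation could make two labels identical; stop here if that happens
    isid label2

    rename (abs pc) _=
    * muni is a hypothetical constant identifier: one value, hence one wide row
    generate byte muni = 1
    * id varies within muni, so it has to be dropped before reshaping
    drop id
    reshape wide @_abs @_pc, i(muni) j(label2) string
    ```

    The isid line is only a guard: with labels this long, two of them could share their first 28 characters, and reshape would then refuse to run.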
    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
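    For the 7000+ files mentioned in #1, the same steps could be wrapped in a loop that reshapes each file, records the municipality name, and appends the result. This is only a sketch under several assumptions that the thread does not confirm: all files sit in the current working directory, each is named <municipality>-data.csv, and each imports with a header row giving the columns label2, abs and pc. The variable name municipality and the str80 width are hypothetical choices.

    ```stata
    clear
    tempfile combined
    * Start from an empty dataset so the first append has something to append to
    save `combined', emptyok

    * List every matching CSV in the current directory (assumed location)
    local files : dir "." files "*-data.csv"

    foreach f of local files {
        * Read the first column (label2) as a string; abs and pc are assumed numeric
        import delimited using "`f'", varnames(1) stringcols(1) clear
        keep label2 abs pc
        replace label2 = substr(strtoname(label2), 1, 28)
        rename (abs pc) _=
        * Municipality name = file name without the "-data.csv" suffix
        generate str80 municipality = subinstr("`f'", "-data.csv", "", 1)
        * municipality is constant within a file, so each file becomes one row
        reshape wide @_abs @_pc, i(municipality) j(label2) string
        append using `combined'
        save `combined', replace
    }

    use `combined', clear
    ```

    Saving after every append is simple but not the fastest approach for thousands of files. Labels that differ between files will produce different columns, which append fills with missing values.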



    • #3
      Originally posted by Clyde Schechter
      Thank you so much! And noted!
