  • Reshaping Long Data Issue

    I am trying to collapse this data, but I keep getting errors saying that i and j are not unique. Here is a snippet of the dataset:

    country variable total
    AUS CAP 20,525
    AUS CAP 1,206
    AUS CAP 1,475
    AUS CAP 79,601
    AUS CAP 9,928
    AUS CAP 553
    AUS CAP 1,180
    AUS CAP 1,080

    ...

    AUS EMPE 57
    AUS EMPE 58
    AUS EMPE 31
    AUS EMPE 17
    AUS EMPE 61
    AUS EMPE 50


    I am trying to get it to read like this with these variables:

    Country CAP EMPE

    With the totals for each. If anyone can help, I'd appreciate it. I don't know where to start!

  • #2
    Alex, more details are helpful. If you want the data to have variables "Country", "CAP", and "EMPE", what would the observations look like? I assume, for example, the first observation is "AUS", the average of "total" within AUS and CAP, and the average of "total" within AUS and EMPE. Then the following code should work.

    Code:
    collapse total, by(country variable)
    
    reshape wide total, i(country) j(variable) string
    
    ren (totalCAP totalEMPE) (CAP EMPE)
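
    A note on why the -collapse- comes first: -reshape wide- needs each combination of country and variable to appear only once, and the raw data have many rows per combination. Also, -collapse- computes means by default; if sums are wanted instead, (sum) can be specified. The following is only a sketch on toy data. The four values are taken from the snippet in #1 and entered as plain numbers, which is an assumption about how the real data are stored.

    ```stata
    clear
    * toy data: four rows from the snippet in #1, typed in as numbers (assumption)
    input str3 country str4 variable total
    AUS CAP  20525
    AUS CAP  1206
    AUS EMPE 57
    AUS EMPE 58
    end

    * shows that country-variable pairs repeat, which is what -reshape- objects to
    duplicates report country variable

    * (sum) gives totals; omit it to get the default, which is the mean
    collapse (sum) total, by(country variable)

    * exits with an error unless country and variable now identify rows uniquely
    isid country variable

    reshape wide total, i(country) j(variable) string
    rename (totalCAP totalEMPE) (CAP EMPE)

    * 20525 + 1206 = 21731 and 57 + 58 = 115
    assert CAP == 21731 & EMPE == 115
    list
    ```

    Whether the mean or the sum is the right statistic depends on what "total" measures in the real data, which the thread does not say.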



    • #3
      Originally posted by Fei Wang
      Alex, more details are helpful. If you want the data to have variables "Country", "CAP", and "EMPE", what would the observations look like? I assume, for example, the first observation is "AUS", the average of "total" within AUS and CAP, and the average of "total" within AUS and EMPE. Then the following code should work.

      Code:
      collapse total, by(country variable)
      
      reshape wide total, i(country) j(variable) string
      
      ren (totalCAP totalEMPE) (CAP EMPE)

      Thank you for the response! As far as collapsing goes, there appears to be a type mismatch, is there a workaround for that?



      • #4
        As far as collapsing goes, there appears to be a type mismatch, is there a workaround for that?
        Well, Stata doesn't care about the storage type of the variables in the -by()- option of -collapse-, so the type mismatch must mean that your variable total is a string variable. You can't do any calculations with string variables, even if they look like numbers when viewed by human eyes. So you need to convert it to a real numeric variable:

        Code:
        destring total, replace ignore(",")
        Then you can run the code from #2.
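
        Put together, the whole sequence might look like the sketch below. It uses toy data with four values copied from the snippet in #1, and it assumes that total is stored as a string with commas as thousands separators. That is the inference made above, not something the thread confirms.

        ```stata
        clear
        * toy data: total entered as a string, commas included (assumption)
        input str3 country str4 variable str8 total
        "AUS" "CAP"  "20,525"
        "AUS" "CAP"  "1,206"
        "AUS" "EMPE" "57"
        "AUS" "EMPE" "58"
        end

        * errors out unless total really is a string variable
        confirm string variable total

        * strip the commas and convert to numeric
        destring total, replace ignore(",")
        confirm numeric variable total

        * code from #2: means by country and variable, then long to wide
        collapse total, by(country variable)
        reshape wide total, i(country) j(variable) string
        rename (totalCAP totalEMPE) (CAP EMPE)

        * (20525 + 1206)/2 = 10865.5 and (57 + 58)/2 = 57.5
        assert CAP == 10865.5 & EMPE == 57.5
        list
        ```

        If -destring- leaves total as a string, it contains non-numeric characters other than commas, and those would need to be inspected before going further.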

        This delay in getting a correct answer, which wasted both your time and Fei Wang's, is a direct result of failing to read and follow the advice in the FAQ about how to post example data. In FAQ #12 you will find that the -dataex- command is the appropriate way to do it. Correct code often relies on details of the data set that cannot be shown with tabular displays. If you are running version 17, 16, or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        When asking for help with code, always show example data. When showing example data, always use -dataex-.
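
        For anyone who has not used it, a minimal illustration of -dataex- follows. It uses the auto data set shipped with Stata so that it runs anywhere; the commented-out line shows what a call on this thread's data might look like, with the variable names taken from #1 and the observation range chosen arbitrarily.

        ```stata
        * built-in example data, so the commands run on any installation
        sysuse auto, clear

        * writes -input- code for the first five observations to the Results window
        dataex make price mpg in 1/5

        * for the data in this thread, something along these lines (range is arbitrary):
        * dataex country variable total in 1/20
        ```

        The output is then copied from the Results window and pasted into the post, where it shows storage types such as str versus numeric that a hand-typed table hides.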
