  • Problem with String Variable in Collapse Command

    Dear all,

    I am using Stata 16 on a Mac. I must collapse my data to the region-by-year level. The problem is that the variable region is a string variable; it contains 14 regions, such as Western Europe, North America, and Australia and New Zealand. The year variable takes three values: 2015, 2016, and 2017.

    I tried using collapse region, by(year), which resulted in the error message
    type mismatch
    r(109)

    I would greatly appreciate it if someone could help me fix my command.

    Thank you in advance for your help

    Jason Browen

  • #2
    Welcome to the Stata Forum / Statalist.

    Should you share data, it will be easier to provide a better reply.

    That said, you may -encode region- before collapsing the data set.
    Best regards,

    Marcos



    • #3
      The command is not just illegal; it's unlikely to be what you want at all. For once, I disagree with Marcos Almeida. An encode of region with nothing else said just maps regions in alphabetical order to integers 1 up. You can take a mean of that but to no useful end.
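
      To illustrate the point about encode, here is a minimal sketch on toy data (the values and the name region_id are made up for illustration, not taken from the poster's dataset). It shows that encode only assigns integer codes in alphabetical order, so a mean of those codes can be computed but tells you nothing about the regions.

      ```stata
      * Toy data for illustration only; not the poster's dataset
      clear
      input str20 region int year
      "Western Europe" 2015
      "North America"  2015
      "Western Europe" 2016
      end

      * encode maps the distinct strings, in alphabetical order, to 1, 2, ...
      * region_id is a hypothetical name for the new numeric variable
      encode region, generate(region_id)
      label list region_id

      * This runs without error, but the mean of arbitrary integer codes
      * has no substantive interpretation
      collapse (mean) region_id, by(year)
      list
      ```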

      It sounds as if your command should start

      Code:
      collapse
      and end

      Code:
      , by(region year)
      but you need to specify in between which variables you want to collapse.



      • #4
        That's true, Nick. Now I can figure out the mistake. Thank you for the remark.

        Maybe the issue in #1 relates to something like:

        Code:
        . sysuse auto
        (1978 Automobile Data)
        
        . encode make, gen(Make)
        
        . collapse (count) make, by(foreign)
        type mismatch
        r(109);
        
        . collapse (count) Make, by(foreign)
        
        . list
        
             +-----------------+
             |  foreign   Make |
             |-----------------|
          1. | Domestic     52 |
          2. |  Foreign     22 |
             +-----------------+
        Best regards,

        Marcos



        • #5
          In my case, though, there are 14 different categories of regions under the variable region and three different years under the variable year. So I don't think I could list, e.g., Western Europe or North America in between as the variable, since the main variable would be region?

          Last edited by Jason Browen; 18 Oct 2019, 12:48.



          • #6
            In my case, though, there are 14 different categories of regions under the variable region and three different years under the variable year. So I don't think I could list, e.g., Western Europe or North America in between as the variable, since the main variable would be region? Here is a snapshot of my data:


            [Image: Screen Shot 2019-10-18 at 10.12.31 AM.jpg]
            [Image: Screen Shot 2019-10-18 at 11.31.40 AM.png]



            • #7
              Nick's answer in post #3 is exactly what you want. You don't tell us anything about the data you have beyond the two variables region and year that you want to collapse by. So I'm going to pretend you have two more variables a and b, and in collapsing your data you want the totals of a and b for each combination of region and year.
              Code:
              . * Example generated by -dataex-. To install: ssc install dataex
              . clear
              
              . input str6 region float(year a b)
              
                      region       year          a          b
                1. "Mordor" 2015  5   3
                2. "Mordor" 2015  2   0
                3. "Mordor" 2016  1   1
                4. "Shire"  2015  5   6
                5. "Shire"  2016 20 160
                6. "Shire"  2016 22 506
                7. end
              
              . collapse (sum) a b, by(region year)
              
              . list, noobs
              
                +--------------------------+
                | region   year    a     b |
                |--------------------------|
                | Mordor   2015    7     3 |
                | Mordor   2016    1     1 |
                |  Shire   2015    5     6 |
                |  Shire   2016   42   666 |
                +--------------------------+
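
              On the worry about having 14 regions: the region names never go between collapse and the comma. Only the variables to be summarized go there, and by() forms one group for every region-year combination present in the data. The sketch below is a variation on the example above, again with made-up values and placeholder variables a and b; the result names mean_a, sum_b, and n_obs are hypothetical. It also shows that different statistics can be requested in a single collapse.

              ```stata
              * Toy data for illustration only; a and b are placeholder variables
              clear
              input str20 region int year float(a b)
              "Western Europe" 2015  5   3
              "Western Europe" 2015  2   0
              "North America"  2015  1   1
              "North America"  2016 20 160
              end

              * One output row per region-year combination found in the data;
              * the region values themselves are never typed into the command
              collapse (mean) mean_a=a (sum) sum_b=b (count) n_obs=a, by(region year)
              list, noobs
              ```

              Which statistic is appropriate (mean, sum, count, and so on) depends on what each variable measures, so the choices here are only placeholders.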



              • #8
                Thank you so much for all of your help!
