  • Aggregating Data

    Hello-

    I have tried to find resources on aggregating data in Stata (or on whether it is even possible). In sum, I have data on crime incidents: all shooting crimes that occurred on a given day from 2016 to 2025. I am attempting to run time-series analyses on these data but believe the data need to be in aggregated form.

    For example, let's say there are 45 cases that have a January 2016 date. These 45 cases have 3 different crime types (robbery, assault, and homicide). I want to aggregate the data so that these 45 cases become just one case, 2016m1, and each case has a count of how many robberies occurred in January 2016 (and the same for the other two categories).

    The only thing I can think of is to add them all up individually and make a new dataset in Excel to upload. But this is obviously time-consuming. Is there a way to do this in Stata with the original data, or is this my only option?

  • #2
    It sounds like you want something like:
    Code:
    collapse (count) n_cases = case_id, by(mdate crime_type)
    Since you didn't provide example data, you will need to adapt this code to the actual structure of your Stata dataset. In this command you will need to replace case_id, mdate, and crime_type by the actual names of the variables that identify individual cases, show the year-month date, and indicate which of the three types of crime the case is, respectively. It is possible that you will first have to create one or more of those variables, depending on how your data is set up.
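
    As a fuller sketch, assuming a numeric daily date variable called incident_date and a numeric crime_type coded 1, 2, 3 (both names and the coding are hypothetical, not taken from your data), something along these lines would build the monthly date, count the cases, and then leave one observation per month with one count variable per crime type:
    Code:
    * assumed: incident_date is a Stata daily date; crime_type is numeric
    generate mdate = mofd(incident_date)
    format mdate %tm
    collapse (count) n_cases = case_id, by(mdate crime_type)
    reshape wide n_cases, i(mdate) j(crime_type)
    * a crime type with no cases in a month is missing after reshape
    foreach v of varlist n_cases* {
        replace `v' = 0 if missing(`v')
    }
    tsset mdate
    Note that collapse replaces the data in memory, so save the original dataset first. If crime_type is a string variable, reshape needs the string option. Months with no cases of any type would not appear at all and would show up as gaps after tsset.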

    In the future, when asking for help with code, please use the -dataex- command and show example data. Although sometimes, as here, it is possible to give an answer that has a reasonable probability of being correct, this is usually not the case. Moreover, such answers are necessarily based on experience-based guesses or intuitions about the nature of your data. When those guesses are wrong, both you and the person trying to help you have wasted their time as you end up with useless code. To avoid this, a -dataex- based example provides all of the information needed to develop and test a solution. If you are running version 16 or later, or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    More generally, please read the Forum FAQ for excellent guidance on how to get the most from your Statalist experience.



    • #3
      Thank you for your response. I was just looking for general guidance on whether this would be possible. The only resources I could find used examples with continuous data, and I was unsure whether the same approach would work with categorical data.



      • #4
        Code:
        help contract
        points to an even more relevant command, not that using collapse is problematic here.
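
        As a minimal sketch, assuming the monthly date and crime type variables are named mdate and crime_type (hypothetical names, as in #2):
        Code:
        * one observation per month and crime type, with the count in n_cases
        contract mdate crime_type, freq(n_cases) zero
        The zero option also keeps combinations of observed months and crime types that have no cases, with a count of 0. Like collapse, contract replaces the data in memory.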

        The higher-level problem of how to find out has many answers, and one is to go to the Data Management manual and skim the list of commands at the beginning.

        The manual is bundled with your Stata under PDF documentation or can be accessed at https://www.stata.com/manuals/d.pdf
