  • Collapse by two variables

    Hi all. Say I have a data set of births from 2010-2016. Each observation is a birth, described by many variables such as year of birth, month of birth, illness, birthweight, and so on. I was able to collapse the data set by year by generating a count variable that adds up the number of births within a year and then running collapse with by(year). However, now I want the data to have 3 variables: year of birth, month of birth, and total births. So the first column would have 12 entries of 2010, then 12 entries of 2011, and so on. The second column would be Jan, Feb, .... The third column would be the number of births in the month and year of the same row. How would you do that? Thank you.

  • #2
    So you want to do something like
    Code:
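    * The observation number is never missing, so counting it counts births
    gen long obs_no = _n
    * (count) tallies the nonmissing values of obs_no within each year-month group
    collapse (count) n_births = obs_no, by(birth_year birth_month)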
    Obviously you need to tailor that to the actual variable names in your data set. Had you posted example data, custom, tested code could have been provided. Since you left that to the reader's imagination, you get back code tailored to imaginary data.

    That said, I also suggest you change your data organization. For most purposes, you will be better off having a single variable that combines the year and month of birth into a single Stata internal format monthly date variable. How you do that will depend on whether your year and month variables are numeric or string, so, again, without example data, specific advice cannot be given. But if you had a Stata internal format monthly variable for the year and month of birth, then you would just -collapse- with that one variable in the -by()- option.
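
    A minimal sketch of that suggestion, assuming (hypothetically) that birth_year and birth_month are both numeric, with the month coded 1 to 12. The variable names are the same imagined ones used above, and birth_ym is a made-up name for the new variable:

    ```stata
    * Assumption: birth_year and birth_month are numeric, month coded 1-12
    * Build a Stata internal format monthly date and display it as e.g. 2010m1
    gen birth_ym = ym(birth_year, birth_month)
    format birth_ym %tm

    * Count births within each calendar month
    gen long obs_no = _n
    collapse (count) n_births = obs_no, by(birth_ym)
    ```

    If the year or month is stored as a string, ym() will not accept it directly; the variables would first need converting, for example with the monthly() function, and the details depend on exactly how the strings look.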

    In the future, show data examples when you want help with code, and use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
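
    For instance, a call along these lines would produce a pasteable example of the first 20 observations. The variable names are hypothetical; substitute the ones relevant to your question:

    ```stata
    * Variable names are assumptions, not taken from any real data set
    dataex birth_year birth_month birthweight in 1/20
    ```

    Copy the output from the Results window and paste it into your post as is.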

    When asking for help with code, always show example data. When showing example data, always use -dataex-.




    • #3
      Hi all,

      I am working on a survey and I need to produce my estimates by two criteria (sex and socio-professional category). Could you please help with the command?
      Thanks a lot,
      N.K

