  • Unique IDs by month

    Hi everyone,

    I have three variables of interest in my microdata: consumer ID, month, and card type. I would like to calculate some summary statistics. First, I want to find the number of observations per month. This is the code I use for that:
    gen count=1
    collapse (sum) count, by(txn_yrmon)

    Second, I want to find the number of consumers per month, that is, the number of unique consumer IDs in each month. Third, I want to find the number of consumers per month by card type.

    I'm still learning how to approach this. Could you please help me with the last two questions? Thanks in advance!

  • #2
    Code:
    by txn_yrmon (consumer_id), sort: gen n_consumers = sum(consumer_id != consumer_id[_n-1])
    by txn_yrmon (consumer_id): replace n_consumers = n_consumers[_N]
    
    by txn_yrmon card_type (consumer_id), sort: gen n_consumers_card_type = sum(consumer_id != consumer_id[_n-1])
    by txn_yrmon card_type (consumer_id): replace n_consumers_card_type = n_consumers_card_type[_N]
    Note: because no example data was given, this code is untested. In the future, when asking for help with code, please provide example data, and use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
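
    To see what the running-sum approach above does, here is a minimal sketch on a small made-up dataset. The values are invented for illustration; only the variable names txn_yrmon, consumer_id and card_type come from the thread, and storing card_type as a string is an assumption. The sketch also assumes consumer_id has no missing values.

    ```stata
    * Toy data, invented for illustration only
    clear
    input txn_yrmon consumer_id str6 card_type
    202101 1 "credit"
    202101 1 "credit"
    202101 2 "debit"
    202101 2 "credit"
    202102 1 "debit"
    202102 3 "debit"
    202102 3 "debit"
    end

    * Within each month, sorted by consumer_id, add 1 each time the ID changes.
    * Under by:, consumer_id[_n-1] is missing at the first observation of a
    * group, so the comparison is true there and the running sum starts at 1.
    by txn_yrmon (consumer_id), sort: gen n_consumers = sum(consumer_id != consumer_id[_n-1])
    * The last running total in the group is the distinct count; copy it to every row.
    by txn_yrmon (consumer_id): replace n_consumers = n_consumers[_N]

    * Same idea, with groups defined by month and card type
    by txn_yrmon card_type (consumer_id), sort: gen n_consumers_card_type = sum(consumer_id != consumer_id[_n-1])
    by txn_yrmon card_type (consumer_id): replace n_consumers_card_type = n_consumers_card_type[_N]

    list, sepby(txn_yrmon)
    ```

    In this toy data, consumer 2 uses both a credit and a debit card in 202101, so the card-type counts for that month add up to more than the monthly count of distinct consumers.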

    By the way, the original problem of number of observations per month can be done more simply:

    Code:
    by txn_yrmon, sort: gen n_obs = _N
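
    If the goal is a small summary table rather than new variables on every row, one possible alternative (not the method given above) is to tag one observation per distinct consumer with egen's tag() function and then collapse. This is only a sketch on made-up data; the names tag_month, tag_card and n_obs are hypothetical, and card_type is assumed to be a string.

    ```stata
    * Toy data, invented for illustration only
    clear
    input txn_yrmon consumer_id str6 card_type
    202101 1 "credit"
    202101 1 "credit"
    202101 2 "debit"
    202101 2 "credit"
    202102 1 "debit"
    202102 3 "debit"
    202102 3 "debit"
    end

    * tag() flags exactly one observation per distinct combination with 1, others 0
    egen tag_month = tag(txn_yrmon consumer_id)
    egen tag_card  = tag(txn_yrmon card_type consumer_id)
    gen byte n_obs = 1

    * One row per month and card type: observations and distinct consumers
    preserve
    collapse (sum) n_obs n_consumers_card_type = tag_card, by(txn_yrmon card_type)
    list, sepby(txn_yrmon)
    restore

    * One row per month: observations and distinct consumers
    collapse (sum) n_obs n_consumers = tag_month, by(txn_yrmon)
    list
    ```

    The preserve/restore pair is there because collapse replaces the data in memory; without it the second table could not be built from the original observations.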



    • #3
      Thank you so much, Clyde! Yes, this is my first post, and I will keep that in mind in the future. Thanks.

