  • Help with calculations based on conditions and groups

    Hello all,

    I have a dataset that looks something like this (but with over 70k observations and over 200 firms)

    FirmID Area dummy
    1 10 1
    1 10 1
    1 11 1
    1 11 1
    1 12 0
    2 10 1
    2 11 0
    3 10 1
    3 11 1
    3 11 1
    3 12 1
    3 12 1
    3 13 0


    Each row represents a customer of the firm in that area, and the dummy reflects whether this area is part of the firm's market. Using this data, I want to calculate the total number of customers (including those from other firms) in all areas that constitute a market for a firm, which I am able to do using -bysort- and -egen-. However, after that, for each firm, I want to identify the firms that the customers in that firm's market belong to (and the corresponding number of customers from each of these firms). I'm wondering what's a quick way to do this in Stata?

    Thank you in advance for any ideas here. And I'm happy to answer any questions if my ask isn't clear!

  • #2
    I'm not entirely sure I understand correctly what you are asking for, but I think it is this:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(firmid area dummy)
    1 10 1
    1 10 1
    1 11 1
    1 11 1
    1 12 0
    2 10 1
    2 11 0
    3 10 1
    3 11 1
    3 11 1
    3 12 1
    3 12 1
    3 13 0
    end
    
    //  CREATE A DATA SET CONTAINING ONE OBS PER FIRM AND AREA, WITH # OF CUSTOMERS
    preserve
    collapse (sum) n_customers = dummy, by(firmid area)
    rename firmid other_firmid
    tempfile firm_areas
    save `firm_areas'
    restore
    
    drop dummy
    duplicates drop
    joinby area using `firm_areas', unmatched(master)
    keep if other_firmid != firmid
    drop _merge
    If that's not it, please provide a more detailed explanation. Perhaps best would be to show what the results for your example data should look like, and explain in detail how you arrived at them.
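
    One alternative reading, offered only as a sketch and not as a confirmed interpretation of the question: every row counts as one customer regardless of the dummy, and a firm's market consists only of the areas where its dummy is 1. Under those assumptions, a variant of the code above might look like this. The variable names n_customers and other_firmid are reused from the code above, and -contract- is used here in place of -collapse- to count rows.

    ```stata
    * Assumes the original example data (firmid, area, dummy) are back in
    * memory, e.g. by re-running the -input- block above.
    * Assumption: each row is one customer, so customers are counted as rows.

    //  ONE OBS PER FIRM AND AREA, WITH THE NUMBER OF CUSTOMER ROWS
    preserve
    contract firmid area, freq(n_customers)
    rename firmid other_firmid
    tempfile firm_areas
    save `firm_areas'
    restore

    //  KEEP ONE OBS PER FIRM AND MARKET AREA (ASSUMED: DUMMY == 1)
    keep if dummy == 1
    keep firmid area
    duplicates drop

    //  PAIR EACH MARKET AREA WITH EVERY FIRM THAT HAS CUSTOMERS THERE
    joinby area using `firm_areas'
    drop if other_firmid == firmid

    //  TOTAL CUSTOMERS OF EACH OTHER FIRM ACROSS THE FIRM'S MARKET AREAS
    collapse (sum) n_customers, by(firmid other_firmid)
    list, sepby(firmid)
    ```

    The result has one observation per pair of firm and other firm. A firm whose market areas contain no customers of other firms would not appear in it at all, so whether that is acceptable depends on what the final results should look like.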

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
