  • Matched dataset based on multiple variables

    Hello,

    I am attempting to create a dataset of matched observations based on multiple criteria.

    I have a dataset of 5 digit zipcode areas with census variable estimates (median property value, population, unemployment rate, etc.).

    A small portion of these zipcode areas are "treated"; the rest are "untreated".

    I would like to match my "treated" observations to very similar "untreated" observations, based on certain criteria (a single match 1:1 at first and more if necessary).

    For example, I would like to match an "untreated" observation to a "treated" observation as long as its median property value is within $1,000 (+/-) of the treated observation's, its unemployment rate is within 15% (+/-), and it is in the same state as that "treated" observation.

    I can't seem to find a canned function that can handle multiple criteria as described.

    Thank you,
    Jarron V

  • #2
    As far as I know, no such canned command has been written for Stata. It would be very difficult to do that well because the intermediate data sets generated in the course of such matching can be massive, and various tricks that rely on knowledge of the data itself are often needed to keep the whole thing from overwhelming available memory (and from taking forever to run). Since there are over 40,000 zipcodes in the United States, even if your data set is not comprehensive, it is probably large enough to run into such difficulties if handled "generically."

    The problem of matching with restrictions of the kind you describe (known as caliper matching) comes up often on Statalist. You can search the Forum and find many examples. If you are comfortable with your programming skills, you can mark up one of those examples to suit your data set. Alternatively, post back and use the -dataex- command to show example data. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.
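
    As a rough illustration of the advice above, a -dataex- call might look like the following. The variable names (zip, state, treated, med_value, unemp) are hypothetical stand-ins for whatever the census variables are actually called in your data, and the number of observations shown is an arbitrary choice.

    ```stata
    * Hypothetical variable names: substitute the ones in your own data set
    * count() limits how many observations are included in the example
    dataex zip state treated med_value unemp, count(25)
    ```

    Copy the output from the Results window and paste it directly into your post.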

    In addition, if you post back, clarify whether you want matching with or without replacement.
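
    In the meantime, here is a minimal sketch of the general approach used in many of those Forum examples, assuming matching with replacement. It is not tested against your data. All variable names (zip, state, treated, med_value, unemp) are assumptions, and "within 15%" is read here as a relative difference (15% of the treated area's rate), which may not be what you intend. Pairing only within state is one of the data-specific restrictions that keeps the intermediate data set from exploding.

    ```stata
    * Sketch only; variable names are assumed, not taken from the real data
    * Step 1: set aside the untreated zipcodes as a pool of candidate controls
    preserve
    keep if treated == 0
    rename (zip med_value unemp) (zip_c med_value_c unemp_c)
    keep state zip_c med_value_c unemp_c
    tempfile controls
    save `controls'
    restore

    * Step 2: pair every treated zipcode with every control in the same state
    keep if treated == 1
    joinby state using `controls'

    * Step 3: keep only pairs that fall inside both calipers
    * (15% is assumed to mean 15% of the treated zipcode's unemployment rate)
    keep if abs(med_value_c - med_value) <= 1000 ///
        & abs(unemp_c - unemp) <= 0.15 * unemp

    * Step 4: pick one eligible control at random for each treated zipcode
    set seed 12345
    gen double shuffle = runiform()
    bysort zip (shuffle): keep if _n == 1
    drop shuffle
    ```

    Treated zipcodes with no eligible control drop out of the result, and the same control can be chosen for more than one treated zipcode. Matching without replacement needs additional steps beyond this sketch.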

    • #3
      The question is not completely clear to me, but I think there are user-written commands that may help, in particular -ultimatch- and -kmatch-. It is hard to tell, though, because a lot of detail is missing from the query (e.g., do you want matching with or without replacement?). Use -search- or -findit- to locate and install them.
