Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How can I associate values in Stata with a list in Excel?

    Hello all,

    I have a dataset in Stata with a list of patients and their coded outcomes, and a list in excel describing what each coded outcome corresponds to.

    Example:

    Stata Dataset:
    ID outcome days_enoll visit
    001 X1 10 1
    002 X2 30 1
    003 X3 34 3
    004 X1 77 1
    005 X5 74 1
    006 X6 1 1
    007 X9 99 1
    Excel Spreadsheet:
    Outcome Code Corresponding Disease
    X1 Disease 1
    X2 Disease 2
    X3 Disease 3
    X4 Disease 4

    Is there an "automated" way to generate a new variable in stata such as outcome_defined and have it "search" the Excel Spreadsheet (either as an excel file or in Stata) and match the coded outcome with the corresponding disease. Matching the codes in the outcome columns to the disease state in the adjacent column.

    To get:
    ID outcome outcome_defined days_enoll visit
    001 X1 Disease 1 10 1
    002 X2 Disease 2 30 1
    003 X3 Disease 3 34 3
    004 X1 Disease 1 77 1
    005 X5 Disease 5 74 1
    006 X6 Disease 6 1 1
    007 X9 Disease 9 99 1
    I'm sure there is a better way to explain this very easily, but hopefully this makes sense.
    Thank you very much!

  • #2
    Yes. The first step is to import your spreadsheet into a Stata data set and save it. Stata has the -import excel- command for that. You must read -help import excel- to see the various options that you might apply. I cannot advise you about most of them because they depend on specifics of the spreadsheet itself. Once that is done:
    Code:
    use stata_dataset, clear
    rename outcome outcomecode
    merge m:1 outcomecode using dataset_imported_from_excel, keep(match master) nogenerate
    rename correspondingdisease outcome_defined
    Replace the italicized portions by the actual names of the data sets involved.

    Based on the example data tableaux you show, however, the results will not be exactly what you show at the end of your post. The last three rows of your output tableau will have only missing values for outcome_defined because the example data from your spreadsheet contain no information about outcomes X5, X6, or X9.

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment

    Working...
    X