  • One row equal to multiple ID?

    Hi everyone, I have been using Stata with datasets where each row corresponds to one person/ID, in this case women. For each woman I had several variables, such as the date of a test and its result (HPV positive/negative). I now have to work on a new dataset, supplied as an Excel file, in which each HPV-positive woman has her own row (one row = one woman) but the HPV-negative women are aggregated in a single row (one row = multiple women). Is there a way to handle this when I import the Excel file, and to work with the aggregated data in Stata, without having to learn RStudio? Thank you.

  • #2
    Welcome to Statalist.

    Without a better idea of how your data are organized, it is not possible to give concrete advice. But here is an overview of how I would approach the problem, based on my guesses about what your data look like.

    What I would do is duplicate the Excel worksheet twice and leave the original worksheet untouched, so that it serves as your backup in case something goes wrong in the next step. You'll then have three tabs in your Excel workbook: the original and the two copies.

    In the first copy, I would delete the data for the HPV-negative women. In the second copy, I would delete the data for the HPV-positive women.

    Then I would use import excel twice: once to import the data for HPV-positive women from the first copy, and again to import the data for HPV-negative women from the second copy.

    Then I would do whatever is necessary to change the HPV-negative data to one observation per woman (almost certainly using the reshape long command as part of the process) and then use the append command to create a dataset with all the observations.
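
To make the last step concrete, here is a minimal sketch under one possible layout, which is a guess and not something the original post confirms: the aggregated HPV-negative row carries a count of women. The variable names id, hpv_pos and n_women are hypothetical, and the toy data typed in with input stand in for the two sheets you would bring in with import excel. In that layout expand, rather than reshape long, does the work of creating one observation per woman.

```stata
* Toy stand-in for the HPV-positive sheet: one observation per woman.
clear
input str8 id byte hpv_pos
"W001" 1
"W002" 1
end
generate int n_women = 1
tempfile positives
save `positives'

* Toy stand-in for the HPV-negative sheet: a single aggregated row.
* ASSUMPTION: the row stores how many women it represents (n_women).
clear
input int n_women byte hpv_pos
250 0
end

* Replace the aggregated row with n_women identical observations.
expand n_women
replace n_women = 1

* The individual women are not identified, so create placeholder IDs.
generate str8 id = "NEG" + string(_n, "%04.0f")

* Stack the two groups into one dataset, one observation per woman.
append using `positives'
tabulate hpv_pos
```

Run this as a do-file, because the tempfile macro does not survive line-by-line execution. If the aggregated row instead spreads the women across columns (for example date1, date2, date3), reshape long would be the tool to use in place of expand. And if you only need tabulations or models, you may not need to expand at all: keeping the aggregated row and using frequency weights, as in tabulate hpv_pos [fweight = n_women], is often enough.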
