Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Combining two datasets on two variables.

    Hello,
    I have one dataset (called Natural Disaster) with three variables: Year, CountryCode, and TotalAffected. Since countries can have more than one natural disaster every year, there are often multiple entries with the same Year and CountryCode
    I have one dataset (called Population) with many variables: one named CountryCode, and the rest corresponding to years (from 1980-2018). So each row of this dataset contains the country's code, and that country's population over the period 1980-2018.
    I want to create a new variable, PercentageAffected, in the Natural Disaster dataset. This is equal to (TotalAffected/Population in that year and in that country).
    To do that I would have to create a Population variable in the Natural Disaster dataset.
    I'm not sure how to transfer the population data from my Population dataset into the Natural Disaster dataset, such that the new Population variable in the Natural Disaster dataset contains the value of the population corresponding to the correct CountryCode and Year.
    I would appreciate help with this, and please let me know if this question was unclear. Thanks!

  • #2
    It would be something along the lines of the following.
    Code:
    use <Population>, clear
    merge 1:m CountryCode Year using <Natural Disaster>, keep(match using)
    generate double PercentageAffected = 100 * TotalAffected / Population
    with the actual names of the Population and Natural Disaster datasets substituted in for the stuff in the angle brackets.

    Comment


    • #3
      Since there is no data example i am assuming, based on the description, that the first dataset is in long format and the second dataset is in the wide format. You should first reshape the second dataset from wide to long and then merge it with the first dataset using CountryCode and Year for uniquely identifying the country and year combination. Please look for
      Code:
      help reshape
      Last edited by Tarun Choudhary; 07 May 2021, 05:31.

      Comment


      • #4
        Originally posted by Elizabeth Leong View Post
        with many variables: one named CountryCode, and the rest corresponding to years (from 1980-2018)
        Yeah, I missed that, sorry.

        Comment


        • #5
          Originally posted by Tarun Choudhary View Post
          Since there is no data example i am assuming, based on the description, that the first dataset is in long format and the second dataset is in the wide format. You should first reshape the second dataset from wide to long and then merge it with the first dataset based using CountryCode and Year for uniquely identifying the country and year combination. Please look for
          Code:
          help reshape
          Yes, this is what I was missing. I was searching for how-tos on merging/joining instead of changing my dataset first. Thank you!

          Comment

          Working...
          X