Hi,
I have two datasets that I have merged. Dataset 1 contains personal data, including each person's birth city, birth year, and birth month. Dataset 2 contains air pollution levels by city, year, and month. I merged the two datasets on month and year.
The issue is that Dataset 2 stores month and year as columns and each city as a separate column, where the values in each city column are that city's air pollution levels for the given year and month.
In Dataset 1, however, birthcity is a single column with the city names as values.
I tried to create a new variable with the code at the end of this post. The idea was to build a single pollution variable by grouping the cities, which essentially reproduces the birthcity column, and then to replace the city name with that city's air pollution level for the given month and year.
However, the code seems wrong, and I wondered whether anyone knows a simpler way to approach this problem. In the end, I would like to create a dummy for each city to explore the within-city variation in pollution.
Code:
egen pollution_city_value = group(birthcity)
replace pollution_city_value = city1 if birthcity == 1
replace pollution_city_value = city2 if birthcity == 2
replace pollution_city_value = city3 if birthcity == 3
replace pollution_city_value = city4 if birthcity == 4
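A possibly simpler route is to reshape the pollution data from wide to long before merging, so that city becomes part of the merge key and no replace statements are needed. The following is only a sketch under stated assumptions: the file names persons and pollution and the variable name pollution are hypothetical, the pollution file is assumed to hold year, month, and city1 to city4 with one row per year and month, and birthcity is assumed to be numeric with codes 1 to 4 that match the city1 to city4 suffixes. It starts from the two unmerged datasets.

```stata
* Hypothetical file names; variables year, month, city1-city4 as described in the post
use pollution, clear

* Wide to long: one row per year, month, and city.
* The numeric suffix of city1-city4 is stored in the new variable birthcity.
reshape long city, i(year month) j(birthcity)
rename city pollution
save pollution_long, replace

* Assumes birthcity in the personal data uses the same numeric codes 1-4
use persons, clear

* Many persons can share a birth city and birth month, hence m:1
merge m:1 year month birthcity using pollution_long, keep(master match)
```

If birthcity is a string variable instead, the codes would first have to be brought into line with the suffixes, for example by renaming the city columns or recoding birthcity. For the city dummies, factor-variable notation such as i.birthcity in the estimation command may be enough, so separate dummy variables might not need to be created by hand; this also requires a numeric birthcity.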
Kind regards.