Dear Statalists,
I am trying to plot how the population ranks of a number of cities in the U.S. changed over time.
The following is an extract of my data that clarifies the problem. The full data set consists of more than two cities but the problem can be reproduced using only two cities.
My Stata code is given below, after the data example. Essentially, it draws one line per city showing how that city's population rank developed over time. A line is red if the city is among the 20 most populated cities and gray if it is in the top 50.
And this is the resulting graph:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int(city year) float(city_rank_ top20_1950 top20_2010 top20_always)
4010 1950 26 0 1 0
4010 1960 22 0 1 0
4010 1970 17 0 1 0
4010 1980 14 0 1 0
4010 1990 18 0 1 0
4010 2000 18 0 1 0
4010 2010 20 0 1 0
4610 1950  1 1 1 1
4610 1960  1 1 1 1
4610 1970  1 1 1 1
4610 1980  1 1 1 1
4610 1990  1 1 1 1
4610 2000  1 1 1 1
4610 2010  1 1 1 1
end
label values city city_lbl
label def city_lbl 4010 "Memphis, TN", modify
label def city_lbl 4610 "New York, NY", modify
Code:
twoway line city_rank_ year if top20_1950==0, mlabel(city) msize(0) lcolor(gray) ///
    || line city_rank_ year if top20_1950==1, mlabel(city) msize(0) lcolor(red) ///
    || scatter city_rank_ year if top20_1950==1 & year==1950, mlabel(city) msize(0) mlabp(9) mlabs(2) mlabc(black) ///
    || scatter city_rank_ year if top20_1950==1 & top20_always==1 & year==1950, mlabel(city) msize(0) mlabp(9) mlabs(2) mlabc(red) legend(off)
The somewhat obvious problem is that my code plots the line for Memphis (at the top) as well as the line for New York (at the bottom), but because New York directly follows Memphis alphabetically in my data set, an unwanted line also connects the end of one city's series to the start of the next.
Either there is something I can write differently in my code, or I need to reshape the data in some way beforehand (or both), so that Stata realizes that the line for Memphis and the line for New York are (or should be!) separate entities. If anyone could give me a hint as to how I can tackle the problem, I would appreciate it a lot.
Thank you for reading my post.
Best wishes,
Milan
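One possible direction, offered as a sketch rather than a definitive answer: the stray connections appear wherever one city's last year (2010) is followed by the next city's first year (1950). Assuming the data stay in long form as in the dataex extract above, sorting by city and year and asking line to connect only ascending x values, via connect(ascending) or its abbreviation c(L), should lift the pen at each such jump. The parenthesized layout and the single label scatter below are simplifications of the original command, not a replacement for it, and the /// continuations assume the code is run from a do-file.

```stata
* Sketch only: assumes the dataex extract above is already in memory.
* Keep each city's observations together and in chronological order.
sort city year

* c(L) is short for connect(ascending): consecutive points are joined
* only while year increases, so no segment is drawn when year drops
* from 2010 back to 1950 at the start of the next city.
twoway (line city_rank_ year if top20_1950==0, c(L) lcolor(gray))   ///
       (line city_rank_ year if top20_1950==1, c(L) lcolor(red))    ///
       (scatter city_rank_ year if year==1950, msymbol(none)        ///
            mlabel(city) mlabpos(9) mlabsize(vsmall)                ///
            mlabcolor(black)),                                      ///
       legend(off)
```

With this approach no reshape is needed, because the break between cities comes from the sort order rather than from separate variables per city.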