  • Is there a way to replicate the following graph?

    Dear all,

    I am interested in the following graph, taken from Our World in Data, which shows the relationship between education and child mortality worldwide.

    Is there a way to produce a graph similar to the one above? An exact replica would be perfect, but something similar would be fine. I am not sure a data example would be enough, since one may need the whole dataset to make such a graph, so I have attached the full data below. If you are worried about downloading a file from an unknown source, you can get the data from the original website instead: https://ourworldindata.org/child-mortality. Since the webpage is rather long, you may want to search it for "relationship between education and child mortality" to save time.

    Data: https://drive.google.com/file/d/1Pv0...usp=share_link

    Thank you.
    Last edited by Matthew Williams; 01 Jul 2023, 06:26.

  • #2
    Sorry, this post is just to push my thread up since it does not appear in the first page.



    • #3
      This is just a scatter plot (well technically one per continent). Each plot dictates marker color, size and label. You can also use a log scale axis.



      • #4
        Originally posted by Leonardo Guizzetti:
        This is just a scatter plot (well technically one per continent). Each plot dictates marker color, size and label. You can also use a log scale axis.
        Dear Leonardo,
        Thank you for your swift response. I understand that this is a scatter plot, but I don't know how to produce such a graph. Could you please show me a code example using the data I sent? Thank you.



        • #5
          Sure, here's an example to get you started. I had to guess a bit about the data layout.

          I used Nick Cox's -mylabels- (Stata Journal) to make nicer y-axis labels. You can also reduce the overlap among the marker labels, for example by plotting fewer of them or by making them smaller.

          Code:
          import delimited using "Education and child mortality.csv", case(low) clear
          rename meanyear mean_edyrs
          rename childmortality childmort
          
          * within each entity, carry the last observed schooling value forward into missing years
          drop if year < 1950
          sort entity year
          bys entity (year) : replace mean_edyrs = cond(_n>1 & mi(mean_edyrs) & !mi(mean_edyrs[_n-1]), mean_edyrs[_n-1], mean_edyrs)
          
          keep if year==2015
          keep if !mi(continent)
          keep if !mi(childmort)
          keep if !mi(population)
          keep if !mi(mean_edyrs)
          
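          * encode numbers the continents alphabetically; the legend below assumes 1 = Africa, ..., 6 = South America
          encode continent, gen(cont)
          drop continent
          rename cont continent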
          
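          * label only every other entity (odd-numbered observations) to reduce clutter
          gen entity_lab = entity
          replace entity_lab = "" if mod(_n, 2)==0
          
          * optional log-scaled size variable (not used below; the plot weights markers by population directly)
          gen mksize = log(population)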
          
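          * mylabels is user-written; if needed, install it once with: ssc install mylabels
          * it stores the y-axis label specification (values with a "%" suffix) in local macro myylabs
          mylabels 0.5 1 2 5 10 15 , local(myylabs) myscale(@) suffix("%")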
          
          twoway (sc childmort mean_edyrs [aw=pop] if continent==1, mcol(stc1%30)  ) ||  ///
                 (sc childmort mean_edyrs if continent==1, mlab(entity_lab) msym(none) mlabcol(stc1)  ) || ///
                 (sc childmort mean_edyrs [aw=pop] if continent==2, mcol(stc2%30)  ) ||  ///
                 (sc childmort mean_edyrs if continent==2, mlab(entity_lab) msym(none) mlabcol(stc2)  ) || ///
                 (sc childmort mean_edyrs [aw=pop] if continent==3, mcol(stc3%30)  ) ||  ///
                 (sc childmort mean_edyrs if continent==3, mlab(entity_lab) msym(none) mlabcol(stc3)  ) || ///
                 (sc childmort mean_edyrs [aw=pop] if continent==4, mcol(stc4%30)  ) ||  ///
                 (sc childmort mean_edyrs if continent==4, mlab(entity_lab) msym(none) mlabcol(stc4)  ) || ///
                 (sc childmort mean_edyrs [aw=pop] if continent==5, mcol(stc5%30)  ) ||  ///
                 (sc childmort mean_edyrs if continent==5, mlab(entity_lab) msym(none) mlabcol(stc5)  ) || ///
                 (sc childmort mean_edyrs [aw=pop] if continent==6, mcol(stc6%30)  ) ||  ///
                 (sc childmort mean_edyrs if continent==6, mlab(entity_lab) msym(none) mlabcol(stc6)  ) ///
                 , legend(label(1 "Africa") label(3 "Asia") label(5 "Europe") ///
                          label(7 "North America") label(9 "Oceania") label(11 "South America") ///
                          order(1 3 5 7 9 11) col(1) pos(7) ring(0) ) ///
                 ysc(log) ylab(`myylabs') ///
                 yti("Child mortality") ///
                 xti("Average years of schooling of women in the reproductive age bracket (15 to 49 years)")
          The resulting graph
          [Image: childmortality.jpg]
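          On the overlapping marker labels: one possible approach, sketched below under some assumptions, is to label only the most populous entities in each continent and to shrink the label text. The toy data, the entity names, the variable poprank and the cutoff of two labels per continent are all made up for illustration and are not taken from the CSV above. The stc1/stc2 colours are the same ones used in the code above and, as far as I know, need Stata 18 or later.

```stata
* Toy data with made-up values, only so that the sketch runs on its own
clear
input str10 entity byte continent double(childmort mean_edyrs population)
"Aland" 1 9.0  2.5 150
"Bland" 1 7.5  3.0  20
"Cland" 1 6.0  4.5   5
"Dland" 2 3.0  7.0 900
"Eland" 2 2.0  8.5  60
"Fland" 2 1.0 10.0  10
end

* rank entities by population within continent (1 = most populous)
bysort continent (population) : gen poprank = _N - _n + 1

* keep a label only for the two most populous entities per continent
gen entity_lab = cond(poprank <= 2, entity, "")

* per continent: one layer of population-weighted markers, one layer of labels only
twoway (sc childmort mean_edyrs [aw=population] if continent==1, mcol(stc1%30)) ///
       (sc childmort mean_edyrs if continent==1, mlab(entity_lab) msym(none)    ///
           mlabcol(stc1) mlabsize(vsmall) mlabpos(12))                          ///
       (sc childmort mean_edyrs [aw=population] if continent==2, mcol(stc2%30)) ///
       (sc childmort mean_edyrs if continent==2, mlab(entity_lab) msym(none)    ///
           mlabcol(stc2) mlabsize(vsmall) mlabpos(12))                          ///
       , legend(order(1 "Continent 1" 3 "Continent 2")) ysc(log)                ///
         yti("Child mortality") xti("Mean years of schooling")
```

          With the real data, the same two lines that create poprank and entity_lab could replace the mod(_n, 2) rule above, and the cutoff can be raised or lowered until the labels stop colliding.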



          • #6
            Dear Leonardo,

            Thank you so much for your wonderful help. Your code looks great!!!

