  • How to do a graph like this one in Stata

    Dear Statalist,

    I am trying to replicate (with my own data) a graph like the one shown below. As you can see, the "y" variable (aggregated trust) is shown at three time points for different regions, and the regions are ordered along the horizontal axis (although the ordering variable is not displayed) by the variable "x", which is inequality in 1980. The graph shows a negative association: regions with higher inequality show lower trust.
    [Image: graph.png]


    I have tried the following command:
    Code:
    graph dot (asis) y if year==2008 | year==2011 | year==2014, over(year) over(region, sort(x1) label(ang(v)))  vertical linetype(line) lines(lc(none)) asyvars
    [Image: Graph_ex.png]


    As you can see, it lacks a line connecting the three years for each region, and the labels (the region numbers on the x-axis) are placed on the x-axis rather than within the graph. I would also like to know whether this can be done using something like
    Code:
    scatter y region , sort(x1)
    since I would also like to add a fitted line. My problem is that I do not know how to sort the independent variable (horizontal axis) by another variable; as you might expect, sort(x1) does not do the trick here as it does in the graph dot example.

    I give a dataex example of my data below, in case you can help me with this.
    Thanks in advance.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int year float region byte x1 float y
    2008 16  4 .44444445
    2011 16  8 .38235295
    2014 16  3  .3583333
    2008  7  1  .3027523
    2011  7  0 .24489796
    2014  7  0 .33783785
    2008 10  2 .27622378
    2011 10  6  .3067227
    2014 10  3  .3596059
    2008  1  4 .24615385
    2011  1  5   .221519
    2014  1  7 .29508197
    2008 11  0  .3548387
    2011 11  0  .3529412
    2014 11  1 .44444445
    2008  4  0      .125
    2011  4  0 .08695652
    2014  4  0 .05882353
    2008  8  0  .3565217
    2011  8  1 .29357797
    2014  8  0  .3762376
    2008  9  7  .4247312
    2011  9 16   .369509
    2014  9 27  .4058824
    2008  2  1  .3563218
    2011  2  1 .37096775
    2014  2  3  .3928571
    2008 13 17  .2490566
    2011 13 27  .3076923
    2014 13 11  .3154762
    2008 15  1        .4
    2011 15  1        .4
    2014 15  1 .29090908
    2008 12  1      .392
    2011 12  1  .3082707
    2014 12  1  .3047619
    2008  3  1  .4042553
    2011  3  3       .32
    2014  3  0  .3243243
    2008 17  0  .3809524
    2011 17  0       .25
    2014 17  0        .3
    2008  6  0  .4615385
    2011  6  3  .3913043
    2014  6  1        .5
    2008 14  0  .3617021
    2011 14  1  .2857143
    2014 14  1 .24390244
    2008  5  2  .3636364
    2011  5  2  .1724138
    2014  5  0 .14285715
    end


  • #2
    Thanks for the data example. Here are some ideas:

    Code:
    reshape wide x1 y, i(region) j(year) 
    sort x12014 
    gen x = _n 
    gen max = max(y2008, y2011, y2014) 
    gen min = min(y2008, y2011, y2014) 
    gen max2 = max + 0.01
    
    set scheme s1color 
    
    twoway rspike max min x || scatter max2 x, ms(none) mla(region) mlabpos(12) ///
    || scatter y* x, ms(Oh + Th) legend(order(3 "2008" 4 "2011" 5 "2014") ring(0) pos(5) col(1)) ///
    ytitle(Aggregate trust) yla(0 "0" 0.1(0.1)0.5, format(%02.1f) ang(h)) ysc(r(0 0.55)) xsc(r(0.5 17.5)) xla(none) xtitle("")
    [Image: trust.png]
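    The original post also asked about a fitted line. A minimal sketch of one way to add it to this layout follows. It assumes that a linear fit of each region's mean of y over the three years on the rank variable x is what is wanted; that choice and the variable name ymean are assumptions, not part of the answer above. Only four regions from the dataex example in #1 are used to keep the sketch short.

```stata
* Subset of the dataex example in #1 (regions 16, 7, 10 and 1 only)
clear
input int year float region byte x1 float y
2008 16  4 .44444445
2011 16  8 .38235295
2014 16  3  .3583333
2008  7  1  .3027523
2011  7  0 .24489796
2014  7  0 .33783785
2008 10  2 .27622378
2011 10  6  .3067227
2014 10  3  .3596059
2008  1  4 .24615385
2011  1  5   .221519
2014  1  7 .29508197
end

* Same preparation as in the code above
reshape wide x1 y, i(region) j(year)
sort x12014 region            // region added only to break ties reproducibly
gen x = _n
gen max = max(y2008, y2011, y2014)
gen min = min(y2008, y2011, y2014)

* Assumption: fit a line through the mean of y over the three years
egen ymean = rowmean(y2008 y2011 y2014)

* Plot order: 1 = rspike, 2-4 = scatter of y2008 y2011 y2014, 5 = lfit
twoway rspike max min x ///
|| scatter y2008 y2011 y2014 x, ms(Oh + Th) ///
|| lfit ymean x, ///
legend(order(2 "2008" 3 "2011" 4 "2014" 5 "Linear fit") ring(0) pos(5) col(1)) ///
ytitle(Aggregate trust) xla(none) xtitle("")
```

    Because x is only a rank, the slope describes the trend across the ordering of regions, not the effect of a unit change in x1.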


    • #3
      Dear Nick, thanks for your help. I didn't think that going from long to wide could solve it. This is an intelligent solution.


      • #4
        It could be done without a reshape.


        Code:
        egen x12014 = total((year == 2014) * x1), by(region) // posted as -gen-; corrected to -egen- per #5 and #6
        sort x12014 region 
        egen x = seq(), block(3) 
        bysort region (y): gen max = y[_N] 
        by region : gen min = y[1] 
        gen max2 = max + 0.01 
        set scheme s1color 
        
        separate y, by(year) veryshortlabel 
        
        twoway rspike max min x || scatter max2 x, ms(none) mla(region) mlabpos(12) ///
        || scatter y???? x, ms(Oh + Th) legend(order(3 "2008" 4 "2011" 5 "2014") ring(0) pos(5) col(1)) ///
        ytitle(Aggregate trust) yla(0 "0" 0.1(0.1)0.5, format(%02.1f) ang(h)) ysc(r(0 0.55)) xsc(r(0.5 17.5)) xla(none) xtitle("")
        Note that because there are ties on x1, several different orderings are possible.
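        A quick way to see where such ties occur, as a minimal sketch: it uses four regions from the dataex example in #1, and the variable names pickone and nties are hypothetical, not part of the code above.

```stata
* Subset of the dataex example in #1 (regions 16, 7, 10 and 1 only)
clear
input int year float region byte x1 float y
2008 16  4 .44444445
2011 16  8 .38235295
2014 16  3  .3583333
2008  7  1  .3027523
2011  7  0 .24489796
2014  7  0 .33783785
2008 10  2 .27622378
2011 10  6  .3067227
2014 10  3  .3596059
2008  1  4 .24615385
2011  1  5   .221519
2014  1  7 .29508197
end

* x1 in 2014 copied to every observation of its region, as in the code above
egen x12014 = total((year == 2014) * x1), by(region)

* Mark one observation per region, then count regions sharing each x12014 value
egen byte pickone = tag(region)
bysort x12014: egen nties = total(pickone)
list region x12014 nties if pickone & nties > 1, noobs

* Sorting on region as well, as above, makes the order among ties reproducible
sort x12014 region year
```

        Regions listed here share a value of x1 in 2014, so their relative position on the horizontal axis depends only on the tie-breaking variable.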


        • #5
          Dear Nick, thanks again. I think this one is not as straightforward as the first code you showed. Just for clarification, should the first line be "egen" instead of "gen"?


          • #6
            Correct. That e got missed out in the copy and paste.
