Third over() variable in graph dot

Anup Tyagi

Join Date: Sep 2024

Posts: 14
#16

04 Sep 2024, 09:20

Originally posted by Nick Cox

#8 misses the point. If a value were exactly equal to the 50th percentile or the 90th percentile would it go up or down? Perhaps the issue doesn't arise -- if the percentile isn't a value in the data -- or is trivial, but in principle bin limits should be unambiguous. See for example Section 5 of https://journals.sagepub.com/doi/pdf...867X1801800311 and the references given there. You need only an explanation such as "Lower limits are inclusive" somewhere.

#9 scheme(s1mono) is one to try as in your original post.

#10 I suggest that two essentials are that three marker symbols are of equal size and equal visual impact. If they might ever occlude each other, you need open or hollow symbols such as Oh or Th, and + works well with any of those.

The most striking detail is that Village 3 is different. You can judge graph designs partly on how clearly they make that point.

I am glad you noticed that Village 3 is different. Thanks for the suggestions.
Anup Tyagi

Join Date: Sep 2024

Posts: 14
#17

04 Mar 2025, 02:15

Nick, tabplot is sorting the population categories in alphabetical order. I would like them not to be sorted, or to be sorted by another variable, seqCode. Is there a way to achieve this? This is the command:

tabplot year which [iw=_], horizontal by(population, note("") t1title(Income shares) col(1)) ytitle("") showval(offset(0.27) mlabsize(medsmall) format(%2.1f)) subtitle(, nobox nobexpand fcolor(none) pos(9)) separate(which) xsc(r(0.8 .))
Nick Cox

Join Date: Mar 2014

Posts: 35698
#18

04 Mar 2025, 02:29

See https://journals.sagepub.com/doi/pdf...1536867X211045 for quite detailed discussion of this issue.

It's likely that you need go no further than Section 2 in the paper.

That said, seqCode in your data example, if that is what you mean, wouldn't so far as I can see imply different graphs.

Last edited by Nick Cox; 04 Mar 2025, 02:31.

Anup Tyagi

Join Date: Sep 2024
Posts: 14

#19

04 Mar 2025, 03:23

The paper is not available on the website.

The example above is something I had made up. The actual data is below. It has income shares. Here I would not like the tabplot "population" to be alphabetically sorted.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte seqCode str12 popgroup str21 population str7 year float(_50 _90 _100)
10 "Social Group" "Sikh, Jain, Christian" "2004-05" 12.63 40.62 46.76
 9 "Social Group" "Muslim"                "2004-05" 17.46 44.01 38.53
 8 "Social Group" "Adivasi"               "2004-05" 15.68 41.57 42.76
 7 "Social Group" "Dalit"                 "2004-05" 19.71 44.56 35.73
 6 "Social Group" "OBC"                   "2004-05" 14.85 44.24 40.91
 5 "Social Group" "Forward Caste"         "2004-05"  13.7 44.78 41.52
 4 "Social Group" "Brahmin"               "2004-05" 12.87 47.06 40.07
 3 "India"        "India Urban"           "2004-05" 17.52 45.14 37.34
 2 "India"        "India Rural"           "2004-05" 16.28 42.59 41.12
 1 "India"        "India"                 "2004-05" 14.25 42.53 43.22
10 "Social Group" "Sikh, Jain, Christian" "2011-12" 14.05 41.61 44.33
 9 "Social Group" "Muslim"                "2011-12" 18.34 42.85 38.81
 8 "Social Group" "Adivasi"               "2011-12" 15.31 41.53 43.16
 7 "Social Group" "Dalit"                 "2011-12" 18.22 44.12 37.65
 6 "Social Group" "OBC"                   "2011-12" 15.46 43.43 41.11
 5 "Social Group" "Forward Caste"         "2011-12" 13.79 45.33 40.88
 4 "Social Group" "Brahmin"               "2011-12" 13.48 45.89 40.63
 3 "India"        "India Urban"           "2011-12" 17.16 44.39 38.45
 2 "India"        "India Rural"           "2011-12" 15.73 42.29 41.98
 1 "India"        "India"                 "2011-12" 14.76 42.62 42.62
end


Anup Tyagi

Join Date: Sep 2024

Posts: 14
#20

04 Mar 2025, 03:25

I ran this Stata code:

reshape long _, i(population year) j(which)
label def which 50 "0-50" 90 "50-90" 100 "90-100"
label val which which
label var which "some explanation here"

set scheme s1mono
tabplot year which [iw=_], horizontal by(population, note("") t1title(Income shares) col(1)) ytitle("") showval(offset(0.27) mlabsize(medsmall) format(%2.1f)) subtitle(, nobox nobexpand fcolor(none) pos(9)) separate(which) xsc(r(0.8 .))
Nick Cox

Join Date: Mar 2014

Posts: 35698
#21

04 Mar 2025, 03:49

The link in #18 was working when I posted it. Here's another way in: https://journals.sagepub.com/doi/pdf...6867X211045582

Your question is a bit clearer now. You don't want alphabetical sorting by population -- but you don't say what ordering you do want.

So at this point my best advice is to define value labels and then use encode to map population to a desired order.
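
A minimal sketch of that approach follows. It assumes, purely for illustration, that the desired order runs from India to Sikh, Jain, Christian; the names poporder and popnum are hypothetical. Because the value label is defined first, encode uses its mapping instead of alphabetical order, and the new variable would then go in by() in place of population.

Code:

* value labels defined in the assumed order (label name poporder is hypothetical)
label define poporder 1 "India" 2 "India Rural" 3 "India Urban" 4 "Brahmin" 5 "Forward Caste" 6 "OBC" 7 "Dalit" 8 "Adivasi" 9 "Muslim" 10 "Sikh, Jain, Christian"
* encode follows the existing label mapping rather than sorting alphabetically
encode population, gen(popnum) label(poporder)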
Anup Tyagi

Join Date: Sep 2024

Posts: 14
#22

04 Mar 2025, 03:54

I want them sorted by seqCode, that is, in the order they are in the dataset, starting from "India", then "India Rural", etc.

Thanks for the paper. I can see it now.

Nick Cox

Join Date: Mar 2014
Posts: 35698

#23

04 Mar 2025, 06:32

I think the main step needed is that the values of population need to be the value labels of seqCode. You can do that the slow way

Code:

label def seqCode 1 "India"

and so on, or you can use labmask from the Stata Journal, as here. This works. I made some cosmetic changes to the graph.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte seqCode str12 popgroup str21 population str7 year float(_50 _90 _100)
10 "Social Group" "Sikh, Jain, Christian" "2004-05" 12.63 40.62 46.76
 9 "Social Group" "Muslim"                "2004-05" 17.46 44.01 38.53
 8 "Social Group" "Adivasi"               "2004-05" 15.68 41.57 42.76
 7 "Social Group" "Dalit"                 "2004-05" 19.71 44.56 35.73
 6 "Social Group" "OBC"                   "2004-05" 14.85 44.24 40.91
 5 "Social Group" "Forward Caste"         "2004-05"  13.7 44.78 41.52
 4 "Social Group" "Brahmin"               "2004-05" 12.87 47.06 40.07
 3 "India"        "India Urban"           "2004-05" 17.52 45.14 37.34
 2 "India"        "India Rural"           "2004-05" 16.28 42.59 41.12
 1 "India"        "India"                 "2004-05" 14.25 42.53 43.22
10 "Social Group" "Sikh, Jain, Christian" "2011-12" 14.05 41.61 44.33
 9 "Social Group" "Muslim"                "2011-12" 18.34 42.85 38.81
 8 "Social Group" "Adivasi"               "2011-12" 15.31 41.53 43.16
 7 "Social Group" "Dalit"                 "2011-12" 18.22 44.12 37.65
 6 "Social Group" "OBC"                   "2011-12" 15.46 43.43 41.11
 5 "Social Group" "Forward Caste"         "2011-12" 13.79 45.33 40.88
 4 "Social Group" "Brahmin"               "2011-12" 13.48 45.89 40.63
 3 "India"        "India Urban"           "2011-12" 17.16 44.39 38.45
 2 "India"        "India Rural"           "2011-12" 15.73 42.29 41.98
 1 "India"        "India"                 "2011-12" 14.76 42.62 42.62
end

labmask seqCode, values(population)

reshape long _, i(seqCode year) j(which)
label def which 50 "0-50" 90 "50-90" 100 "90-100"
label val which which
label var which "some explanation here"

set scheme s1mono
tabplot year which [iw=_], horizontal by(seqCode, note("") compact t1title(Income shares) col(1)) ytitle("") showval(offset(0.5) mlabsize(medsmall) format(%2.1f)) subtitle(, nobox nobexpand fcolor(none) pos(9)) separate(which) xsc(r(0.8 .))
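
For comparison, the slow way mentioned above can be written out in full. This is a sketch that assumes the seqCode values and population names are exactly those in the data example; the two lines would replace the labmask line.

Code:

* attach each population name to its seqCode value by hand
label def seqCode 1 "India" 2 "India Rural" 3 "India Urban" 4 "Brahmin" 5 "Forward Caste" 6 "OBC" 7 "Dalit" 8 "Adivasi" 9 "Muslim" 10 "Sikh, Jain, Christian"
label val seqCode seqCode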


Anup Tyagi

Join Date: Sep 2024

Posts: 14
#24

04 Mar 2025, 20:01

Thanks, Nick. That works.