  • Creating a histogram with colors by a certain variable

    Dear Statalisters,

    I hope you are all well. I am just starting with descriptive statistics and would like a particular histogram for my project, to describe my results more visually than a table does. Please find my data below:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(country percent) long ind
    1  .7591934 1
    3 4.5177755 2
    2  4.839139 3
    3 2.1726158 4
    4  2.386139 5
    3  .4601172 6
    5  31.46051 7
    5  49.2879 8
    1  3.964197 9
    2 1.5125806 1
    2  7.757139 2
    3 3.7730794 3
    1  .1026992 4
    4 1.0024335 5
    1 .25116652 6
    1  74.40892 7
    4 4.4953227 8
    3  6.351722 9
    3  2.008643 1
    1  6.153969 2
    5  3.598485 3
    5 2.1380177 4
    2  60.14992 5
    3  .0933632 6
    1 12.157224 7
    end
    As you can see, country is an identifier and each number represents a country. percent is the variable I'd like to have on my y-axis. I know histograms usually show frequencies or densities, but is there a way to plot the percentage for each industry, denoted by the variable ind? In addition, I'd like a different color for each country, denoted by the variable country. What I would like should look something like this:



    Each color should represent one country and each bar within each color should be the percentage of industry x of country y.

    [Image: histogram.png]

    I hope I was clear enough. Thanks in advance for any advice.

    Best regards,

    Hugo
    Last edited by Hugo Denis; 28 Jun 2022, 12:29.

  • #2
    You have duplicates on country ind in the example dataset.

    I think your one-way layout won't work nearly as well as your graph example.

    Here is an alternative using tabplot from the Stata Journal. I've just summed over duplicates.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(country percent) long ind
    1  .7591934 1
    3 4.5177755 2
    2  4.839139 3
    3 2.1726158 4
    4  2.386139 5
    3  .4601172 6
    5  31.46051 7
    5  49.2879 8
    1  3.964197 9
    2 1.5125806 1
    2  7.757139 2
    3 3.7730794 3
    1  .1026992 4
    4 1.0024335 5
    1 .25116652 6
    1  74.40892 7
    4 4.4953227 8
    3  6.351722 9
    3  2.008643 1
    1  6.153969 2
    5  3.598485 3
    5 2.1380177 4
    2  60.14992 5
    3  .0933632 6
    1 12.157224 7
    end
    
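    * List observations that share the same country and ind
    duplicates list country ind , sepby(country ind)
    
    * Bar heights show percent (summed over duplicates via the importance weights); bars are coloured by country
    tabplot country ind [iw=percent], scheme(s1color) showval(format(%3.2f)) separate(country)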

    [Image: denis.png]


    You naturally may swap axes, change the display format or change the colours. Note that red and green together are not a good idea, as readers with red-green colour vision deficiency may not be able to tell them apart.
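
    The variations mentioned above might be sketched as follows. This is an illustration only, not part of the original answer: it assumes the example data are in memory and that tabplot is installed, and the bar1() to bar5() options and the colour names are my reading of the tabplot help file, so please check help tabplot before relying on them. The last part makes the summing over duplicates explicit with collapse and uses official graph bar as a further alternative, with one colour per country and bars grouped by industry.

```stata
* Sketch only: assumes the example data (country, percent, ind) are in memory
* and that tabplot is installed (try: search tabplot, sj).
* Run from a do-file, because /// line continuation does not work interactively.

* 1. Swap axes (industry on rows, country on columns), shorter display format
tabplot ind country [iw=percent], scheme(s1color) ///
    showval(format(%2.1f)) separate(country)

* 2. Choose one colour per country, avoiding a red-green pairing
*    (bar1() ... bar5() are assumed to follow the order of country values)
tabplot country ind [iw=percent], scheme(s1color) ///
    showval(format(%3.2f)) separate(country) ///
    bar1(color(navy)) bar2(color(orange)) bar3(color(teal)) ///
    bar4(color(purple)) bar5(color(brown))

* 3. Make the summing over duplicates explicit, then draw grouped bars
preserve
collapse (sum) percent, by(country ind)
* Check that country and ind now jointly identify the observations
isid country ind
* asyvars gives each country its own colour; bars are grouped by industry
graph bar (asis) percent, over(country) over(ind) asyvars ///
    ytitle("Percent") legend(title("Country"))
restore
```

    The preserve and restore pair keeps the original 25 observations intact, so the tabplot commands can be rerun afterwards on the uncollapsed data.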
