add a stacked bar chart and overlaying a line graph

Gabriel Temesgen

Join Date: Jul 2017
Posts: 73

13 Mar 2025, 01:47

Dear all,
I want to add a stacked bar chart and a line chart together. However, since the percentage values of the variables in the stacked bar are close, one variable is hidden and doesn't display in the bar chart (hh consumption annual change is not visible).
I put my code below. Can you please help me fix it? I want to replicate the graph attached below.

Stata code:
tsset time
graph twoway ///
(bar cons_gr time, pstyle(p4bar) barwidth(0.6)) ///
(bar gov_spend_gr time, pstyle(p3bar) barwidth(0.6)) ///
(bar gross_capital_formation_gr time, pstyle(p2bar) barwidth(0.6)) ///
(bar net_export_gr time, pstyle(p1bar) barwidth(0.6)) ///
(line gdp_gr time, pstyle(p6dot))

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input int time float(gross_capital_formation_gr gov_spend_gr cons_gr net_export_gr gdp_gr)
2000          .          .          .          .         .
2001   -4.57481  2.1087377  1.0260786  1.0260786  1.875098
2002   .3454278  1.3808725  1.6651844  1.6651844 2.9992554
2003  3.6846826  .59398466  -.5998673  -.5998673  1.806385
2004  4.4764614  -2.406074 -1.7193335 -1.7193335  3.092364
2005   5.841976 -1.7591026  -.7534954  -.7534954 3.2104545
2006   4.258789   .9061334   .2080552   .2080552  2.637944
2007  1.3221928 .009202804    .568028    .568028 2.0499046
2008   .6885871  2.4357965  -.4875244  -.4875244  .9954063
2009  -8.777685  11.385446   5.713411   5.713411 -2.915086
2010   6.933267 -2.2268941   -.997973   -.997973 3.0908065
2011  2.8509014 -1.5347315  -1.922189  -1.922189 3.1371944
2012   2.964998  -.4162595   .2103034   .2103034 1.7556614
2013  .16138093  -1.678759 -.10612205 -.10612205 2.3258135
2014 -.14841795 -2.1144104 -.29404652 -.29404652  2.873467
2015 -4.2182727   3.016126  3.6071465  3.6071465   .649971
2016 -4.4517307   .8152912  1.1682287  1.1682287  1.038551
2017   3.467971  -1.637724  -.8687419  -.8687419  3.033835
2018  -.7352638 -.10577374 -.15402387 -.15402387  2.742963
2019   -1.43253 -.08672807 -.22868852 -.22868852  1.908432
2020 -1.5957807   9.860924 -1.6115885 -1.6115885 -5.038233
2021   7.231438  -5.544974  -4.586067  -4.586067  5.286957
2022  4.3162904 -3.5007775  -.7903162  -.7903162 3.8198664
2023  -5.298913   2.488734  2.5555406  2.5555406 1.2489274
end

Attached Files

Graph_bar_line.gph (9.5 KB, 1 view)

Tags: graph, graphics, Suggestion, syntax, Time Series

Nick Cox

Join Date: Mar 2014

Posts: 35709
#2

13 Mar 2025, 01:55

I don't follow what the question is here.

1. Your data example is yearly but the graph shows quarterly data.

2. You're asking about stacked bars, but the code you show has each bar (by default) starting at zero, so bars would in general be superimposed not stacked.

3. Otherwise the problem you report is that some values are too small to show up well. That is a characteristic of the data and not a coding problem to be fixed.
Gabriel Temesgen

Join Date: Jul 2017

Posts: 73
#3

13 Mar 2025, 02:22

Dear Nick,
My graph is here. One of the components of the stacked bar is not visible. I understand my data are yearly. How can I see all the components of the stacked bar?
Best
Gabriel

Nick Cox

Join Date: Mar 2014
Posts: 35709

13 Mar 2025, 03:21

Your graph still doesn't correspond to the data example. The legend implies a variable Households_consumption_Growth which isn't in the data example.

But thanks for the extra details, which make your question clearer.

As said, your code doesn't stack bars at all. You are superimposing bars. As the household variable is plotted first, its value would show as the far end of a bar if and only if it exceeded all the other variables, which evidently doesn't happen.
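A quick way to see the superimposition, sketched here on the assumption that the data example from #1 is in memory: in the listing as posted, cons_gr and net_export_gr appear to hold the same values in every year, so when both bars start at zero, whichever is drawn later covers the other completely.

```stata
* Assumes the data example from #1 has been loaded with -input-.
* Count years in which the two series differ. In Stata, missing equals
* missing, so the all-missing year 2000 is not counted as a difference.
count if cons_gr != net_export_gr

* Overlaid (not stacked) bars: each bar runs from zero to its own value,
* and later plots are drawn on top of earlier ones.
twoway (bar cons_gr time, barwidth(0.6))        ///
       (bar net_export_gr time, barwidth(0.6))
```

If the count is zero, the second bar hides the first in every year, whatever colours are chosen.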

Although the design exemplified in #1 is quite popular, it seems to me far too hard to decode easily and effectively. It's natural economically but disconcerting graphically that components are sometimes negative and sometimes positive.

I would do something quite different. I am not an economist and may have decoded your variable names incorrectly, but no one should want ugly variable names like that to appear on a graph.

Otherwise the design you ask for is programmable with twoway, and I think it's been done before, but I will stop there for now.
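For readers who do want the stacked design, here is a minimal sketch of how it could be programmed with twoway rbar. It assumes the data example from #1 is in memory with its original variable names (so before any rename or reshape). The helper variables pos_top, neg_bot, lo# and hi# are names made up for this illustration. Positive components are stacked upwards from zero and negative ones downwards.

```stata
* Assumes the data example from #1 is in memory (original variable names).
local comps gross_capital_formation_gr gov_spend_gr cons_gr net_export_gr

gen double pos_top = 0      // running top of the positive stack
gen double neg_bot = 0      // running bottom of the negative stack

local i = 0
foreach v of local comps {
    local ++i
    * lower and upper edge of this component's bar segment
    gen double lo`i' = cond(`v' >= 0, pos_top, neg_bot + `v') if !missing(`v')
    gen double hi`i' = cond(`v' >= 0, pos_top + `v', neg_bot) if !missing(`v')
    * update the running totals (missing counts as large, so guard it)
    replace pos_top = pos_top + `v' if `v' >= 0 & !missing(`v')
    replace neg_bot = neg_bot + `v' if `v' < 0
}

twoway (rbar lo1 hi1 time, barwidth(0.6))            ///
       (rbar lo2 hi2 time, barwidth(0.6))            ///
       (rbar lo3 hi3 time, barwidth(0.6))            ///
       (rbar lo4 hi4 time, barwidth(0.6))            ///
       (line gdp_gr time, lwidth(thick)),            ///
       legend(order(1 "gross capital formation"      ///
                    2 "government spending"          ///
                    3 "consumption" 4 "net exports"  ///
                    5 "GDP") pos(6) rows(2))         ///
       xtitle("") ytitle("%")
```

Each segment has height equal to the absolute value of its component, so no segment can be hidden behind another, although a very small value will still be drawn as a very thin segment.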

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input int time float(gross_capital_formation_gr gov_spend_gr cons_gr net_export_gr gdp_gr)
2000          .          .          .          .         .
2001   -4.57481  2.1087377  1.0260786  1.0260786  1.875098
2002   .3454278  1.3808725  1.6651844  1.6651844 2.9992554
2003  3.6846826  .59398466  -.5998673  -.5998673  1.806385
2004  4.4764614  -2.406074 -1.7193335 -1.7193335  3.092364
2005   5.841976 -1.7591026  -.7534954  -.7534954 3.2104545
2006   4.258789   .9061334   .2080552   .2080552  2.637944
2007  1.3221928 .009202804    .568028    .568028 2.0499046
2008   .6885871  2.4357965  -.4875244  -.4875244  .9954063
2009  -8.777685  11.385446   5.713411   5.713411 -2.915086
2010   6.933267 -2.2268941   -.997973   -.997973 3.0908065
2011  2.8509014 -1.5347315  -1.922189  -1.922189 3.1371944
2012   2.964998  -.4162595   .2103034   .2103034 1.7556614
2013  .16138093  -1.678759 -.10612205 -.10612205 2.3258135
2014 -.14841795 -2.1144104 -.29404652 -.29404652  2.873467
2015 -4.2182727   3.016126  3.6071465  3.6071465   .649971
2016 -4.4517307   .8152912  1.1682287  1.1682287  1.038551
2017   3.467971  -1.637724  -.8687419  -.8687419  3.033835
2018  -.7352638 -.10577374 -.15402387 -.15402387  2.742963
2019   -1.43253 -.08672807 -.22868852 -.22868852  1.908432
2020 -1.5957807   9.860924 -1.6115885 -1.6115885 -5.038233
2021   7.231438  -5.544974  -4.586067  -4.586067  5.286957
2022  4.3162904 -3.5007775  -.7903162  -.7903162 3.8198664
2023  -5.298913   2.488734  2.5555406  2.5555406 1.2489274
end

drop if time == 2000

rename (gross-gdp_gr) y#, addnumber 

reshape long y, i(time) j(which) 

label def which 1 `" "gross capital" "formation" "' 2 `" "government" "spending" "' 3 "consumption" 4 "net exports" 5 "GDP"

label val which which 

separate y, by(which)

twoway bar y? time, ///
    by(which, legend(off) compact note("") l1title("%", orientation(horizontal)) col(1)) ///
    yla(-8(4)8) xla(2001(2)2023) ///
    subtitle(, pos(9) nobox nobexpand fcolor(none)) ///
    barw(0.9 ..) xtitle("")

[Attached image: gabriel1.png]


Chen Samulsion

Join Date: Jan 2018
Posts: 923

13 Mar 2025, 04:55

Although Nick Cox doesn't like the graph below, I still want to show it in case you need it. The trick lies in a user-written command, -genstack-, which you need to install from SSC. See also: https://www.statalist.org/forums/for...graph-in-stata

Code:

clear
input int time float(gross_capital_formation_gr gov_spend_gr cons_gr net_export_gr gdp_gr)
2000          .          .          .          .         .
2001   -4.57481  2.1087377  1.0260786  1.0260786  1.875098
2002   .3454278  1.3808725  1.6651844  1.6651844 2.9992554
2003  3.6846826  .59398466  -.5998673  -.5998673  1.806385
2004  4.4764614  -2.406074 -1.7193335 -1.7193335  3.092364
2005   5.841976 -1.7591026  -.7534954  -.7534954 3.2104545
2006   4.258789   .9061334   .2080552   .2080552  2.637944
2007  1.3221928 .009202804    .568028    .568028 2.0499046
2008   .6885871  2.4357965  -.4875244  -.4875244  .9954063
2009  -8.777685  11.385446   5.713411   5.713411 -2.915086
2010   6.933267 -2.2268941   -.997973   -.997973 3.0908065
2011  2.8509014 -1.5347315  -1.922189  -1.922189 3.1371944
2012   2.964998  -.4162595   .2103034   .2103034 1.7556614
2013  .16138093  -1.678759 -.10612205 -.10612205 2.3258135
2014 -.14841795 -2.1144104 -.29404652 -.29404652  2.873467
2015 -4.2182727   3.016126  3.6071465  3.6071465   .649971
2016 -4.4517307   .8152912  1.1682287  1.1682287  1.038551
2017   3.467971  -1.637724  -.8687419  -.8687419  3.033835
2018  -.7352638 -.10577374 -.15402387 -.15402387  2.742963
2019   -1.43253 -.08672807 -.22868852 -.22868852  1.908432
2020 -1.5957807   9.860924 -1.6115885 -1.6115885 -5.038233
2021   7.231438  -5.544974  -4.586067  -4.586067  5.286957
2022  4.3162904 -3.5007775  -.7903162  -.7903162 3.8198664
2023  -5.298913   2.488734  2.5555406  2.5555406 1.2489274
end

ssc install genstack
genstack gross_capital_formation_gr gov_spend_gr cons_gr net_export_gr, generate(gdp_)
twoway bar gdp_net_export_gr gdp_cons_gr gdp_gov_spend_gr gdp_gross_capital_formation_gr time, barwidth(0.6 0.6 0.6 0.6) || line gdp_gr time, lwidth(thick) xlabel(2001(2)2023) xtitle("") legend(pos(11) col(1))

[Attached image: Graph.png]


Gabriel Temesgen

Join Date: Jul 2017

Posts: 73
#6

13 Mar 2025, 06:20

Dear Nick Cox and Chen Samulsion, thank you very much.
Chen Samulsion, it works.
Best
Gabriel
Nick Cox

Join Date: Mar 2014

Posts: 35709
#7

13 Mar 2025, 06:59

Thanks to Chen Samulsion, whose mention of genstack was specific enough to remind me of a previous thread, "Combined graph in stata", which asks essentially the same question. As that thread shows, genstack is helpful but not essential.

But whatever "it" is -- the unstacked design I prefer or the stacked design coded by Chen -- if it works, then I don't understand the data. Consider 2009 and 2020 where GDP growth is negative but most components are positive. That seems inconsistent.

Either the data are incomplete or they aren't genuinely additive.

Stacking bars only makes sense for variables in the same units that are statistically additive. Superimposing a line graph only makes sense if the units are the same too.
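Additivity can be checked directly before any stacking. The following is a minimal sketch, assuming the data example from #1 is in memory with its original variable names; comp_sum and gap are illustrative names, not part of the thread's data.

```stata
* Assumes the data example from #1 is in memory (original variable names).
* Row sum of the four components; -missing- keeps the sum missing when
* all four components are missing (as in 2000).
egen double comp_sum = rowtotal(gross_capital_formation_gr gov_spend_gr ///
    cons_gr net_export_gr), missing

* Difference between the line series and the stacked total
gen double gap = gdp_gr - comp_sum

* If the components were simply additive, gap would be near zero throughout.
list time comp_sum gdp_gr gap if !missing(gap), noobs
summarize gap
```

In 2020, for instance, the four listed values sum to about 5.04 while gdp_gr is about -5.04, so the stacked total and the line point in opposite directions.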
Gabriel Temesgen

Join Date: Jul 2017

Posts: 73
#8

17 Mar 2025, 03:51

Thank you, Nick Cox, for the detailed clarification of when to use stacked bars and when to superimpose a line graph.
I will also check my data to see why the changes are positive in the COVID-19 pandemic period. The change in government spending is clear, as governments were using their fiscal instruments to stimulate the economy.
Thanks for suggesting the unstacked bars and for the clarification.
Best
Nick Cox

Join Date: Mar 2014

Posts: 35709
#9

17 Mar 2025, 03:57

Thanks for #8, but I don't think you're addressing my most important point. If the variables (other than GDP growth) are not additive, stacking them is invalid. If they are additive, why does GDP growth in 2009 and 2020 show negative when other components show net positive?
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1409
#10

17 Mar 2025, 04:42

The data are for the four components whose levels should add up to the level of GDP. It follows that their growth rates do not by themselves add up to the growth rate of GDP; rather, the individual growth rates, each weighted by the respective component's share of GDP, add up to the growth rate of GDP.
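The weighting can be illustrated with a small sketch. The levels below are hypothetical numbers invented for this example, not taken from the thread's data, and the variable names C, I, G, NX and Y are likewise only illustrative. Each growth rate is weighted by the component's share of GDP in the previous year.

```stata
* Hypothetical component levels (NOT the thread's data)
clear
input int year double(C I G NX)
2000 60 20 15 5
2001 63 19 16 4
end

gen double Y = C + I + G + NX      // levels add up to GDP
tsset year

* Percentage growth rates of each component and of GDP
foreach v in C I G NX Y {
    gen double g_`v' = 100 * D.`v' / L.`v'
}

* Contribution = growth rate times previous-year share of GDP
foreach v in C I G NX {
    gen double contrib_`v' = g_`v' * L.`v' / L.Y
}
gen double total = contrib_C + contrib_I + contrib_G + contrib_NX

list year g_C g_I g_G g_NX g_Y total, noobs
```

With these made-up numbers GDP grows by 2%, and the weighted contributions (3, -1, 1 and -1 percentage points) sum to exactly that, whereas the unweighted growth rates (5, -5, about 6.7 and -20) sum to about -13.3. Contributions of this kind can sensibly be stacked; raw growth rates cannot.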
Nick Cox

Join Date: Mar 2014

Posts: 35709
#11

17 Mar 2025, 05:26

As I often have cause to point out, I am not an economist and I certainly have no information about the data beyond that presented here. I was wondering whether the different variables were somehow estimates of the contribution to GDP growth rate. But once again, if that's not true, stacking is inappropriate.

Also if Hemanshu Kumar is correct, and I am inclined to believe #10, I still can't follow how 2009 and 2020 square up at all.
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1409
#12

17 Mar 2025, 06:49

"I still can't follow how 2009 and 2020 square up at all."

Basically the last column (the GDP growth rate) is a weighted average of the preceding four columns (where the weights are positive and between 0 and 1, since they are all shares of GDP). All that would rule out is a situation where all the component growth rates have one sign and the GDP growth rate has the opposite sign. That would not make sense. But some positives and some negatives can add up (with appropriate weights) to either a negative number for GDP growth overall, or a positive one.
Nick Cox

Join Date: Mar 2014

Posts: 35709
#13

17 Mar 2025, 08:25

Again, I understand the principle, quite thoroughly I believe, but please explain 2009 and 2020. In 2020, for example, the growth rate is more negative than the sum of the negative components.
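One way to make this concrete, taking at face value the description in #12 of GDP growth as a weighted average of the four components with weights between 0 and 1 that sum to 1: such an average can never lie outside the range spanned by the components. The sketch below assumes the data example from #1 is in memory with its original variable names; comp_min and comp_max are illustrative names.

```stata
* Assumes the data example from #1 is in memory (original variable names).
egen double comp_min = rowmin(gross_capital_formation_gr gov_spend_gr ///
    cons_gr net_export_gr)
egen double comp_max = rowmax(gross_capital_formation_gr gov_spend_gr ///
    cons_gr net_export_gr)

* A weighted average with weights in [0, 1] summing to 1 must lie
* between the smallest and the largest component. List the years
* in which gdp_gr falls outside that range.
list time comp_min comp_max gdp_gr ///
    if !missing(gdp_gr) & !inrange(gdp_gr, comp_min, comp_max), noobs
```

On the posted values, 2009 passes this check (GDP growth of about -2.9 lies between about -8.8 and 11.4), while 2020 does not (GDP growth of about -5.0 is below the smallest component, about -1.6).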
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1409
#14

17 Mar 2025, 09:07

I don't see the issue with 2009, but yes, 2020 does seem anomalous. I know at least one complication is that GDP is measured by different methods. The GDP number calculated using the income method, for example, does not exactly add up to the sum of these components (which is the sum of expenditures) because of the estimations and imputations used in these computations by the statistical authorities.
Nick Cox

Join Date: Mar 2014

Posts: 35709
#15

17 Mar 2025, 09:34

Hemanshu Kumar: Thanks for the extra detail. I get that you're trying to think of good reasons why results might appear contradictory, but saying that GDP growth data may be calculated in a way that is inconsistent with the other measures doesn't, to me, make any of these graphs appear more reasonable.

I am going to leave it there, but it's a problem for Gabriel Temesgen if the exercise doesn't make full sense. Better to find out now than later.
