Uncombining data points when graphing longitudinal data with profileplot

Georgia Richards

Join Date: Apr 2020

Posts: 5
#1

27 May 2020, 21:40

Hi there,

I'm using the "profileplot" command to plot sales data (public expenditure, GBP per 1000 population) over time (from 2013 to 2018) for 31 countries.

My code thus far is:
profileplot cost_13_pop cost_14_pop cost_15_pop cost_16_pop cost_17_pop cost_18_pop, ///
by(Country2) graphregion(fcolor(white)) xtitle(" ") ///
xlabel(1 "2013" 2 "2014" 3 "2015" 4 "2016" 5 "2017" 6 "2018") msymbol(i) ///
legend(cols(1) pos(2) size(vsmall)) ///
ytitle("Public expenditure" "(£ per 1000 population)" " ")

which produces this graph:

I have four questions I require help with:
1. the graph seems to combine the 31 countries into 15 - is there a way around this or is 15 the max number the graph can handle?
2. can "mean" be removed from the legend?
3. can "Variables" be removed from the x-axis? I've tried using xtitle(" ") but this didn't work
4. how can I order the countries in the legend in descending order of expenditure rather than alphabetically (i.e. Ireland is the top line, so I want it to come first in the legend)?

Thanks for your time and advice,
Georgia
Nick Cox

Join Date: Mar 2014

Posts: 35754
#2

28 May 2020, 03:15

profileplot is a community-contributed command from https://stats.idre.ucla.edu/stat/stata/ado/analysis (as you are asked to explain: see section 12.1 of https://www.statalist.org/forums/help#stata).

It's just a wrapper for xtline, overlay, which you might just as well use directly. However, you are holding panel or longitudinal data in wide form, which isn't a good idea for most Stata purposes. I will come back to that later in this post.

What you are seeing is just a general default that Stata has for the number of legend elements.

You don't give a data example, but anyone can run this script and see the same problem:

Code:

clear
set obs 183
set seed 2803
egen year = seq(), from(2013) to(2018)
egen id = seq(), block(6)
gen y = exp(rnormal(0, 1))
xtset id year
xtline y, overlay

Frankly, your graph is likely to strike many readers as a mess. The problem is not in goodwill, but in people's sheer inability to untangle spaghetti.

If you worked harder at different line patterns as well as colours, the legend would be about twice as large, which is not the way to go.

I would use logarithmic scale for such data, assuming that there aren't any zeros.
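As a minimal sketch of the logarithmic scale suggestion, here is the simulated example again with a log y axis. The variables id, year and y belong to that simulation, not to the original poster's dataset, and the legend is switched off purely to keep the sketch readable.

```stata
* simulated panel data, as in the earlier script
clear
set obs 183
set seed 2803
egen year = seq(), from(2013) to(2018)
egen id = seq(), block(6)
gen y = exp(rnormal(0, 1))
xtset id year

* yscale(log) needs strictly positive values, hence the caveat about zeros
xtline y, overlay yscale(log) legend(off)
```

With real data it would be worth checking for zeros or missing values first, for example with count if y <= 0.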

https://www.stata-journal.com/articl...article=gr0080 covers some (but by no means all!) of what can be said here constructively. It's currently behind a paywall, but I imagine that the author would be willing to send you a copy if you send an email. (In olden days, people used to send little postcards to authors asking for paper reprints.)

I would consider 6 panels with about 5 countries each.
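A rough sketch of the several-panels idea, again on simulated data. The grouping rule here is an assumption made only for illustration: it splits 31 identifiers into 6 blocks by identifier order, whereas with real data grouping countries of similar level would probably be more helpful.

```stata
* simulated panel data, as in the earlier script
clear
set obs 183
set seed 2803
egen year = seq(), from(2013) to(2018)
egen id = seq(), block(6)
gen y = exp(rnormal(0, 1))

* hypothetical grouping: 6 panels of 5 or 6 identifiers each
gen panel = ceil(6 * id / 31)

* connect(L) joins successive points only while year is increasing,
* so each id gets its own line within a panel
sort panel id year
line y year, connect(L) by(panel, note(""))
```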

OR

using a "front-and-back plot" as discussed at https://www.statalist.org/forums/for...ailable-on-ssc That's quite a long thread, but anyone can skim and skip through it.

Your dataset isn't large. You could post it here using

Code:

help dataex
dataex cost_13_pop cost_14_pop cost_15_pop cost_16_pop cost_17_pop cost_18_pop Country2

If the call to dataex fails, then

1. You are using a version of Stata earlier than the present, and it's asked that you say so (https://www.statalist.org/forums/help#version)

2. You should just install dataex using ssc install dataex (https://www.statalist.org/forums/help#stata)

That done, however, a reshape long is a much better idea for these data, using say

Code:

reshape long cost_ , i(Country2) j(year) string
replace year = subinstr(year, "_pop", "", .)
destring year, replace
replace year = year + 2000

However, you may well have other variables, in which case you may need more detailed advice.
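To show where the reshape leads, here is a small self-contained sketch that goes from wide form to an xtline call. The three countries and all the values are made up for illustration; only the variable naming pattern follows the original post. The rename and encode steps are my additions: xtset needs a numeric panel identifier, so the string country name has to be encoded first.

```stata
clear
* toy wide-form data; the numbers are invented for illustration only
input str10 Country2 cost_13_pop cost_14_pop cost_15_pop
"Ireland" 50 55 60
"France"  20 22 21
"Spain"   10 12 15
end

* wide to long: year first arrives as "13_pop", "14_pop", ...
reshape long cost_, i(Country2) j(year) string
replace year = subinstr(year, "_pop", "", .)
destring year, replace
replace year = year + 2000
rename cost_ cost

* numeric panel identifier, carrying country names as value labels
encode Country2, gen(country)
xtset country year
xtline cost, overlay
```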

NOTE: I haven't looked hard at the code, but showing a mean is a default for profileplot.

Last edited by Nick Cox; 28 May 2020, 04:13.
Georgia Richards

Join Date: Apr 2020

Posts: 5
#3

29 May 2020, 03:49

Thanks Nick, this was super helpful.
I've reshaped the data to long form and used the subsetplot command.
My figure is as follows:

Thanks again!
Nick Cox

Join Date: Mar 2014

Posts: 35754
#4

29 May 2020, 05:44

Thanks for coming back with a report. What's the story with Ireland?

subsetplot (SSC) is considered superseded by fabplot (SSC).

The suggestion to use logarithmic scale remains, to which I add that alphabetical order is rarely best.
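For anyone wanting to try fabplot, a minimal sketch on simulated data follows. It assumes fabplot has been installed from SSC and that its basic syntax is a twoway command name followed by the y and x variables, with a by() option; please check help fabplot for the details. The variables id, year and y are simulated, not the original poster's.

```stata
* ssc install fabplot   // run once if fabplot is not yet installed

clear
set obs 183
set seed 2803
egen year = seq(), from(2013) to(2018)
egen id = seq(), block(6)
gen y = exp(rnormal(0, 1))

* one small panel per id: that id in front, all the others as backdrop
fabplot line y year, by(id)
```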