Stupidest pie chart of the year 2011

I’ve just been looking at the Stupidest bar chart of the year 2011 and I’ve been inspired to submit my Stupidest pie chart of the year 2011. I won’t say where I found it, and I’ve redrawn the data myself to avoid annoying the original authors. Here it is:

It represents the amount of maternity and paternity leave available in different countries. Pie charts are often not a very good idea, but in this case they are the worst possible idea. Pie charts should be used when the whole pie has some meaning: a whole customer base, all the individuals living in Nottinghamshire, something like that. What does the whole of each of these pies represent? The total amount of leave across these countries. This has no real-world meaning at all, so the whole point of the pie chart is lost.

Even worse, underneath the pie chart they are forced to write “Spare a thought for parents in the USA and Sierra Leone… paid maternity leave 0 weeks, paid paternity leave 0 weeks.” because you cannot represent 0 on a pie chart! This should have set alarm bells ringing. One better way to plot these data:

These are absolutely factory-fresh, out-of-the-box settings. There are many ways to improve this plot, and other types of plot could be used instead. Even so, it improves on the pie charts in four ways:

1. It is easier to compare levels of leave between countries
2. It is easier to compare levels of each type of leave
3. There is no need for data labels spelling out the number of weeks in each country, which contribute to a very low data:ink ratio
4. It can display zero values, which puts the marginal note about the USA and Sierra Leone on the plot itself!

Code for both:


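# Draw the plots side by side, two per page
par(mfrow = c(1, 2))

# Weeks of leave, in the same order as `country`
maternity <- c(18, 16, 39, 17, 16, 0, 0)
paternity <- c(3, 3, 14, 3, 14, 0, 0)
country <- c("China", "Holland", "UK", "Greece", "France", "USA", "Sierra Leone")

# The pie charts: the zero-week countries end up with zero-width slices
pie(maternity, labels = paste(country, maternity, "weeks"), main = "Maternity leave")

pie(paternity, labels = paste(country, paternity, "weeks"), main = "Paternity leave")

# 2 minute bar chart

# Widen the bottom margin so the vertical country names fit
par(mar = c(10, 4, 4, 2) + 0.1)

# las = 3 writes the axis labels vertically
barplot(maternity, names.arg = country, ylab = "Maternity leave in weeks", las = 3)

barplot(paternity, names.arg = country, ylab = "Paternity leave in weeks", las = 3)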
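
As one example of the "many ways to improve this plot", here is a rough sketch of a single grouped bar chart that puts both types of leave next to each other for each country. The `leave` matrix, the sorting by total leave and the legend placement are my own choices for illustration, not part of the original chart, and I've repeated the data so the snippet runs on its own.

```r
# Same figures as in the code above, repeated so this sketch is self-contained
maternity <- c(18, 16, 39, 17, 16, 0, 0)
paternity <- c(3, 3, 14, 3, 14, 0, 0)
country <- c("China", "Holland", "UK", "Greece", "France", "USA", "Sierra Leone")

# One row per type of leave, one column per country
leave <- rbind(Maternity = maternity, Paternity = paternity)
colnames(leave) <- country

# Assumption: ordering countries by total leave makes them easier to rank
leave <- leave[, order(colSums(leave), decreasing = TRUE)]

# Single panel, with a wide bottom margin for the vertical country names
par(mfrow = c(1, 1), mar = c(10, 4, 4, 2) + 0.1)

# beside = TRUE draws the two types of leave as adjacent bars, not stacked
barplot(leave, beside = TRUE, las = 3, ylab = "Leave in weeks",
        legend.text = rownames(leave), args.legend = list(x = "topright"))
```

The zero-week countries still get a place on the axis, and the two types of leave can be compared within a country as well as between countries.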

New Year’s resolutions

It’s that time of year again, so what do I need to keep doing or do differently this year?

1. Keep being organised with Evernote and Getting Things Done. I found recently that I was processing so many bits of paper and information that I just gave up. Only the most important stuff got stored in my brain; the rest was discarded. Now that I have started using Evernote, guided by Getting Things Done, I have a massive brain pack to store everything I can’t use at the moment.

2. Produce reproducible reports. I think it’s a very important area of growth in statistical analysis and, moreover, I think it will make my work better (for a start, I’ll spend much longer commenting code and making it readable if it’s being published).

3. Learn something new. I need to learn various bits and pieces relating to interactive graphics this year (Tcl/Tk? Java? HTML5?) but I need to expand my statistical knowledge too. Likely candidates include bootstrapping and Bayesian analysis.

4. Publish, publish, publish. I’m sitting on too many datasets and pieces of work that should be published. Blogging is a good way of sharing small pieces of work and progress but it doesn’t require the care and attention which academic publication entails. I must publish more.