If you are a graphic designer you already know too much about beauty and aesthetics. If you want to play with data it’s time to grow up and learn some real graphs. I mean it. And I have the ideal book for you: Creating More Effective Graphs, by Naomi B. Robbins.
- Color: No color;
- Fancy graphs: None;
- Book cover: an obscure graph type;
- Recommended graphs: not easily implemented in Excel;
- Graph applications: corporate IT never ever heard of R or S;
- Aesthetics: What?
Probably 90% of all data visualization books published recently want to become coffee table books, and there is nothing inherently wrong with that. Creating More Effective Graphs, on the other hand, is more like the Ugly Duckling. Can it mature into a swan? Let’s find out.
- What is an effective graph? If we want to create more effective graphs, it’s a good starting point to define what we mean by “effective graph”. According to the author, “[o]ne graph is more effective than another if its quantitative information can be decoded more quickly or more easily by most observers.”
- Limitations of Some Common Charts and Graphs. Naomi B. Robbins discusses issues like 3D effects, the limitations of pie charts and stacked bars, and how to measure the real gap between curves in a line chart.
- Human Perception and Our Ability to Decode Graphs. This chapter presents the major findings of the well-known research by Cleveland and McGill about judgment accuracy when decoding quantitative information (for example, are we more accurate when judging length or slopes?).
- Some More Effective Graphs in One or Two Dimensions. Things are getting more interesting now. The author discusses graphs like dot plots, histograms, box plots, scatter plots and time series with line charts and cycle plots (instead of plotting the series in chronological order, plot all the values for one month across the years, then all the values for the next month, and so on). She also discusses techniques like sorting, jittering, using logarithmic scales and scatter plot smoothers like loess (Jon Peltier has a blog post on how to implement this in Excel). Robbins prefers dot plots to bar charts, and who can blame her?
- Trellis Graphics and Other Ways to Display More Than Two Variables. This chapter is dedicated to trellis displays, scatterplot matrices, mosaic plots, linked micromaps, parallel coordinate plots and the Nightingale rose.
- General Principles for Creating Effective Graphs. Robbins adapts Cleveland’s principles that lead to visual clarity and clear understanding. She says: “If I were asked the message of this book in five words, I would say ‘Make the data stand out.’ If limited to three, my words would be ‘Emphasize the data’”. This is similar to Tufte’s “Above all else, show the data”. The author also discusses how to de-clutter a chart by de-emphasizing grid lines and making them visually distinguishable from the data.
- Scales. Chapter 7 is all about scales, including scale breaks, logarithmic scales, zero in scales and the lesser-known “banking to 45°” (also proposed by Cleveland) as a rule to help define aspect ratio. “Zero in scales” (should a scale start at zero?) is a slippery issue, and I wouldn’t expect a definitive answer from the author. We all seem to agree that bar charts should start at zero due to perceptual issues, but line charts have no simple answer. The author discusses several examples of what can go wrong with scales and how to correct them. A chart is a visual translation of an underlying table, and if something is lost in translation that’s often because of poorly chosen scales. So, it’s a good idea to have a whole chapter on scales.
- Applying What We’ve Learned: Before and After Examples. Here the author shows several examples of charts that could be improved using trellis displays.
- Some Comments on Software. This chapter briefly discusses some of the software used to create charts, namely S, R, Illustrator and Excel.
- Questions and Answers. The author lists nine common questions and answers them.
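For readers who have never met “banking to 45°”, here is a rough sketch of the idea in Python. It uses a simplified heuristic (pick the aspect ratio that makes the median absolute slope of the line segments appear at 45°), which is my assumption about the simplest way to illustrate the rule, not necessarily the exact procedure Cleveland or Robbins describe. The function name `bank_to_45` and the sample data are made up for illustration.

```python
from statistics import median


def bank_to_45(xs, ys):
    """Return a height/width aspect ratio for a line chart.

    Simplified heuristic (an assumption, not the book's method): choose the
    ratio so that the median absolute slope of the segments is drawn at 45
    degrees, i.e. has an on-screen slope of 1.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")

    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if x_range == 0 or y_range == 0:
        raise ValueError("data must vary in both x and y")

    points = list(zip(xs, ys))
    # Absolute slope of each segment in data units; vertical and flat
    # segments are skipped because they cannot be banked.
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if x1 != x0 and y1 != y0
    ]
    if not slopes:
        raise ValueError("no sloped segments to bank")

    # On-screen slope = data slope * (height / width) * (x_range / y_range).
    # Setting the median on-screen slope to 1 and solving for height / width:
    return y_range / (x_range * median(slopes))


# Hypothetical zig-zag series: steep in data units, so the chart should be wide.
print(bank_to_45([0, 1, 2, 3, 4], [0, 10, 0, 10, 0]))  # 0.25
```

A ratio of 0.25 means the plot area should be four times as wide as it is tall, so that the zig-zag segments sit at roughly 45° instead of looking like a row of spikes.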
What I Like…
I have mixed feelings about this book. I like it because I belong to the same data visualization tribe, and that’s enough to buy the book. I like it because there is a full chapter on scales (if you want to lie with charts you start by forcing the axis to tell, well, a different truth). I like it because most of the book has a stronger scientific foundation (Cleveland’s research) than most data visualization books. There is a rational reason to select this graph and not that one.
… and What I Don’t
There are things I don’t like. We have to talk about color. It takes some courage to create a color-less data visualization book, but today (or even six years ago, when the book was published) you cannot ignore color if you want to create more effective graphs. Color is a powerful pre-attentive variable; it’s not just about aesthetics. And there isn’t a single relevant paragraph about color.
This book is not exactly a short novel, but we often feel that much remains to be said in each chapter. When the author writes about human perception, perhaps she could write a few paragraphs about our working memory. When she writes about Cleveland and McGill’s research, it would be nice to have a footnote telling us about David Simkin, Reid Hastie or Ian Spence. These authors tell us, for example, that Cleveland’s findings are not absolute (how we decode graphs depends on the task at hand).
The Tribe’s Credo
The tribe’s credo says that we should hate pie charts, and Naomi B. Robbins is in full compliance. I must be very dumb, but I really don’t understand why authors keep humiliating the poor pie chart. Sure, it’s easier to see the rank and the differences between data points using a bar chart or a dot plot.
But what is the total percentage of the top three values? Do you really want to use the dot plot to answer this question? Jon Peltier would say one word: “Pareto”. But that’s not the same thing (most people don’t know how to read a Pareto chart). It’s safe to say that a 2D bar chart is more effective than its 3D version, but you can’t say that a bar chart or a dot plot is more effective than a pie chart. First, they are not interchangeable; second, it all depends on the task at hand or the question you are asking.
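To make the “top three” question concrete, here is a small Python sketch of the running total that a Pareto chart draws as its cumulative line. The function name `cumulative_shares` and the five values are invented for illustration; they are not from the book.

```python
def cumulative_shares(values):
    """Return cumulative percentages of the total, largest value first."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive number")

    shares = []
    running = 0
    # Rank the values from largest to smallest, as a Pareto chart does,
    # and accumulate each one's share of the total.
    for value in sorted(values, reverse=True):
        running += value
        shares.append(100 * running / total)
    return shares


# Hypothetical category values (they happen to sum to 100).
shares = cumulative_shares([12, 40, 8, 25, 15])
print(shares)      # [40.0, 65.0, 80.0, 92.0, 100.0]
print(shares[2])   # 80.0, the share of the top three values
```

A dot plot or bar chart shows the individual values (40, 25, 15, ...) and leaves this sum to the reader; a pie chart shows the combined slice directly, and a Pareto chart shows it only if you know how to read the cumulative line.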
In section 2.5 there is an unfair comparison between bubble charts and dot plots.
Should You Buy It?
I actually liked Creating More Effective Graphs, and would buy it again, in spite of these details. But I’m intrigued: who is the target audience? Statisticians? Die-hard Tufte fans? Hard to know. Not the business user, that’s for sure.
PS: This is a review of the Kindle edition. I had this dumb idea of buying some data visualization books in the Kindle format – don’t make the same mistake, please. This book is much more expensive than many well-known data visualization books, but that’s not the author’s fault and I do not discuss that point here. But the Kindle version is more expensive than the printed one, and that’s absurd beyond words.