I have a confession to make: my past is paved with chart-making sins, including some capital ones (yes, 3D pie charts, too). But years ago I saw the light in Edward Tufte’s The Visual Display of Quantitative Information and since then I’ve been avoiding eye-candy temptations. Now I do my best to pursue the path of data visualization virtue.

Every God Has His Moses: Edward Tufte and Stephen Few

Some time after that first revelation, I stumbled on Stephen Few’s Show Me the Numbers and I thought: “wow, Tufte for business!”. As a father of twins, I know that good things come in pairs, and now I had two great role models to help my recent conversion.

Or should I say one and a half?

Edward Tufte and Stephen Few are often cited together, as if they were a single entity. For many of us, simple mortals, Stephen Few is some kind of translator of God’s voice. Given Few’s background, that wouldn’t be completely inappropriate…

For some time that’s how I looked at Few’s work on charts and data visualization. But I was wrong. They do share similar views about basic data visualization principles. And they seem to share the same level of stubbornness, too. But there is a major difference.

Tufte, the Artist vs. Few, the Engineer

Tufte is an artist. His data visualization principles derive from Ludwig Mies van der Rohe’s minimalism, and in that sense, he approaches charts from an aesthetic point of view. His charts are as beautiful as a chart can be, if you happen to like the aesthetic minimalism.

I don’t know how and when Few became aware of the need for better data visualization. But he embraced Tufte’s principles not because he is an aesthete like Tufte, but because he values efficiency and those principles happen to improve it.

Stephen Few would never title a book “Beautiful Evidence”. He doesn’t mind using Excel to create his chart examples, while Tufte needs full control of details like kerning (and he uses a designer’s tool, Adobe Illustrator).

On the other hand, Tufte would never write a book about dashboards (Beautiful Dashboards? brrrr…). From an actionable, business visualization point of view, Tufte is The Visual Display… Almost everything else is beautiful, yes, and perfect for the coffee table.

And while Tufte escaped Flatland for good, Few still keeps both feet firmly on the ground, discussing BI tools, pie charts or irregular time series (and I don’t think his new book changes that).

The Need for a New Business Visualization Model: the Emotional Link

Both approaches are very consistent and they give you a set of guidelines that you can apply to all your charts and adopt as a general framework.

What I am not comfortable with is their positivist attitude, especially in Few. Because Tufte’s charts are aesthetically pleasing, we can derive some emotion from that. In Few’s case, his charts are purely functional.

I still don’t know where to draw the line between purely rational/functional visualizations and the eye-candy. Let’s see this pattern:

Boy meets girl, boy gets girl, boy loses girl, boy gets girl back.

Do you feel emotionally overwhelmed? No? Do you even care about the story? Do you even care about the boy and the girl? Let’s try again:

John fell in love with Anna the moment she spilled coffee on his shirt.

This sounds much more interesting. Add three more sentences and you’ll complete the boy-meets-girl pattern. Both versions share the same pattern, but the second one adds some (perhaps irrelevant) detail and creates an emotional link between the audience and the characters.

You need that in data visualization, too. You don’t have to cry because your chart shows a market share drop in Alaska, but you must connect with the reality behind the chart and the data.

The Need for a New Business Visualization Model: Interaction

Jacques Bertin says that knowledge is built by the user when interacting with the chart. Why interaction (and animation) is absent from Tufte’s and Few’s books is something I don’t really understand.

Although I respect Tufte and Few, I feel that there are pieces missing in their theories. We can borrow some pieces from Bertin’s work (and Tukey’s?) and that will surely help, but the real issue here is to find the balance between the need to correctly (bureaucratically?) display the data and the emotional response that helps to keep the audience interested.

Back to you, with a very simple question: what are Tufte and/or Few missing? What pieces do we need for 21st-century visualization?

Photo credits: ~L. and David Zellaby.

{ 18 comments }

In a recent article for the New York Times, Paul Krugman, the 2008 winner of the Nobel Prize in Economics, writes:

“The banking industry that emerged from that collapse [the Great Depression] was tightly regulated, far less colorful than it had been before the Depression, and far less lucrative for those who ran it. Banking became boring, partly because bankers were so conservative about lending (…).Strange to say, this era of boring banking was also an era of spectacular economic progress for most Americans.”

Now that history is repeating itself, I believe that this applies to data visualization too. The 3D pie chart with pseudo-realistic textures, charting tools like Dundas, Crystal Xcelsius and Excel 2007’s charting engine, they all share the same spirit of the times that nurtured the sub-prime lending mess and all that followed. The spirit of the times that rewards illusory short-term results and effectively dismisses consistent, well-founded, long-term strategies.

Can’t We Learn?

We may be scared of the future, but are we scared enough? Krugman again:

“Despite everything that has happened, most people in positions of power still associate fancy finance with economic progress. Can they be persuaded otherwise? Will we find the will to pursue serious financial reform? If not, the current crisis won’t be a one-time event; it will be the shape of things to come.”

Many business managers still associate fancy charts with serious decision-supporting tools. This is the right time to change. Eye-candy, “professional looking” charts are sub-prime charts, and if you take them seriously, they’ll do to your business what sub-prime lending is doing to the world economy.

Take a Chart Stress Test

Good charts are invisible. If your audience’s first comments go to your chart format and design, that’s a sure sign that something is wrong. Get back to your charting tool and create a new chart. Do it as many times as necessary. The audience must see and comment on the data patterns only, not the chart.

Charts don’t have to be boring. “If the statistics are boring, then you’ve got the wrong numbers,” says Tufte. If you need your daily adrenaline shot, get it from the insights a good chart provides, not from the chart design.

What do you think? Is this crisis creating a serious “back to the basics” spirit that will influence the way organizations optimize their resources, including the time they spend creating useless charts and presentations?

Photo credit: Steve Kay

{ 5 comments }

Some years ago, as part of my (then) new job, I had to maintain a monthly updated Excel dashboard. It was a maintenance hell, I hated it, but I couldn’t change it because of my poor Excel skills.

“This is stupid, there must be a better way”, I kept saying to myself.

So, I searched, and searched, and searched, and within a few months I became the most skillful Excel user in my company and I could solve my initial problem. An all-day update turned into a ten-minute task. I revamped the entire dashboard, but I kept the same user interface.

An Excel Dashboard is a Jigsaw Puzzle. Learn How to Solve It.

Back then, I was able to use some of the more common formulas, like most Excel users do. But if you want to create Excel dashboards you must understand how everything fits and works together. If you don’t, expect nothing less than a spreadsheet hell. You should never underestimate that.

I hate repetitive and time-consuming tasks, and I avoid them like the plague. If I suspect that a co-worker, after asking for a report, will come back and ask for different scenarios, I usually offer that functionality from the start. It’s a win-win situation: it doesn’t take longer than a static report, I avoid extra work and the user loves to play with his/her new toy. :)

Make Sure You Market Your Skills

As I said, I kept the same user interface in that dashboard. It proved to be a huge mistake, from a personal marketing/career perspective. After all these years, I still believe I did a remarkable job but, because I kept all the changes behind the scenes, they went unnoticed by the users, including the boss. Well, marketing and promotion is a big step out of my comfort zone, and office politics is not exactly what I do best…

Lessons Learned

That old dashboard was created by an above-average Excel user, but he failed to understand this basic concept: an Excel dashboard is a jigsaw puzzle, and fewer pieces make it simpler to solve (for example, a simple pivot table can often replace dozens of formulas).

Go beyond the individual formulas. Create a project that forces you to learn how they work together (that’s what my dashboard tutorial is all about).

And never make your outstanding job invisible. Use your Excel skills to work less, but make sure that too much Excel will not harm your career.

{ 0 comments }

If you are a market researcher, and you want to make sure that you get more reliable results for a subgroup in a survey, what do you do? You must increase the overall sample size (and spend a lot of money), right?

Actually, you don’t.

You can oversample that group only, and then weight it down to its known proportion in the population. For example, you may want to increase the number of managers and decrease the number of housewives (because the former are usually more heterogeneous than the latter). Oversampling is a common research method, and a very cost-effective way to get precise estimates for a subgroup.
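To make the "weight it down" step concrete, here is a minimal sketch of the arithmetic. All the numbers (population shares, sample sizes, satisfaction scores) and all the names (`Stratum`, `design_weights`, `weighted_mean`) are made up for illustration; a real survey would probably need a more careful weighting scheme.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    population_share: float  # known proportion of the group in the population
    sample_size: int         # interviews actually collected (hypothetical)
    sample_mean: float       # mean of the measured variable (hypothetical)


def design_weights(strata):
    """Weight per respondent = population share / sample share."""
    total = sum(s.sample_size for s in strata)
    return {s.name: s.population_share / (s.sample_size / total) for s in strata}


def unweighted_mean(strata):
    """Naive mean that ignores the oversampling (biased toward the big subgroup)."""
    total = sum(s.sample_size for s in strata)
    return sum(s.sample_size * s.sample_mean for s in strata) / total


def weighted_mean(strata):
    """Mean after weighting every group back to its population share."""
    weights = design_weights(strata)
    num = sum(weights[s.name] * s.sample_size * s.sample_mean for s in strata)
    den = sum(weights[s.name] * s.sample_size for s in strata)
    return num / den


if __name__ == "__main__":
    # Assumed example: managers are 10% of the population but 40% of the sample.
    strata = [
        Stratum("managers", 0.10, 400, 8.0),
        Stratum("everyone else", 0.90, 600, 5.0),
    ]
    print(design_weights(strata))
    print(round(unweighted_mean(strata), 2), round(weighted_mean(strata), 2))
```

In this made-up case each manager counts for a quarter of a respondent and everyone else for one and a half, so the subgroup estimate is built on 400 interviews while the overall figure still reflects the true 10/90 split.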

This is a real-world solution, and if we have finite resources to solve a real-world problem, resource allocation must be part of the equation. Higher variability usually demands more resources.

Why is this relevant in a blog about charts and information visualization? Glad you asked.

The Great Irregular Interval Debate

Let me give you an example. A while back, Jon Peltier wrote in his blog:

I don’t understand the obsession with an equal date interval. A line chart need not show the trend of only evenly-spaced data. Suppose I am observing temperatures, and I decide for simplicity that where the temperature hasn’t changed, or where it has been changing steadily, I do not need to record every value. Overnight after the temperature has dropped, I can characterize my temperature profile with one point per hour. As the sun rises, I may need more frequent recordings to capture the morning warm up. Then the clouds blow over, it starts to rain, then it clears up again; I may need minute-by-minute data points to track this. When I make my plot, is it any less relevant because the spacing of the data ranges from minutes to hours?

This is oversampling, and a wise resource allocation, too. In a survey, you weight the subgroup down to its right proportion, and that’s also what you do in a chart, when irregular date intervals are displayed proportionally.

Stephen Few disagrees:

Using a line to connect values along unequal intervals of time or to connect intervals that are not adjacent in time is misleading.

Furthermore:

How could we trust graphical representations of time series or frequency distributions if their shapes could have been altered by inconsistently manipulating the sizes of intervals along the scale, either arbitrarily or intentionally to deceive? We can derive meaning from patterns and trends that these graphs display only if the intervals are consistent.

wrong-line-chart

He exemplifies his argument with these two charts (actually, there are three, but we can safely disregard the third one).

The first chart displays the correct annual sales. The second one displays arbitrarily grouped annual sales and, obviously, its pattern is quite different.

Now, the second chart is plain wrong, so I am not sure if you can use it to argue against unequal intervals.

corrected-line-chart

Let’s use a fairer example with the same dataset and the same arbitrary grouping.

Compare the orange line with Few’s first chart. I actually don’t see much difference. Sure, you lose a lot of detail, but the basic pattern is there. Instead of sums, I am using averages (you can’t compare a single year with the total sales of three or four years).

The other two lines show the difference between equal and unequal intervals. The brown line displays the data points unequally spaced while the gray one uses equal intervals (Few’s second chart). I had to make some assumptions regarding the reference date, so this is not the best example, but it is good enough to show the potential risk of using equal intervals with unequal intervals of time.
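Here is a rough sketch of the two choices discussed above: averaging instead of summing within each group, and placing each group at a position proportional to time. The sales figures, the group sizes and the function name `group_irregular` are invented (this is not Few's dataset), and using the midpoint of each interval as the reference date is just one possible assumption.

```python
def group_irregular(years, values, group_sizes):
    """Collapse an annual series into consecutive groups of unequal length.

    Returns one (midpoint_year, total, average_per_year) tuple per group.
    The midpoint is an assumed reference date for plotting the group
    on a proportional time axis.
    """
    if len(years) != len(values) or sum(group_sizes) != len(values):
        raise ValueError("group sizes must cover the whole series")
    grouped = []
    start = 0
    for size in group_sizes:
        chunk_years = years[start:start + size]
        chunk_values = values[start:start + size]
        total = sum(chunk_values)
        midpoint = (chunk_years[0] + chunk_years[-1]) / 2
        grouped.append((midpoint, total, total / size))
        start += size
    return grouped


if __name__ == "__main__":
    years = list(range(2000, 2010))
    sales = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19]  # made-up annual sales
    for midpoint, total, average in group_irregular(years, sales, [1, 3, 2, 4]):
        print(midpoint, total, average)
```

With these numbers the totals mostly reflect how many years landed in each group, while the averages keep the steady upward trend of the annual series. Plotting the averages against the midpoints, rather than against evenly spaced categories, keeps the slopes honest as well.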

Bottom line, oversampling is a useful method for better resource allocation. We can view irregular time series as some sort of oversampling, provided there are no missing values and irregular intervals in the chart are consistent with intervals in the time series.

Grouping data points is always a tricky issue, and Stephen Few shows it clearly, but we shouldn’t infer that “line graphs and irregular intervals is an incompatible partnership.”

(When using time series in Excel, make sure that category axis labels are recognized as dates. Alternatively, use a scatter plot with connected data points.)

{ 8 comments }

poverty-ratios-skyscraper

Textures. 3D. Pie charts. Primary colors. Trends hidden behind labels. Backgrounds. Pie charts again.

Clear signs of a bad chart, right? Right. It is so easy to spot a badly designed chart that you can use a computer to do it. Don’t waste your time.

Let’s stop discussing the obviously wrong and start discussing the useless right. Like this chart here. (I’ve borrowed the dataset Nathan used in one of his visualization challenges – some interesting entries and great discussion there, by the way).

There may not be anything really, really wrong with this chart, but it reflects a bureaucratic way of thinking about data and data presentation where every single data point must be clearly shown and labeled. Just like a table.

Listen, unless you work for a statistics office, you should never create a chart like this. I know, it’s irresistible to check how well my state ranks, but identifying each and every data point in a virtually limitless bar chart makes no sense in most cases.

Do you read the labels between the top five and the bottom five? Charts like this encourage lookup of individual data points, and for that a table is probably a better option. If anything, a skyscraper bar chart is a clear sign of loss aversion.

A Flexible Bar Chart: Introducing the Accordion Bar Graph

How do you graph a categorical variable with more than, say, 20 data points without creating a skyscraper? This is what I have in mind:

  • You must retain the overall pattern, so you can’t remove data from the chart;
  • Create one or more focus areas (top five and bottom five, for example);
  • Gaps between bars should be larger in these focus areas, so that labels can easily be added;
  • Minimize the height of the remaining bars and remove the labels.

The chart should look like this:

focus-context-bar-chart
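The rules above can be sketched as a small layout routine that decides, for each bar, where it sits, how tall it is and whether it gets a label. This is only an illustration of the idea, not how the chart above was built: the function name `accordion_layout`, its parameters and the default sizes are all assumptions, and the actual drawing is left to whatever charting tool you use.

```python
def accordion_layout(items, focus=5, focus_height=1.0, context_height=0.2,
                     focus_gap=0.5, context_gap=0.0):
    """Compute bar positions for an accordion bar chart.

    items: list of (label, value) pairs.
    Every data point is kept, sorted from highest to lowest value.
    The top `focus` and bottom `focus` bars get full height, a gap and
    a label; the bars in between are squeezed and left unlabeled.
    """
    ranked = sorted(items, key=lambda pair: pair[1], reverse=True)
    n = len(ranked)
    rows = []
    y = 0.0
    for i, (label, value) in enumerate(ranked):
        in_focus = i < focus or i >= n - focus
        height = focus_height if in_focus else context_height
        gap = focus_gap if in_focus else context_gap
        rows.append({
            "label": label if in_focus else None,  # context bars lose their labels
            "value": value,
            "y": y,           # offset of the bar from the top of the chart
            "height": height,
        })
        y += height + gap
    return rows


if __name__ == "__main__":
    # Made-up data: 30 categories with arbitrary values.
    data = [(f"State {i}", float(i)) for i in range(30)]
    for row in accordion_layout(data):
        print(row)
```

With the default sizes, 30 bars need about 19 units of height instead of the 45 a regular labeled bar chart would take, and the overall pattern is still there because no data point is dropped.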

 

I like the accordion metaphor and I’m playing with it. An interactive version could use a simple event to create a focus inside the context area, so when the user moves the mouse the bar is enlarged and the label is shown.

What do you think? Do you agree that skyscraper bar charts are (almost) useless or should we focus on reducing the number of data points instead? How would you improve this design? Please share your comments and charts below.

Update

Well, if you want to know how to do this in Excel and read a great discussion about it, Jon wrote Accordion Chart for Jorge. He not only discusses some of the options but also shares the Excel file with us. Thanks Jon! And Dick, over at Daily Dose of Excel, wants to make sure that your state is automatically highlighted (Ego Charts). Nice “quarter step”!

{ 22 comments }

No, traditional charts are useless in our complex world

playfair-piechart

Over the next 25 years, we will need new visualization tools to replace traditional charts.

As you know, line, bar and even pie charts first appeared 200 years ago, with William Playfair, and perhaps until 25 years ago, they were good enough at helping us make sense of our data. Before computers, they were crafted by graphic designers. Kids in schools drew them using millimetric paper.

Lotus 123 and Harvard Graphics were the most popular charting tools in the early days of personal computers. With those tools (and later, with Excel), the charting landscape changed forever. Some charts vanished, either because they weren’t simple enough and/or didn’t make it into the chart gallery (I miss trilinear plots – yes, Jon, I know how to create them in Excel, but still…), while others should never have been allowed into that gallery.

{ 7 comments }

While playing with some county-level data, I stumbled upon what seems to be a secret message hidden in a bubble chart:

triangle-cook-county

Just call me paranoid, but let me ask you this: what is a triangle doing there and why on Earth would the hole in the middle point exactly to Cook County IL, home of President Barack Obama? Isn’t that weird?

I decided to dig a little deeper and visited Cook County’s website, and look what I found on the home page:

dreadful-pie

A sophisticated device that looks like a dreadful pie chart!

Who could be using Excel charts to send secret messages? Are these charts connected, two pieces of a dangerous puzzle? What are their intentions? I must find out.

Someone is knocking on my door…

(Well, this is just another bug in Excel 2007… In my more serious next post I’ll answer this question: “Are traditional charts dying?”. Stay tuned!)

{ 1 comment }

Now is the winter of our discontent
Made glorious summer by this sun of York;
And all the clouds that lour’d upon our house
In the deep bosom of the ocean buried.

By the way, black & white is also a great starting point for better charts.

{ 6 comments }

Hans Rosling

by Jorge

Hans Rosling was here in Lisbon today, for one of his remarkable presentations. It seems that almost no one in the room knew about his TED talks and, of course, everyone loved his charts. He gave his presentation in Portuguese, so some extra points there too…

If you have just returned to planet Earth and don’t know who Hans Rosling is, let me briefly discuss his role in the information visualization field.

Professor Hans Rosling became well-known around three years ago because of his remarkable presentation at TED (you can find it here). He was invited again the following year, and in his new presentation his slogan “seemingly impossible is possible” is defined in a memorable ending (you must see for yourself).

Rosling co-founded Gapminder, “a non-profit venture promoting sustainable global development and achievement of the United Nations Millennium Development Goals by increased use and understanding of statistics and other information about social, economic and environmental development at local, national and global levels” (from the About page).

Gapminder developed Trendalyzer, a charting tool that basically shows a time series in an animated bubble chart. Audiences love that, and with Rosling describing what is happening it is a quite impressive experience.

After the TED presentations Google acquired Trendalyzer and a stripped-down version can be used in the Google spreadsheet. A while back I used it to display population trends. Click on the image below to open the chart.

Dependencies Young vs. Old

Dollar Street: Life Behind Statistics

Another interesting application created by Gapminder is Dollar Street (you can download it here). We are often unaware of the reality hidden behind statistics. For example, how do people live with less than a dollar a day? In Dollar Street, you can select a house (an income level) and you can see photo-panoramas of each room in the household. There is a talk at Google TechTalks by Rosling’s son, Ola, where he presents Dollar Street.

Hans Rosling and Al Gore

It is interesting to note that, while Hans Rosling became famous because of his TED presentations using a new charting tool, Al Gore was awarded the Nobel Peace Prize. Both men use visuals extensively to raise awareness of issues like poverty and global warming. Al Gore uses presentation software (Keynote) and his presentations were designed by Duarte. You can see Al Gore at TED here.

Hans Rosling and Business Visualization

Several of my co-workers wanted to discuss with me the use of Trendalyzer-like charts in their presentations. Trendalyzer creates very eye-catching charts, so that’s understandable.

I had to explain that, while these displays are much better than the usual PowerPoint presentations, they need a fairly long time series and there must be some kind of global trend. Wasting time looking at bubbles jumping up and down is not exactly my idea of fun (or work). And they will become boring.

More important than that: most organizations don’t really know what information visualization is about. They don’t know how to use charts to find actionable patterns in the data. They don’t know how to use charts to communicate those patterns. They are handcuffed to the 3D flying and exploded pie chart paradigm. Replacing that paradigm needs a clear assessment of corporate needs, a long-term commitment, a definition of best practices and, obviously, some training. In the absence of these, animated bubbles are just a new fad.

{ 0 comments }

clown

Why do people insist on using “professional looking charts” in their presentations? If I wanted to divert the audience’s attention from the data, I would get a professional clown suit, instead. I would look professional. Not exactly the professional-looking presenter people expect in a corporate environment, but nevertheless a professional.

Meet professional-looking Mr. and Mrs. Gulliver and their Lilliputian friends (courtesy of SmartDraw):

bad-bar-population-chart

(This may be a pet peeve of mine, but whenever I hear the expression “professional-looking charts” I reach for my Browning.)

{ 15 comments }