In a recent article for the New York Times, Paul Krugman, the 2008 winner of the Nobel Prize in Economics, writes:

“The banking industry that emerged from that collapse [the Great Depression] was tightly regulated, far less colorful than it had been before the Depression, and far less lucrative for those who ran it. Banking became boring, partly because bankers were so conservative about lending (…).Strange to say, this era of boring banking was also an era of spectacular economic progress for most Americans.”

Now that history is repeating itself, I believe that this applies to data visualization too. The 3D pie chart with pseudo-realistic textures and charting tools like Dundas, Crystal Xcelsius and Excel 2007’s charting engine all share the same spirit of the times that nurtured the sub-prime lending mess and all that followed: a spirit that rewards illusory short-term results and effectively dismisses consistent, well-founded, long-term strategies.

Can’t We Learn?

We may be scared of the future, but are we scared enough? Krugman again:

“Despite everything that has happened, most people in positions of power still associate fancy finance with economic progress. Can they be persuaded otherwise? Will we find the will to pursue serious financial reform? If not, the current crisis won’t be a one-time event; it will be the shape of things to come.”

Many business managers still associate fancy charts with serious decision-supporting tools. This is the right time to change. Eye-candy, “professional looking” charts are sub-prime charts, and if you take them seriously, they’ll do to your business what sub-prime lending is doing to the world economy.

Take a Chart Stress Test

Good charts are invisible. If your audience’s first comments go to your chart format and design, that’s a sure sign that something is wrong. Get back to your charting tool and create a new chart. Do it as many times as necessary. The audience must see and comment on the data patterns only, not the chart.

Charts don’t have to be boring. “If the statistics are boring, then you’ve got the wrong numbers,” says Tufte. If you need your daily adrenaline shot, get it from the insights a good chart provides, not from the chart design.

What do you think? Is this crisis creating a serious “back to the basics” spirit that will influence the way organizations optimize their resources, including the time they spend creating useless charts and presentations?

Photo credit: Steve Kay

{ 5 comments }

Some years ago, as part of my (then) new job, I had to maintain a monthly updated Excel dashboard. It was a maintenance hell, I hated it, but I couldn’t change it because of my poor Excel skills.

“This is stupid, there must be a better way”, I kept saying to myself.

So, I searched, and searched, and searched, and within a few months I became the most skillful Excel user in my company and I could solve my initial problem. An all-day update turned into a ten-minute task. I revamped the entire dashboard, but I kept the same user interface.

An Excel Dashboard is a Jigsaw Puzzle. Learn How to Solve It.

Back then, I was able to use some of the more common formulas, like most Excel users do. But if you want to create Excel dashboards you must understand how everything fits and works together. If you don’t, expect nothing less than a spreadsheet hell. You should never underestimate that.

I hate repetitive and time-consuming tasks, and I avoid them like the plague. If I suspect that a co-worker, after asking for a report, will come back and ask for different scenarios, I usually offer that functionality from the start. It’s a win-win situation: it doesn’t take longer than a static report, I avoid extra work and the user loves to play with his/her new toy. :)

Make Sure You Market Your Skills

As I said, I kept the same user interface in that dashboard. It proved to be a huge mistake from a personal marketing and career perspective. After all these years, I still believe I did a remarkable job but, because I kept all the changes behind the scenes, they went unnoticed by the users, including the boss. Well, marketing and promotion are a big step out of my comfort zone, and office politics is not exactly what I do best…

Lessons Learned

That old dashboard was created by an above-average Excel user, but he failed to understand this basic concept: an Excel dashboard is a jigsaw puzzle, and fewer pieces make it simpler to solve (for example, a simple pivot table can often replace dozens of formulas).

Go beyond the individual formulas. Create a project that forces you to learn how they work together (that’s what my dashboard tutorial is all about).

And never make your outstanding job invisible. Use your Excel skills to work less, but make sure that too much Excel does not harm your career.

{ 0 comments }

If you are a market researcher, and you want to make sure that you get more reliable results for a subgroup in a survey, what do you do? You must increase the overall sample size (and spend a lot of money), right?

Actually, you don’t.

You can oversample that group only, and then weight it down to its known proportion in the population. For example, you may want to increase the number of managers and decrease the number of housewives (because the former are usually more heterogeneous than the latter). Oversampling is a common research method, and a very cost-effective way to get precise estimates for a subgroup.
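The weighting step can be sketched in a few lines of Python. This is only an illustration: the group names, sample sizes, population shares and scores below are invented, not taken from a real survey.

```python
# Hypothetical survey: managers are oversampled relative to their
# population share, then weighted back down for the overall estimate.

# Known population proportions (assumed for this sketch).
population_share = {"managers": 0.10, "housewives": 0.30, "others": 0.60}

# Interviews actually collected (managers deliberately oversampled).
sample_size = {"managers": 300, "housewives": 150, "others": 550}

# Mean answer per group on some 0-10 scale (made-up numbers).
group_mean = {"managers": 6.0, "housewives": 4.0, "others": 5.0}

total_sample = sum(sample_size.values())

# Weight = population share / sample share. Oversampled groups get a weight below 1.
weights = {
    g: population_share[g] / (sample_size[g] / total_sample)
    for g in sample_size
}

# The unweighted mean treats every interview equally, so it is pulled
# towards the oversampled group.
unweighted_mean = (
    sum(group_mean[g] * sample_size[g] for g in sample_size) / total_sample
)

# The weighted mean restores each group's population proportion.
weighted_mean = sum(
    group_mean[g] * sample_size[g] * weights[g] for g in sample_size
) / sum(sample_size[g] * weights[g] for g in sample_size)

print(f"unweighted: {unweighted_mean:.2f}, weighted: {weighted_mean:.2f}")
```

The managers still contribute 300 interviews to their own subgroup estimate, which is where the extra precision comes from; the weights only matter when the groups are combined.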

This is a real-world solution, and if we have finite resources to solve a real-world problem, resource allocation must be part of the equation. Higher variability usually demands more resources.

Why is this relevant in a blog about charts and information visualization? Glad you asked.

The Great Irregular Interval Debate

Let me give you an example. A while back, Jon Peltier wrote in his blog:

I don’t understand the obsession with an equal date interval. A line chart need not show the trend of only evenly-spaced data. Suppose I am observing temperatures, and I decide for simplicity that where the temperature hasn’t changed, or where it has been changing steadily, I do not need to record every value. Overnight after the temperature has dropped, I can characterize my temperature profile with one point per hour. As the sun rises, I may need more frequent recordings to capture the morning warm up. Then the clouds blow over, it starts to rain, then it clears up again; I may need minute-by-minute data points to track this. When I make my plot, is it any less relevant because the spacing of the data ranges from minutes to hours?

This is oversampling, and a wise resource allocation, too. In a survey, you weight the subgroup down to its right proportion, and that’s also what you do in a chart, when irregular date intervals are displayed proportionally.

Stephen Few disagrees:

Using a line to connect values along unequal intervals of time or to connect intervals that are not adjacent in time is misleading.

Furthermore:

How could we trust graphical representations of time series or frequency distributions if their shapes could have been altered by inconsistently manipulating the sizes of intervals along the scale, either arbitrarily or intentionally to deceive? We can derive meaning from patterns and trends that these graphs display only if the intervals are consistent.

wrong-line-chart

He exemplifies his argument with these two charts (actually, there are three, but we can safely disregard the third one).

The first chart displays the correct annual sales. The second one displays arbitrarily grouped annual sales and, obviously, its pattern is quite different.

Now, the second chart is plain wrong, so I am not sure if you can use it to argue against unequal intervals.

corrected-line-chart

Let’s use a fairer example with the same dataset and the same arbitrary grouping.

Compare the orange line with Few’s first chart. I actually don’t see much difference. Sure you lose a lot of detail, but the basic pattern is there. Instead of sums, I am using averages (you can’t compare a single year with the total sales of three or four years).
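Here is a rough Python sketch of why averages, not sums, make unequal groups comparable. The sales figures and the grouping are made up (this is not Few’s dataset), and using the midpoint of each group as its position on the time axis is my own assumption about the reference date.

```python
# Made-up annual sales; not the dataset from Few's example.
annual_sales = {
    2000: 10, 2001: 12, 2002: 14, 2003: 13,
    2004: 15, 2005: 18, 2006: 20, 2007: 19, 2008: 22,
}

# Arbitrary, unequal groups of years (inclusive ranges).
groups = [(2000, 2000), (2001, 2003), (2004, 2007), (2008, 2008)]


def summarize(start, end):
    """Summarize one group of consecutive years."""
    values = [annual_sales[year] for year in range(start, end + 1)]
    total = sum(values)
    return {
        "years": (start, end),
        "sum": total,                    # grows with the width of the group
        "average": total / len(values),  # comparable to a single year
        "midpoint": (start + end) / 2,   # assumed x position on a time axis
    }


summary = [summarize(start, end) for start, end in groups]
for row in summary:
    print(row)
```

With these numbers the sums jump from 10 to 39 to 72 and back to 22 simply because the groups have different widths, while the averages (10, 13, 18, 22) follow the same upward pattern as the yearly data.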

The other two lines show the difference between equal and unequal intervals. The brown line displays the data points unequally spaced while the gray one uses equal intervals (Few’s second chart). I had to make some assumptions regarding the reference date, so this is not the best example, but it is good enough to show the potential risk of using equal intervals with unequal intervals of time.

Bottom line, oversampling is a useful method for better resource allocation. We can view irregular time series as some sort of oversampling, provided there are no missing values and irregular intervals in the chart are consistent with intervals in the time series.

Grouping data points is always a tricky issue, and Stephen Few shows it clearly, but we shouldn’t infer that “line graphs and irregular intervals is an incompatible partnership.”

(When using time series in Excel, make sure that category axis labels are recognized as dates. Alternatively, use a scatter plot with connected data points.)
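The difference between a category axis and a date axis comes down to how the horizontal position of each point is computed. A small Python sketch, with invented temperature readings loosely inspired by Jon’s example:

```python
from datetime import datetime

# Hypothetical temperature readings taken at irregular times.
readings = [
    ("2009-04-27 02:00", 11.0),
    ("2009-04-27 03:00", 10.5),
    ("2009-04-27 07:00", 12.0),
    ("2009-04-27 07:30", 14.5),
    ("2009-04-27 07:45", 16.0),
]

times = [datetime.strptime(stamp, "%Y-%m-%d %H:%M") for stamp, _ in readings]

# Category-style axis: every point gets the same spacing,
# no matter how much time separates the readings.
category_x = list(range(len(readings)))

# Date-style axis: position is proportional to elapsed time (in hours).
time_x = [(t - times[0]).total_seconds() / 3600 for t in times]

for (stamp, temp), cx, tx in zip(readings, category_x, time_x):
    print(f"{stamp}  temp={temp:5.1f}  category x={cx}  time x={tx:.2f}")
```

On the category axis the four-hour gap and the fifteen-minute gap look identical, so the morning warm-up appears much slower than it really is. On the time axis the last three points sit close together, and the slope of the line reflects the actual rate of change.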

{ 8 comments }

poverty-ratios-skyscraper

Textures. 3D. Pie charts. Primary colors. Trends hidden behind labels. Backgrounds. Pie charts again.

Clear signs of a bad chart, right? Right. It is so easy to spot a badly designed chart that you can use a computer to do it. Don’t waste your time.

Let’s stop discussing the obviously wrong and start discussing the useless right. Like this chart here. (I’ve borrowed the dataset Nathan used in one of his visualization challenges – some interesting entries and great discussion there, by the way).

There may not be anything really, really wrong with this chart, but it reflects a bureaucratic way of thinking about data and data presentation where every single data point must be clearly shown and labeled. Just like a table.

Listen, unless you work for a statistics office, you should never create a chart like this. I know, it’s irresistible to check how well my state ranks, but identifying each and every data point in a virtually limitless bar chart makes no sense in most cases.

Do you read the labels between the top five and the bottom five? Charts like this encourage look up of individual data points, and for that a table is probably a better option. If anything, a skyscraper bar chart is a clear sign of loss aversion.

A Flexible Bar Chart: Introducing the Accordion Bar Graph

How do you graph a categorical variable with more than, say, 20 data points without creating a skyscraper? This is what I have in mind:

  • You must retain the overall pattern, so you can’t remove data from the chart;
  • Create one or more focus areas (top five and bottom five, for example);
  • Gaps between bars should be larger in these focus areas, so that labels can easily be added;
  • Minimize the height of the remaining bars and remove the labels.
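The rules above can be sketched as a small layout function in Python. Everything here is hypothetical: the function name, the parameter defaults and the sample data are made up for illustration, and the output is just a list of bar positions that any charting tool could draw.

```python
def accordion_layout(values, focus=5, focus_height=1.0,
                     context_height=0.2, focus_gap=0.5):
    """Return one dict per bar (sorted descending) with position, height and label.

    Bars in the top and bottom `focus` positions are tall, spaced out and
    labeled; the remaining bars are thin, tightly packed and unlabeled.
    """
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    count = len(ranked)
    layout = []
    y = 0.0
    for index, (name, value) in enumerate(ranked):
        in_focus = index < focus or index >= count - focus
        height = focus_height if in_focus else context_height
        gap = focus_gap if in_focus else 0.0
        layout.append({
            "name": name,
            "value": value,      # every data point is kept
            "y": y,              # top edge of the bar
            "height": height,
            "label": name if in_focus else "",
        })
        y += height + gap
    return layout


# Made-up data: 25 categories with increasing values.
data = {f"Region {i:02d}": 5 + i for i in range(1, 26)}
layout = accordion_layout(data)

for bar in layout:
    print(f"{bar['label']:<10} y={bar['y']:5.1f} h={bar['height']:.1f} {bar['value']}")
```

No data point is dropped, so the overall pattern survives, but only ten of the 25 bars carry a label and the whole chart is much shorter than a skyscraper with one full-height bar per category.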

The chart should look like this:

focus-context-bar-chart

I like the accordion metaphor and I’m playing with it. An interactive version could use a simple event to create a focus inside the context area, so when the user moves the mouse the bar is enlarged and the label is shown.

What do you think? Do you agree that skyscraper bar charts are (almost) useless or should we focus on reducing the number of data points instead? How would you improve this design? Please share your comments and charts below.

Update

Well, if you want to know how to do this in Excel and read a great discussion about it, Jon wrote Accordion Chart for Jorge. He not only discusses some of the options but also shares the Excel file with us. Thanks Jon! And Dick, over at Daily Dose of Excel, wants to make sure that your state is automatically highlighted (Ego Charts). Nice “quarter step”!

{ 22 comments }

No, traditional charts are useless in our complex world

playfair-piechart

Over the next 25 years, we will need new visualization tools to replace traditional charts.

As you know, line, bar and even pie charts first appeared 200 years ago, with William Playfair, and perhaps until 25 years ago they were good enough at helping us make sense of our data. Before computers, they were crafted by graphic designers. Kids in schools drew them using millimetric paper.

Lotus 123 and Harvard Graphics were the most popular charting tools in the early days of personal computers. With those tools (and later, with Excel), the charting landscape changed forever. Some charts vanished, because they weren’t simple enough or didn’t make it into the chart gallery (I miss trilinear plots – yes, Jon, I know how to create them in Excel, but still…), while others should never have been allowed into that gallery.

[click to continue…]

{ 7 comments }

While playing with some county-level data, I stumbled upon what seems to be a secret message hidden in a bubble chart:

triangle-cook-county

Just call me paranoid, but let me ask you this: what is a triangle doing there and why on Earth would the hole in the middle point exactly to Cook County IL, home of President Barack Obama? Isn’t that weird?

I decided to dig a little deeper and visited Cook County’s website, and look what I found on the home page:

dreadful-pie

A sophisticated device that looks like a dreadful pie chart!

Who could be using Excel charts to send secret messages? Are these charts connected, two pieces of a dangerous puzzle? What are their intentions? I must find out.

Someone is knocking on my door…

(Well, this is just another bug in Excel 2007… In my more serious next post I’ll answer this question: “Are traditional charts dying?”. Stay tuned!)

{ 1 comment }

Now is the winter of our discontent
Made glorious summer by this sun of York;
And all the clouds that lour’d upon our house
In the deep bosom of the ocean buried.

By the way, black & white is also a great starting point for better charts.

{ 6 comments }

Hans Rosling

by Jorge

Hans Rosling was here in Lisbon today, for one of his remarkable presentations. It seems that almost no one in the room knew about his TED talks and, of course, everyone loved his charts. He gave his presentation in Portuguese, so some extra points there too…

If you have just returned to planet Earth and don’t know who Hans Rosling is, let me briefly discuss his role in the information visualization field.

Professor Hans Rosling became well-known around three years ago because of his remarkable presentation at TED (you can find it here). He was invited again the following year, and in that presentation his slogan “seemingly impossible is possible” is demonstrated in a memorable ending (you must see it for yourself).

Rosling co-founded Gapminder, “a non-profit venture promoting sustainable global development and achievement of the United Nations Millennium Development Goals by increased use and understanding of statistics and other information about social, economic and environmental development at local, national and global levels” (from the About page).

Gapminder developed Trendalyzer, a charting tool that basically shows a time series in an animated bubble chart. Audiences love that, and with Rosling describing what is happening it is a quite impressive experience.

After the TED presentations, Google acquired Trendalyzer, and a stripped-down version can be used in the Google spreadsheet. A while back I used it to display population trends. Click on the image below to open the chart.

Dependencies Young vs. Old

Dollar Street: Life Behind Statistics

Another interesting application created by Gapminder is Dollar Street (you can download it here). We often are unaware of reality hidden behind statistics. For example, how do people live with less than a dollar a day? In Dollar Street, you can select a house (an income level) and you can see photo-panoramas of each room in the household. There is a talk at Google TechTalks by Rosling’s son, Ola, where he presents Dollar Street.

Hans Rosling and Al Gore

It is interesting to note that, while Hans Rosling became famous because of his TED presentations using a new charting tool, Al Gore was awarded the Nobel Peace Prize. Both men use visuals extensively to raise awareness of issues like poverty and global warming. Al Gore uses presentation software (Keynote) and his presentations were designed by Duarte. You can see Al Gore at TED here.

Hans Rosling and Business Visualization

Several of my co-workers wanted to discuss with me the use of Trendalyzer-like charts in their presentations. Trendalyzer creates very eye-catching charts, so that’s understandable.

I had to explain that, while these displays are much better than the usual PowerPoint presentations, they need a fairly long time series and there must be some kind of global trend. Wasting time looking at bubbles jumping up and down is not exactly my idea of fun (or work). And they will become boring.

More important than that: most organizations don’t really know what information visualization is about. They don’t know how to use charts to find actionable patterns in the data. They don’t know how to use charts to communicate those patterns. They are handcuffed to the 3D flying and exploded pie chart paradigm. Replacing that paradigm needs a clear assessment of corporate needs, a long-term commitment, a definition of best practices and, obviously, some training. In the absence of these, animated bubbles are just a new fad.

{ 0 comments }

clown

Why do people insist on using “professional looking charts” in their presentations? If I wanted to divert the audience’s attention from the data, I would get a professional clown suit, instead. I would look professional. Not exactly the professional-looking presenter people expect in a corporate environment, but nevertheless a professional.

Meet professional-looking Mr. and Mrs. Gulliver and their Lilliputian friends (courtesy of SmartDraw):

bad-bar-population-chart

(This may be a pet peeve of mine, but whenever I hear the expression “professional-looking charts” I reach for my Browning.)

{ 15 comments }

1. Tufte, the Father of Eye-Candy Charts

Tufte’s The Visual Display of Quantitative Information, published in 1983, is probably the most influential book in the history of data visualization, and it is likely to remain so for some more time.

In his book, Tufte outlines for the first time a consistent theory of what a chart object should look like and why it should look like that. His guidelines are easy to understand and very quotable, not buried under six feet of abstractions. Think of well-known concepts like “data-ink ratio”, “data-density” or “chartjunk”: they all come from The Visual Display.

However, too often these principles are taken as self-evident, somehow “discovered”, not invented. A fundamental clarification must be made: these are aesthetic principles that Tufte transposes (brilliantly) from Ludwig Mies van der Rohe’s minimalism to the field of data visualization. These are not universal principles backed up by scientific evidence. Some studies find them helpful, some studies say they are irrelevant, but their effectiveness is hard to measure and they should not be taken as indisputable laws (I call this the “what-would-tufte-say syndrome”).

Unlike other authors (Jacques Bertin, Tukey, William Cleveland), Tufte recognizes that only an aesthetic framework can structure the image (color management, the role of non-data objects, how to emphasize/de-emphasize elements in a chart…). This is clearly the realm of graphic design.

Using aesthetics to improve function is probably the major contribution of Edward Tufte to the display of quantitative information. Unfortunately, this idea that a chart can be an aesthetically pleasing object (“Beautiful Evidence”, the title of his latest book, says it all) went astray and gave birth to a whole industry of eye-candy visualization tools.

From Tufte’s positivist point of view, a chart is defined by how well it makes a pattern stand out. It may be boring but, if that is the case, then “you’ve got the wrong numbers”. His faith in human rationality is both charming and frightening…

2. Patterns, patterns, patterns. And something else.

There are so many misconceptions to discuss about data visualization that we often forget to emphasize this simple truth: data visualization is about pattern discovery, finding useful, actionable visual patterns hidden in the data and making them stand out. Let me repeat: it’s all about visual patterns.

Tufte would agree, but here is the fun part: there is nothing wrong with using 3D effects, textures, and all the decoration in the world. Use them! It is your good taste against Tufte’s. You don’t have to like minimalism. Add color, clipart, anything that you think can engage your audience.

I am not kidding. It’s you, not Tufte, who defines your aesthetic program. Almost anything goes. But, whatever you do:

  • Don’t design technically incorrect charts: do not distort a circle, do not use more than one series in a pie chart, do not make an object vary in two dimensions when you are using a single series, etc. Just common sense, really. And, of course, if you want to break the rules, know them first.
  • Don’t hide the patterns: find the patterns and make them visible. Remove everything except the series themselves. Now start embellishing your chart. But remember: every little thing you add multiplies the clutter and makes the patterns harder to see. You’ll have to find that point where the impact of eye-catching decoration on pattern visualization goes beyond an acceptable threshold.

Please note that minimalism was not randomly chosen. Not only does it make pattern discovery much simpler, it also creates a framework to evaluate what belongs in the chart and what doesn’t. You can reject it, but if you don’t have a different framework you must decide on an ad hoc basis. Unless you are an accomplished graphic designer (and even then), a minimalist approach is a good start and it should help you to find your own style.

3. Emotions, Emotions, Emotions

Let’s face it: you don’t have much choice. If you do not want to sacrifice patterns, the amount of decoration that you can actually use is very limited.

So, what do you do with that limited amount of decoration? Essentially you’ll try to create the right emotional response. This is not what you would expect from the over-positivist chart that you end up with by choosing the minimalist path.

Refusing to acknowledge the role of emotions in data visualization is a bizarre thing, considering that you can’t remove aesthetics from the equation, and we all have an emotional sense of Beauty. What many hardcore Tufte fans may consider chartjunk can actually keep the audience from turning the page.

4. Edward Tufte and Excel

Throughout his books, Tufte often refers to the higher resolution of paper, and how it outperforms current screen resolutions. His sparklines are meant to be printed, because only then can the fine details be observed.

In Edward Tufte’s vision, each chart is unique, and deserves the attention of a work of art. He despises PowerPoint and hardly mentions Excel. His charting tool is Adobe Illustrator, where he is in full control of each small detail. He admonishes against patronizing the readers, but he never really discusses the audience as something that should be taken into account when designing a chart.

5. Knowledge Is Built by the User

matrixpermutator

Much has changed in the last 27 years and you may think that Tufte’s The Visual Display… emphasizes the use of paper just because the extraordinary changes in information technology were still in their infancy back in 1983.

Thing is, that’s not the reason. The real reason is that Tufte always thought of a chart as a final product to be printed and handed to the audience, not something that could be manipulated by the audience.

There is a striking difference between Edward Tufte and Jacques Bertin. Bertin’s “reorderable matrix” is dynamic by definition, and one of my preferred quotes summarizes his views perfectly:

“It is the internal mobility of the image which characterizes modern graphics. A graphic is no longer ‘drawn’ once and for all; it is ‘constructed’ and reconstructed (manipulated) until all the relationships which lie within it have been perceived.”

This was written in 1967, long before the PC was even imagined. Edward Tufte wants to design an efficient but elegant chart, Bertin wants to solve a business problem. There is no contradiction, one is not better than the other. They just serve different masters. (The image above is from Bertin’s Graphic Semiology and shows how a “dynamic chart” looked in 1967…)

Forty years have passed, but a vast majority of data users have no access to dynamic charts, either because they don’t have access to the right charting tools or they are unable to create those charts using their current tools (it is not that easy for a beginner to create a dynamic chart in Excel).

6. The Life Span of a Business Chart

In his essay “The Cognitive Style of PowerPoint” Edward Tufte argues that the tool itself is intrinsically flawed. I agree with him. Tools are not neutral. They can be forced to do things against their will, but that’s never easy. You can create a dynamic chart in Excel, but it is difficult. You can even force Excel to work like Tableau, but that’s like reinventing the wheel. You can create a good chart in Crystal Xcelsius, but that’s against its nature.

The point is, you can apply Edward Tufte’s principles by the book, but that means spending hours perfecting a chart in Illustrator and then printing it. I’d love to. Unfortunately, that’s not exactly how things work in a business environment. The life span of a business chart is short and the time to create it, even shorter. We cannot use Illustrator to create business charts.

7. Take-Away Points

Break away from Edward Tufte, but make sure you know why. Add emotion to your charts (rationally). Decide if the level of eye-candy your audience needs goes beyond what you are willing to add. Other things being equal, an interactive chart should need less eye-candy than a static one. Above all, show the patterns (but make sure your audience wants to see them).

{ 20 comments }