Monday, 28 March 2011

A picture is worth 1000 words

While at a conference in Italy in 2009, I attended a talk on new observations of a nearby spiral galaxy. The speaker presented several interesting results but had to confess she did not have an image of the galaxy, since they were still waiting for data from the Hubble Space Telescope. From across the room of eminent astronomers came a collective sigh of disappointment.

It was hard not to laugh. In fact, I don't think I succeeded. The idea that professionals in the field would bemoan the lack of a pretty picture was deeply amusing; surely we should all be above requiring such frivolities? 

The truth, however, is that visualisation is an integral part of successful science. Presenting your data in such a way that the main results stand out makes for better communication, without which scientific ideas cannot be shared, tested or accepted. This was the concept behind the "Science Illustrated" conference in Toronto that Master's student Mikhail Klassen attended last month and was badgered into talking about at the department's weekly journal club.

Mikhail explained that the conference discussed how the way you present your results can both help and hinder the viewer. Consider, for instance, the block of letters in the image at the top of the page. If you were asked to count the number of occurrences of the letter 'v', it would take you at least a few minutes to carefully examine each line. If instead each 'v' were coloured red, the task would become a matter of seconds. A more extreme example is that of Anscombe's Quartet, which is shown in the bottom half of the image. These four data sets have nearly identical summary statistics, including almost exactly the same average and spread. If these were actual scientific measurements, a glance down the columns might cause you to think that they were showing the same result. However, if you plot them on a graph, you can see at once that they show completely different trends.
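To make the Anscombe's Quartet point concrete, here is a minimal Python sketch that computes the average and spread of each data set. The numbers are the values as commonly tabulated for Anscombe's 1973 data; the names `quartet` and `summarise` are my own, chosen only for illustration, and this is not taken from Mikhail's slides.

```python
from statistics import mean, variance

# Anscombe's Quartet as commonly tabulated (data sets I-III share x values).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarise(xs, ys):
    """Return the mean and sample variance of both columns (hypothetical helper)."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),
        "mean_y": mean(ys),
        "var_y": variance(ys),
    }


# Print one summary line per data set; the lines come out almost identical.
for name, (xs, ys) in quartet.items():
    s = summarise(xs, ys)
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
          f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f}")
```

The four printed lines agree closely, which is exactly why the table alone is misleading: only a scatter plot of each pair of columns reveals that one set is roughly linear, one is curved, and two are dominated by a single outlying point.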

On the other hand, you can also choose to visualise data in a way that confuses the viewer. A famous example of this was a PowerPoint slide showing the current situation in Afghanistan. So crowded with interlinked lines was this plot that General Stanley McChrystal, the US and NATO force commander, remarked dryly:

"When we understand that slide, we’ll have won the war."

A common error, if slightly less extreme than the above example, is to pick a bad colour scheme. Using colours that are similar to one another can obscure the trends you are trying to illustrate. Our brains also have a 'perception priority' when dealing with visual input, placing relative position above colour. This means that if an important result is, for example, the maximum density in your galaxy, it could be that plotting this on a line graph is more effective than colouring an image of the galaxy by density.

Mikhail went on to point out that there is also an ethical side to data presentation. By plotting two quantities against one another to demonstrate a relationship, you are excluding any information about other, possibly important, factors. A non-astrophysical example of this is a reconstruction of Air France Flight 358, which crashed in Toronto in 2005. From a reconstruction of the plane landing, the crash appears to be down to pilot error; the plane drifts, touches down too late on the runway and overshoots to crash into the creek (no fatalities). However, there is no weather information in the movie, and eyewitnesses reported strong rain and winds with terrible visibility. As scientists, it is our duty to state clearly what is and isn't shown in our plots to ensure we do not mislead our audience.

Mikhail's final point from the conference was to remind us that communication of results depends on our audience. If we are presenting our findings to the public, we will be competing with Lady Gaga for their attention! This might lead us to choose different visualisation techniques than if we were presenting to other astrophysicists. Although, if my experiences in Italy were anything to go by, that isn't necessarily the case.

[Thanks to Mikhail for sharing his (very clear!) slides from his presentation. The bottom right image showing plots from Anscombe's Quartet was taken from Wikipedia.]


  1. Thanks, it was really informative. Important points, especially the last paragraph :-) Although I think overloading the slides with motions and effects can also be problematic.

  2. It also brings to my mind that even better than an image is a short video! You know, the nice part of most computational work is that at the end of the day, there will be images and videos, and "who doesn't like videos?" :D In one of my seminars (in Sweden), I had a series of images (around 150) showing, by simulation, the evolution of possible aircraft trajectories over a period of time. My supervisor pointed out that it would be really boring for the audience to look at such a large number of slightly different images, so he suggested picking only four of them (the first, the last, and two from the middle). However, I decided to make it more fun by generating a short video using the sequence of all the images, and while it was playing, I talked about the details of the results. The feedback I received was fantastic!