Principle 5 - Don't Take A Data Dump On Your Audience

Have you ever noticed that after some talks many people are eager to ask questions, but after other talks the audience is silent and everyone seems itching to get out of the room? There are at least two reasons for the itchy pants phenomenon. One is that the talk was boring and the speaker failed to engage the audience in the exciting elements of the science. I have discussed some remedies for the boring talk in previous posts. A second kind of problematic talk is the polar opposite: the take-no-prisoners talk, in which the speaker tells us everything they have done in a tsunami of slides. This is really a data dump talk. But data dumps can take many incarnations. The take-no-prisoners talk is one extreme. I also see data dumps on a smaller scale. Sometimes a single slide can be a data dump!

I have never seen a data dump that improved a talk, and most data dumps thwart conversation. Data dumps present the material in such a shallow manner that the audience has no idea how to begin thinking about the conceptual issues raised by the work. This is my explanation for why, most of the time, audience members have few questions at the end of a data dump talk. The speaker has provided no window into the science, just a facade of sciency things. Sometimes, when I am feeling really cynical, I think that speakers use data dumps to intentionally thwart conversation, throwing up bulwarks to hide chinks in the armor. But even if that is not the intent, it is hard to tell, because the data and ideas fly by so fast!

The worst style of data dump involves presentation of an indigestible quantity of data in a short time. I often observe extremely successful scientists giving data-dump talks—the take-no-prisoners talk—where they ever so briefly mention each of the projects going on in their lab, together with a historical review of everything they have ever done. When presented with excellent rhetoric and flair, these talks can seem very impressive. But they are essentially advertising. While advertising can be entertaining and enjoyable, it can also be misleading and corrosive to understanding. Fundamentally, these are not science talks. They are not talks that invite understanding and a conversation about the science.

Data dumps appear commonly in genomics and other 'omics talks. 'Omics might be defined as the process of implementing a technological innovation to produce a ridiculous quantity of data. Because nobody can digest a vast amount of data quickly, especially when presented in a "novel" graphic form, 'omicists bear a particular burden to explain their work slowly, carefully, and clearly. Instead, I often observe 'omics talks with slides flying by, each slide containing a different graphical representation of the 'omics data, and the speaker glibly rattling off one conclusion after another. The audience has no chance to evaluate the data and, therefore, is in no position to assess the conclusions. Of course, I have seen some 'omics talks that were beautifully presented. In such cases, the 'omics data were presented after a clear explanation of the biological question being addressed and the manner in which the 'omics data could be used to answer the question. And then, the 'omics data were presented in a deliberate and cautious manner, together with a discussion of the limitations of the methodology. In the few cases where I have seen this done well, I felt like I really learned something, and I had a ton of questions after the talk.

I have observed that labs that build tools for other scientists often present data dump talks. Often, the best tool builders are very prolific, and their many tools really are very useful to a large community of scientists. Therefore, they often use a talk as an opportunity to advertise the existence of all their great tools. These talks are more infomercial than science talk. The problem is that a laundry list of great tools provides little insight into the process of how tools are built or the creativity and scientific ingenuity that underlies the work. These scientists keep the science veiled in mystery, rather than bringing other scientists behind the scenes and engaging everyone in a deeper understanding of the science of tool building. I recently saw a young acolyte of a great tool-building lab give a mini version of a data dump talk. Predictably, the audience was stumped for questions. Everyone assumed that building the tool involved a rather boring sequence of protein engineering steps. I was convinced, however, that there must be interesting science buried in the story. It was only under cross-examination that the speaker told us that, in fact, the project was stalled for years because they couldn't figure out how to make the complex protein fusion stable. It was only when they thought to use the homologous protein isolated from a thermophilic bacterium that the protein was stabilized and the project started to work, leading to the great tool we were shown. A deep understanding of biology, a leap of insight, and voilà, we have a tool that nobody thought could be built. That is an exciting story that raises more questions than it answers. That's a talk I want to hear.

For these most egregious styles of data dumps, the remedy is very simple: just throw out most of the topics. Speakers should limit themselves to just one or two major topics for each seminar and take care to present the data and ideas underlying each topic as a cohesive story.

Data dumps also occur on smaller scales; more like little rabbit data dumps, or data dumplings, rather than the big bear data dumps I discussed above. A data dumpling can be something as simple as showing a figure of data without explaining the axes and without taking the audience through the data. That is, anytime you show data and draw a conclusion, without walking the audience through the data, you have taken a little data dump on the audience. Data dumplings are extremely audience dependent. For example, a phylogeny of species is a very data-rich image that is easy for many evolutionary biologists to absorb quickly. However, scientists in other fields may need a slower introduction to the meaning of branch points, branch lengths, and other features of a tree. Likewise, an audience of neurobiologists will have no problem interpreting a rich dataset of spike trains, but most people outside the field will need a gentler introduction to the data. Molecular biologists will scan a Western blot quickly, just as ecologists will quickly appreciate a species-area curve; but you will need a little time to explain each to the opposite audience.

These observations have led many people to offer the advice that speakers should "know your audience" before preparing a talk. This is terrible advice that has contributed to the balkanization of science. It is much better to assume that you don't know anything about your audience (except that they are scientists) and that they don't know anything about the topic you will present. If you start with that premise, you will quickly realize that you must explain your goals and approaches in plain language, that you should limit how many topics you will discuss (avoiding the big bear data dumps), and that you should carefully prepare how you will present and discuss the key data to communicate your most important messages (thus avoiding the little rabbit data dumps). Maybe someday I will write the Principle "Don't know your audience," but for now this short explanation will have to suffice.