On March 16, 2009, Katrina Kelner, editor of Science Translational Medicine and Managing Editor, Research Journals, at Science Magazine, gave a talk at NIH titled “Publishing in Journals” (sponsored by the NIH Office of Intramural Training & Education). About 7 minutes into the talk she made a really important point about the use of figures: in the past, the “printed figure was your data.” However, this was true only in a print-only publishing world. Now, with the advent of digital publishing, the figures in a paper serve only as examples or as an interpretation of the data, because the actual data can be provided online for everybody to download and rerun the analysis.
She also discusses image manipulation with Adobe Photoshop. Using various examples of gel images and micrographs, Dr. Kelner shows what was done to them and explains which manipulations are appropriate, which are inappropriate, and which need to be explicitly declared.
Listen to the enhanced video podcast (mp4, 1:32) or see the overview information about the podcast.
BusinessWeek reports that President Obama has appointed Edward Tufte to help visualize where the economic stimulus money is going. The article celebrates this decision as a victory for data visualization and for a more transparent government.
It can only be assumed that Tufte received the honor based on his book Envisioning Information which shows displays of high-dimensional complex data, such as maps, charts, scientific presentations, diagrams, computer interfaces, statistical graphics and tables, stereo photographs, guidebooks, courtroom exhibits, timetables, use of color, a pop-up, and many other wonderful displays of information.
Mashable shows a very nice infographic under the headline “State of the Internet Explained In One Giant Infographic.” The interesting twist in this graph is that it does not try to show big numbers or complex percentages, but reduces everything to a random crowd of 100 people. Each person is represented by one circle. All data and relationships are expressed using these circles. However, I am not sure why this metaphor is broken in the middle by displaying an ugly pie chart. Labeling is clear and fonts are well chosen. Well done!
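The "crowd of 100" idea can be sketched in a few lines of Python. This is only an illustration of the technique, not how the Mashable graphic was produced: the function names and the sample shares are made up, and largest-remainder rounding is my own choice for making the circles add up to exactly 100.

```python
def crowd_of_100(shares):
    """Turn arbitrary shares into whole-person counts that sum to exactly 100.

    Uses largest-remainder rounding: everyone gets the floor of their exact
    share, and the leftover people go to the largest fractional parts.
    """
    total = sum(shares.values())
    exact = {name: value * 100 / total for name, value in shares.items()}
    counts = {name: int(x) for name, x in exact.items()}
    leftover = 100 - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


def render_grid(counts, symbols, width=10):
    """Draw one character per person, `width` people per row."""
    cells = "".join(symbols[name] * n for name, n in counts.items())
    return "\n".join(cells[i:i + width] for i in range(0, len(cells), width))


# Hypothetical shares, not taken from the infographic.
sample = {"video": 42.5, "social": 33.3, "email": 24.2}
people = crowd_of_100(sample)
print(render_grid(people, {"video": "V", "social": "S", "email": "E"}))
```

Every row holds ten people, so a reader can count tens and ones instead of decoding angles.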
It is not uncommon to find discussion spaces with hundreds to thousands of messages and participants. User-generated content (UGC) is the driving force behind all Web 2.0 applications. How do you visualize such an exchange of ideas?
Today, I found tldr, which “is an application for navigating through large-scale online discussions. The application visualizes structures and patterns within ongoing conversations to let the user browse to content of most interest. In addition to visual overviews, it also incorporates features such as thread summarization, non-linear navigation, multi-dimensional filtering, and various other features that improve the experience of participating in large-discussions.”
Publication about the project: Narayan, Srikanth and Cheshire, Coye - “Not too long to read: The tldr Interface for Exploring and Navigating Large-Scale Discussion Spaces”. The 43rd Annual Hawaii International Conference on System Sciences - Persistent Conversations Track - Jan 2010.
As reported on Mashable and elsewhere, TrendStream, which publishes the Global Web Index, has created an interesting visualization of the penetration of different social technologies in major markets around the globe. The data come from interviews with 32,000 Internet users in 16 countries. The PDF shows labeled pie charts with overlap. Displaying a grayed-out 100% pie and then letting each pie piece start at the same baseline is certainly a new way of allowing comparisons, which are usually hard in pie charts if the differences are not obvious. The legend states that “The size of the arch’s, represents the audience volume in millions.” [Sentence unaltered from source.] The problem I see is that the thickness of the arcs varies in order to improve the display, yet thickness seems to represent quantity as well, which is apparently not the case, or is it? The display implies two dimensions, the angle of the arc and the thickness of the arc, but the data are only one-dimensional.
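To see why varying thickness is a problem, it helps to compute the area a reader actually sees. The sketch below uses the standard formula for the area of a ring segment; the radii are hypothetical and not measured from the TrendStream PDF.

```python
import math


def arc_area(fraction, inner_radius, outer_radius):
    """Area of a ring segment that spans `fraction` of a full circle."""
    angle = 2 * math.pi * fraction
    return 0.5 * angle * (outer_radius ** 2 - inner_radius ** 2)


# Two arcs encoding the SAME value (25% of the circle), drawn with
# different thickness. The radii are made-up numbers for illustration.
thin = arc_area(0.25, inner_radius=1.0, outer_radius=2.0)
thick = arc_area(0.25, inner_radius=1.0, outer_radius=3.0)
print(thick / thin)
```

Both arcs stand for the same quantity, yet the thicker one covers well over twice the ink. If thickness is purely decorative, the chart is sending a second signal that the data do not contain.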
It is quite rare that a data visualization topic makes it into the headlines, but a recent dispute between data visualization guru Edward Tufte and Microsoft made it onto Slashdot. According to Wikipedia, “a Sparkline is a type of information graphic characterized by its small size and data density. Sparklines present trends and variations associated with some measurement, such as average temperature or stock market activity, in a simple and condensed way. Several sparklines are often used together as elements of a small multiple.” Read more about Sparklines in Edward Tufte’s book Beautiful Evidence (page 62).
An entry in Tufte’s blog titled “Microsoft patent claim for ‘sparklines in the grid’” outlines the conflict about intellectual property rights resulting from a patent application filed on May 7, 2008 by Microsoft employees, claiming various aspects of the implementation of Sparklines in Excel 2010.
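For readers who have never seen one, here is a rough text-only approximation of a sparkline in Python. It is neither Tufte's word-sized line graphic nor anything resembling the Excel implementation; it simply maps each value to one of eight Unicode block characters, and the function name and sample data are my own.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest


def sparkline(values):
    """Return a one-line string with one block character per data point."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no variation to show.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


# Hypothetical daily temperatures.
print(sparkline([12, 14, 19, 23, 21, 17, 13, 11]))
```

The result fits inside a sentence or a table cell, which is exactly the "small size and data density" the Wikipedia definition describes.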
I found a nice example of how to visualize information flow in science. The Eigenfactor Project (data analysis) and Moritz Stefaner (visualization) cooperated on this interactive visualization, which is based on the Eigenfactor™ Metrics and hierarchical clustering in order to explore emerging patterns in citation networks. It shows citation patterns over time, with clustering and maps based on approximately 60,000,000 citations from more than 7000 journals over the past decade. Interestingly, a recent post on the BMJ Group blogs by Richard Smith, “The beginning of the end for impact factors and journals,” discussed the end of impact factors as a measure of research quality and their substitution by article-level metrics (see Next Generation Science).
Recently, I picked up a story on Slashdot which discussed a complicated visualization problem. Mathematicians and physicists work all the time with higher-dimensional objects or ideas, but how do you show these to the layman who cannot read the formulas? The videos on Dimensions-Math show some clever tricks to get a feeling for what four dimensions are like. The techniques begin by imagining how two-dimensional creatures, like those in Edwin Abbott’s Flatland, could get a feeling for three-dimensional objects. ScienceNews’ Julie Rehmeyer also reports on these mathematicians, whose imaginations are freed from physical constraints.
I have seen an intriguing and emotional talk about statistical data on US citizens. I know, how can a talk about population data be exciting? But it really was! Chris Jordan manages to make these numbers come alive using large-scale visualizations. For instance, he creates a giant poster showing the plastic cups discarded every six hours by US airlines. TED (Technology, Entertainment, Design) describes Chris as one who “runs the numbers on modern American life—making large-format, long-zoom artwork from the most mindblowing data about our stuff.” Here is a link to his 11-minute talk. Enjoy!
The widespread use of hue or color to represent quantities in graphs (e.g. blue for 10-20%, green for 21-30% etc.) is a habit that needs restraining. Don Norman* provides a superb explanation for why hue should not be used for displaying quantities:
“… hue is a substitutive representation, and the values of interest are usually additive scales. Hence hue is inappropriate for this purpose. The use of hue often leads to interpretive difficulties. Many colorful scientific graphics, usually generated by a computer, use different hues to represent numerical values. These graphics force the viewer to continually refer to the legend for mapping between the additive scale of interest and the hues. Density, saturation, or brightness would provide a superior representation.”
*More on graphical design principles can be found in Things That Make Us Smart (page 71) by Donald A. Norman. Cambridge, MA: Perseus Books, 1993
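As a small illustration of Norman's point, the following Python sketch keeps the hue fixed and lets only lightness vary with the quantity, so that "more" always looks darker. The function names, the chosen hue, and the lightness range are my own assumptions, not something taken from the book.

```python
import colorsys


def shade(value, vmin, vmax, hue=0.6):
    """Map a quantity to an RGB triple of fixed hue and varying lightness.

    Larger values get darker shades, so the ordering of the data is
    visible without consulting a legend.
    """
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))      # clamp out-of-range values
    lightness = 0.9 - 0.6 * t      # 0.9 (pale) down to 0.3 (dark)
    return colorsys.hls_to_rgb(hue, lightness, 0.8)


def to_hex(rgb):
    """Format an RGB triple with components in [0, 1] as #rrggbb."""
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in rgb))


# Hypothetical percentages for a shaded map or bar chart.
for pct in (10, 30, 50, 70, 90):
    print(pct, to_hex(shade(pct, 0, 100)))
```

With a ramp like this, a viewer can rank two regions at a glance. With blue for 10-20% and green for 21-30%, the same comparison requires a trip to the legend.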