Textures of Complex Data: The works of Fernanda Viégas & Martin Wattenberg

Fernanda Viégas and Martin Wattenberg opened Columbia University’s Talk Series: Artists using Data in late March.

Messiness, Clutter and Revelation

As pioneers in data visualisation, analytics and data art, Viégas and Wattenberg have paved new pathways for users to understand and explore data.

As technologists we ask: Can visualization help people think collectively and move us beyond numbers into the realm of words and images and never-before-told stories? As artists we seek the joy of revelation.

While most of their current work is geared towards AI as part of Google Brain, Viégas and Wattenberg began by walking their audience through one of their earliest projects built on Google data, ‘Web Seer’, which lets users compare Google Suggest completions.

This is also a really interesting window into the public psyche, because this is what people are coming to Google for. It visualizes exactly the same data, but adds a couple of dimensions, so we get a sense of which completions are more popular. We can see the completions that are different for each one of the cases, but we can also see what they have in common.

You see a richness in these kinds of data sets. It also starts to show how vulnerable some people are when they come to Google for answers.

Viégas

History Flow

Tying into the idea of data created by the masses, the artists walked through the process behind their notable ‘History Flow’ tool, which visualized the behind-the-scenes dynamics of publicly edited Wikipedia pages in 2004, when the online encyclopedia was a relatively new and mysterious place on the web. Viégas highlighted commonly overlooked occurrences on Wikipedia, such as vandalism, watch-listing, edit wars and disambiguation, that go unnoticed because of the sheer Web 2.0 speed at which the giant encyclopedia gets edited.

Article on Abortion, Image Courtesy: hint.fm

Wikipedia fosters a knowledge community built upon trust. One interesting feature the artists discovered in their process was the watch list, something most readers seem to be unaware of. Watch lists help active Wikipedia contributors take notice of vandalism: every time an article they are interested in is edited, they receive a notification. A notification involving a new IP address or a user they haven’t seen before is cause for alarm, and the community checks to make sure that it’s not a vandal. A real-time visualisation within the History Flow tool would show no discontinuities.

An article on ‘Cat’ tends to be longer than many other articles, such as ‘Design’, because more people edit ‘Cat’. The visualisation of the history of the ‘Abortion’ page, by contrast, shows distinct discontinuities, reflecting the polarized opinions around that topic. The artists also colored the text by its age rather than by its author, to identify the parts of an article that could be considered qualitatively more stable.
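To make the idea of coloring text by age concrete, here is a minimal Python sketch. It is not the artists' implementation: it assumes each revision is simply a list of sentences, and it defines a sentence's "age" as the number of consecutive revisions, counting back from the latest, in which that sentence appears unchanged. The function name `sentence_ages` and the sample revisions are made up for illustration.

```python
def sentence_ages(revisions):
    """Return {sentence: age} for every sentence in the latest revision.

    Assumption: `revisions` is a list of revisions, oldest first, and each
    revision is a list of sentence strings. Age counts how many consecutive
    revisions (ending at the latest one) contain the exact same sentence.
    """
    latest = revisions[-1]
    ages = {}
    for sentence in latest:
        age = 0
        # Walk backwards from the newest revision until the sentence disappears.
        for revision in reversed(revisions):
            if sentence in revision:
                age += 1
            else:
                break
        ages[sentence] = age
    return ages


# Hypothetical edit history for a short article.
revisions = [
    ["A cat is an animal.", "Cats purr."],
    ["A cat is an animal.", "Cats are evil."],  # vandalism replaces a sentence
    ["A cat is an animal.", "Cats purr.", "Cats sleep a lot."],  # repaired
]
ages = sentence_ages(revisions)
```

In a sketch like this, older sentences could be drawn in a darker shade, so text that has survived many edits stands out from text that was just added or restored.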

We were really interested in how people were negotiating in this sphere, how they were deciding what fits and what doesn’t fit. Questions like these were our first exploration into how these collaboration dynamics work.

History Flow became a part of MoMA’s collection in 2003.

Seeing Music

Image Courtesy: bewitched.com

Image Courtesy: hint.fm

An attempt to extract structure from longer pieces of music, Wattenberg’s ‘The Shape of Song’ from 2011 looked at repetitions of notes in everything from classical music and folk songs to jazz and Led Zeppelin’s ‘Stairway to Heaven’.

Jazz is actually quite interesting. You get something that’s relatively simple at the beginning, which then explodes into complexity towards the end. This, to me, is actually capturing something visually that you can otherwise only hear.

Flickr Flow

In 2009, a Boston-based print magazine, ‘The Positive Things’, commissioned Viégas and Wattenberg’s piece ‘Flickr Flow’ with a brief to visualize Boston. The artists turned to the photo-sharing website Flickr for a year’s worth of Creative Commons images of Boston Common, a central public park in downtown Boston, Massachusetts, with the intent of capturing Boston’s visual dimension through its seasonality. The images were organized by month and parsed for different kinds of reds, greens and so on, counting the pixels of each color in each image; these counts became the raw data for drawing the ribbons.
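The pixel-counting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the artists' code: photos are represented as plain lists of RGB tuples tagged with a month number, the hue ranges in `HUE_BUCKETS` are invented for the example, and dull or dark pixels are lumped into a "neutral" bucket. The names `bucket_for_pixel` and `monthly_colour_counts` are hypothetical.

```python
import colorsys
from collections import Counter, defaultdict

# Hypothetical hue ranges in degrees; the article does not say which
# colour categories the artists actually used.
HUE_BUCKETS = [
    ("red", 0, 30),
    ("yellow", 30, 90),
    ("green", 90, 150),
    ("cyan", 150, 210),
    ("blue", 210, 270),
    ("magenta", 270, 330),
    ("red", 330, 360),
]


def bucket_for_pixel(rgb):
    """Map one (r, g, b) pixel with 0-255 channels to a colour bucket name."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    # Greys, whites and very dark pixels carry little hue information.
    if saturation < 0.2 or value < 0.2:
        return "neutral"
    degrees = hue * 360.0
    for name, low, high in HUE_BUCKETS:
        if low <= degrees < high:
            return name
    return "red"  # defensive fallback for a hue of exactly 360 degrees


def monthly_colour_counts(photos):
    """Count pixels per colour bucket per month.

    Assumption: `photos` is an iterable of (month, pixels) pairs, where
    `pixels` is a list of (r, g, b) tuples already decoded from the image.
    """
    counts = defaultdict(Counter)
    for month, pixels in photos:
        for rgb in pixels:
            counts[month][bucket_for_pixel(rgb)] += 1
    return counts


# Tiny made-up sample: a snowy January photo and a leafy July photo.
photos = [
    (1, [(255, 255, 255), (128, 128, 128)]),
    (7, [(0, 255, 0), (0, 200, 0), (255, 0, 0)]),
]
counts = monthly_colour_counts(photos)
```

Per-month totals like these could then set the width of each color band as it flows from one month to the next.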

 

This is a very “dirty” data set if you will, because these were not all going to be beautiful pictures. There would be pictures of benches, for example and other things that have nothing to do with flowers or foliage. But, we decided to work with the messiness and see if we can get somewhere.

Even with all the messiness in the data, there was still some signal that there is change. In fact, this looks very fluid. But if we break it down into the height of each season, you can see that the color distribution is dramatically different between winter, fall, summer and spring.

Art of Reproduction

Playing with the idea of visual half-truths, Wattenberg’s ‘Art of Reproduction’ was a collection of fragmented collages of famous artworks, revealing the dramatic differences across reproductions of the same image.

Not all of these images are really the correct image at all. For one thing, they are different sizes. But, more deeply, the colors are different. And if you keep looking, you realize just how broad the variation is. We all know that reproductions are not the same as the original on some level. But, seeing the breadth of these different things is impressive. 

The Wind Map 

In 2012, the artists pioneered a distinctive way of visualizing the wind, something that has virtually no visual form. Working with United States government data, they began by conceiving of wind as ‘particles that we see as a pattern’. Eventually, they settled on the idea of particles that leave behind little trails, which made it possible to communicate subtler forms of information such as changes in direction.
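The particles-with-trails idea can be illustrated with a small Python sketch. It makes several assumptions that are not from the talk: the wind field is a made-up swirl (`wind_at`) rather than real forecast data, particles move by simple fixed-step integration, and each trail is just the last few positions of a particle. The class and function names are hypothetical.

```python
import random
from collections import deque


def wind_at(x, y):
    """Hypothetical wind field: a steady counter-clockwise swirl around (0.5, 0.5).

    A real map would look up forecast data here instead.
    """
    return -(y - 0.5), (x - 0.5)


class Particle:
    def __init__(self, x, y, trail_length=8):
        self.x, self.y = x, y
        # A bounded deque keeps only the most recent positions, so old
        # trail segments drop off automatically.
        self.trail = deque([(x, y)], maxlen=trail_length)

    def step(self, dt=0.1):
        """Move the particle along the local wind vector and extend its trail."""
        u, v = wind_at(self.x, self.y)
        self.x += u * dt
        self.y += v * dt
        self.trail.append((self.x, self.y))


def simulate(num_particles=100, steps=20, seed=0):
    """Scatter particles at random positions and advance them all `steps` times."""
    rng = random.Random(seed)
    particles = [Particle(rng.random(), rng.random()) for _ in range(num_particles)]
    for _ in range(steps):
        for particle in particles:
            particle.step()
    return particles


particles = simulate(num_particles=5, steps=20)
```

Drawing each trail as a short fading line would show both how fast the wind is (longer trails) and where it is turning (curved trails), which is the kind of subtle information a single dot cannot convey.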

Image Courtesy: hint.fm

When Hurricane Isaac made landfall in August 2012, the artists began receiving emails from people affected by the natural disaster.

It was a very strong experience to have something on the web that is real-time that people were looking at for very different reasons and that we had people in these very specific situations talking to you about the data that you’re visualizing.

Image Courtesy: hint.fm

When working with data such as this, designers tend to aggregate, which obscures a lot of the detail. Viégas and Wattenberg instead emphasize the texture and richness of the data, relying on the viewer’s visual system and intuitive understanding of the difference between ‘broad patterns of wind versus delicate things’. The map came to be used professionally by farmers, by scientists observing bird and butterfly migrations, and by teachers and schoolchildren learning about forecasting. Cameron Beccario, a software engineer, adapted the tool to cover the entire earth at different altitudes up to the stratosphere, making the data more accessible for purposes such as aerial navigation.

There were a lot of decisions we made in this visualisation – design decisions. We’re not using color, for instance. We’re not showing pressure or temperature. We’re not drawing (geopolitical) boundaries on the map. We wanted this to be as unobtrusive as possible. We wanted you to see the shape because that’s what we wanted to see and then, people started using it in really unexpected ways.

It speaks to the power of just making complex data easily accessible. How can you make anyone digest and interact with complex data? This is one of the aspects of data visualisation that’s near and dear to us.

The Wind Map became a part of MoMA’s collection in 2012.

What Improv Storytelling has to offer to Data Artists

In 2015, Ben Wellington gave a TEDx talk on how he borrowed principles from his lifelong love of Improv Comedy and applied them to his Data Visualization practice. “I accidentally became a data storyteller,” he says.

“The Open Data Laws are really exciting for people like me because it takes data that is inside City Government, and suddenly allows anyone to look at it.”

The narrative that came out of contextualizing this data spotted zones that fervent NYC cyclists are better off avoiding and shed some light on the battle strategies of New Yorkers’ favorite pharmacies. Wellington closes the distance between Data Viz and Improv by ‘Connecting with People’s Experiences’ and ‘Conveying one simple (and powerful) idea at a time’.

In 2016, Alan Alda, the seven-time Emmy-winning actor of M*A*S*H, and Dr. Christine O’Connell, an ocean and environmental scientist and Associate Director at The Alan Alda Center for Communicating Science, ran a workshop in which a group of scientists, doctors and engineers experimented with using Improv Storytelling to communicate their research.

“I think anybody that studies something so deeply, whether you’re an engineer, whether you’re an artist, whether you’re in business, you forget what it’s like not to know” – O’Connell

Empathy lies at the heart of Improv and therefore, at the heart of good communication. The idea of speaking to your audience and working with them to create a common language and evolve into clarity is especially relevant for Data Scientists and Data Artists.

The Data Artist creates an imaginary, artificial environment not dissimilar to that of an Improv actor where certain cues are visible and certain others have to be made up. The logic of this environment, however, needs to be consistent and is as important as the trust established within it.

“Even small breaks can affect credibility. – When we visualize data, we are (asking our audience to suspend their understanding of reality for a moment and accept new rules and conditions). We are asking our audience to understand shapes and forms on a digital screen to be something other than what they are.” – Ryan Morrill, Storybench, October 2017.

The Data Viz equivalent of Laughter in an Improv Comedy Scene is the deriving of Insight, says Morrill, where the logic reveals a reward.

A History of Data Journalism

We have been using data to explain our world for a long time. Data journalism is no exception. We have, as marketing strategist Andrea Lehr explains, been looking at data to help us tell stories for maybe even longer than we’ve thought. In this interview with Kristen Hare at Poynter, Lehr shares some of the findings from her recent report on the history of data journalism.

When staffers at the marketing agency Fractl decided to look into data journalism, they went way back. Way back. As they note, a kind of data journalism was used in the Han dynasty.

“I was most surprised to learn just how long the concept has been around,” said Andrea Lehr, a strategist at Fractl.

In 1849, for instance, The New York Tribune used a chart to show how many lives were being lost to cholera.

Fractl has seen an increase in data journalism among the publishers it works with, so staffers compiled a report on the storytelling method. The agency also spoke with several data journalists as part of the project, including FiveThirtyEight’s Allison McCann and Nathaniel Lash of the Poynter-owned Tampa Bay Times.

Lehr spoke with Poynter about the report via email.

Read the rest of the interview with Lehr at Poynter

How Data-driven Programs are Reducing Gun Violence

Ted Alcorn at Wired brings us this great piece on how data-driven programs are being used in several US cities as a way to reduce gun violence:

At their core, data tell stories. They reveal patterns, show changes over time, and confirm or challenge our theories. And in cities across the country, mayors, police chiefs, and other local leaders are turning to data to help them understand and address gun violence, one of the most persistent crises they face.

Innovative, data-driven programs are showing encouraging results. To keep high school students on the right track, the city of Chicago scaled up a school-based program called Becoming a Man for seventh through tenth graders living in neighborhoods with high rates of violence. The students reflect on their life goals, observe how their automatic responses inside school and outside school differ, and learn to slow down and react more thoughtfully to these sometimes divergent social environments. An adaptive behavior on the street, like fighting back to develop a reputation of toughness that could deter future victimization, will be maladaptive in other social situations. To test the impact of the program, the University of Chicago Crime Lab built a rigorous evaluation into its rollout. After two years, they were able to show that participants were 50 percent less likely to be arrested for a violent crime than students in a control group, and those students graduated at a rate 19 percent higher than those who did not participate. This close analysis of the program affords new insight into what makes the program work, and how to enhance it and apply it in other settings.

Read the whole article at Wired: One Great Way to Reduce Gun Violence? A Whole Lot of Data