Up in the cloud-piercing reaches of the New York Times Building, Jer Thorp is breezing through the particulars of a grand experiment in data visualization, one designed to unpack the mysteries of social media for the Times and, potentially, newspapers everywhere. It’s ambitious stuff. But right now, what the experiment looks like, projected on a screen in the crisp white hallway of the R&D department on the 28th floor, is a giant galaxy of tweeting rabbis.
You’ve got @RabbiRuth, @RabbiGoldberg, and @velveteenrabbi (cute) speckled like intergalactic dust around rings that turn the Twitterverse into a stunning 3-D representation. The users are responding to a Times story by Paul Vitello about priest and pastor burnout. What you see here is a visualization of how the story has spread through one corner of Twitter, on a granular level (@RabbiGoldberg: "Another reason to take it easy on your rabbi") and at a galactic scale.
This is the work of Project Cascade, and both the mining of the data and the artistic presentation could go a long way toward solving the riddle of how a story goes viral. That could have major implications for newspapers trying to cope in the social-media age. Says Michael Zimbalist, VP of research & development at the New York Times Company: "Our hypothesis has been and continues to be that those organizations that can be responsive to data in real time and make business decisions surrounding that data in real time will be the organizations that prosper in the 21st century." Thus was born journalism’s most ambitious social-media data viz to date, one that already offers some surprising insights into how stories propagate on the web.
[Cascade in action, for a story on paring down your material possessions]
Unlike most Twitter visualizations, which present a fuzzy snapshot of Twitter’s reaction to something in the news cycle, Cascade is both dynamic and exceedingly detailed. "What we wanted to do was see the sharing activity happening in a really active way," says Thorp, a digital-arts whiz who developed Cascade as the Times’s official data artist in residence with Mark Hansen, a UCLA stats prof, and Jake Porway, the staff data scientist (and Hansen's former student). In short, they wanted to show how readers engage with stories — info that social-media editors and others could then use to program tweets to bolster their audience.
So Cascade takes an isolated social-media event, like a tweet, and shows the entire chain of reactions that results, what Thorp and his colleagues call a Twitter "cascade." It can do this in real time. And it can tell you not just that a story caught fire but exactly how: how, for instance, a tweet from a network scientist made the article "But Will It Make You Happy?" (on cutting back one’s material possessions) go viral. "We look at four different large data sets," says Zimbalist, who’s overseeing the development of Cascade. "We look at our weblogs: all the stories on our website. Also: The events where people shorten our links. Then we look at how links are shared and propagate all through Twitter. Then we look at the clicks on the links. So we see this complete life cycle of how information spreads. All based on a single event: a single tweet or a single link shortening."
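To make the idea of a cascade concrete, here is a minimal Python sketch of how a chain of reactions might be traced back to one originating tweet. It is an illustration only, not the Times's code: the event format, the tweet IDs, the user names, and the timings are all invented for the example.

```python
from collections import defaultdict, deque

# Hypothetical share events: (tweet_id, parent_id, user, minutes_after_root).
# parent_id is None for the tweet that started the cascade.
EVENTS = [
    ("t1", None, "network_scientist", 0),
    ("t2", "t1", "reader_a", 5),
    ("t3", "t1", "reader_b", 12),
    ("t4", "t3", "zappos", 1500),
    ("t5", "t4", "reader_c", 1504),
    ("t6", "t4", "reader_d", 1510),
]


def build_children(events):
    """Map each tweet to the tweets that reacted to it directly."""
    children = defaultdict(list)
    for tweet_id, parent_id, _user, _minutes in events:
        if parent_id is not None:
            children[parent_id].append(tweet_id)
    return children


def cascade(root, children):
    """Return every tweet reachable from root, in breadth-first order."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children.get(node, []))
    return order


children = build_children(EVENTS)
chain = cascade("t1", children)
```

Starting the walk from a later tweet, for example `cascade("t4", children)`, returns only the part of the chain set off by that one retweet.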
The tool lets you explore the data from various perspectives. A side view shows Twitter activity over time. A "radar view" shows the distance between Twitter conversations: for instance, how far removed the network scientist’s original tweet was from the Zappos.com retweet that sprang the story loose. But the really exciting view is in 3-D, which combines both the timeline and the degrees of separation for a strikingly clear image of something inherently complex: the complete ecosystem of a Twitter conversation. (To reduce some of the complexity, a "pruning" mechanism lets you hack away all but the most influential branches of the Twitter cascade.)
[Cascade mapping the spread of a story about the Clintons]
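The degrees of separation and the pruning described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Cascade's actual method: the sample tree is made up, and "influential" is assumed here to mean simply "a branch containing at least a minimum number of tweets," which the article does not specify.

```python
# Hypothetical cascade stored as a parent -> children mapping.
CHILDREN = {
    "root": ["a", "b"],
    "a": [],
    "b": ["hub"],
    "hub": ["c", "d", "e"],
    "c": [],
    "d": [],
    "e": [],
}


def depths(root, children):
    """Degrees of separation of every tweet from the root tweet."""
    out = {root: 0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            out[child] = out[node] + 1
            stack.append(child)
    return out


def subtree_size(node, children):
    """Number of tweets in the branch that starts at node (node included)."""
    return 1 + sum(subtree_size(c, children) for c in children.get(node, []))


def prune(root, children, min_size):
    """Keep only branches holding at least min_size tweets (assumed criterion)."""
    kept = {}

    def visit(node):
        kept[node] = [
            c for c in children.get(node, [])
            if subtree_size(c, children) >= min_size
        ]
        for c in kept[node]:
            visit(c)

    visit(root)
    return kept


separation = depths("root", CHILDREN)
pruned = prune("root", CHILDREN, min_size=2)
```

In this toy tree, pruning with `min_size=2` drops the dead-end reply `a` and the three leaf retweets, leaving only the path from the root through `b` to `hub`.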
These cascades can then be picked apart and analyzed to glean all sorts of insight into reader behavior, which, in turn, could help the Times better promote stories. The visualizations could, for instance, shed light on how best to word a tweet: Does a question work better than a headline? Does a tweet from a story’s author have more traction than a tweet from, say, Media Decoder? Should tweets be timed meticulously, as if on a train schedule?
Cascade’s developers are quick to point out that the project is not designed to dictate how the newsroom approaches coverage. "We would never shape the content of the news around it," Porway says. Getting a clearer idea of how Twitter behaves, the theory goes, will help shorten the route stories take to find interested readers, something newspapers have been doing at least since they put kids on bicycles to chuck their product through your living-room window. "One thing we can pretty well anticipate is we’ll be able to recommend some best practices [on Twitter feeds]," Zimbalist says.
Cascade is still in R&D mode (all data so far reflect just two weeks of news stories in August), so the Times hasn’t reached any definitive conclusions yet. But they’ve already alit on some interesting findings. One: Twitter isn’t as fleeting as it seems. As Zimbalist tells it, referencing the "But Will It Make You Happy?" story: "One of the first and biggest 'aha' moments in all of this was that there’s sort of this feeling among people that Twitter is ephemeral. But in that story, it was a day or beyond later that we got that burst of activity [through Zappos] of sharing. So one of the key questions is maximizing interest over time by pulling these levers we can control."
Another discovery: Twitter influence isn’t necessarily where you’d expect. As the cascades have pointed up, it’s often people with famous followers, not the famous people themselves, who have more sway. (One possible explanation is that people assume everyone’s seen an Ashton Kutcher tweet and don’t bother retweeting.) It’s also communities (like the rabbis) that gather around specific topics and news stories that help drive traffic. "Because we come back and see who reads the stories and not just who posts on Twitter like most other data visualizations, we can really identify who’s influential in a meaningful way," Porway says. "Not just who spams a million followers or who sends out a lot of tweets, but who actually gets people to come back." In the future, the Times could try to incorporate these influential tweeters into their coverage: in effect, to leverage their power to spread the word.
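Porway's distinction between reach and actual readership can be shown with a small Python sketch. Everything in it is hypothetical: the accounts, the follower counts, and the click counts are invented numbers chosen only to show how ranking by clicks driven can differ from ranking by audience size.

```python
# Made-up sharers of one story: follower counts versus clicks their links drew.
SHARERS = [
    {"user": "celebrity", "followers": 1_000_000, "clicks": 40},
    {"user": "community_member", "followers": 800, "clicks": 120},
    {"user": "link_spammer", "followers": 50_000, "clicks": 3},
]


def rank_by(sharers, key):
    """Return user names ordered from highest to lowest on the given field."""
    ordered = sorted(sharers, key=lambda s: s[key], reverse=True)
    return [s["user"] for s in ordered]


by_followers = rank_by(SHARERS, "followers")
by_clicks = rank_by(SHARERS, "clicks")
```

With these invented figures, the account with the fewest followers tops the click ranking, which is the kind of gap between the two measures that Porway describes.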
The next step: "In the next few weeks, we’ll try to turn it into real-time information and give it to several internal constituencies who want to use it: social media people in the newsroom and people in analytics and marketing," Zimbalist says. If all goes smoothly, the Times might extend the project to other New York Times properties like the Globe and Boston.com. It might even turn Cascade into a broader commercial venture. "Whether you’re a journalist or in the marketing department, increasing readership is the common aim of everyone," Zimbalist says. "Hypothetically down the road we could think about whether this could turn into a useful tool [for other organizations] on how to use Twitter to maximize the audiences they’re aggregating." Newspapers innovating on social media: that’s a man-bites-dog story if there ever was one.