How Vox aggregates

Matt Cardy/Getty Images

I started as a blogger in the pre-social web, when the only way to build an audience was to have other sites quote or link to your work. Those links didn't drive a ton of traffic back to the original site, but they drove some, and sometimes you would get a new regular reader out of the deal. And that was basically how my career began. Everything I wrote, I wrote in the hopes that someone else would take it and try to use it on their site, with a link back to my site.

The lesson of that, to me, was that writing on the internet is a positive-sum endeavor: I was creating content that helped other people make their sites better, and in using that content, they were helping me grow my site.

Vox's approach to aggregation, which Nate Silver criticized today on Twitter, is informed by that. Our policy, to our staff, is simple: any time we use work created by someone else, we need clear attribution to the original author and a link back to the source. When appropriate, we should do more than that: we should add to the conversation with new facts, ideas, or reporting.

The problem comes when we do it poorly — and in those cases, we deserve to get called out.

Take the post that frustrated Silver. The attribution there was clear. The first line reads, "Nate Silver and his team at FiveThirtyEight put together this great graphic summarizing the popularity of various key political players and how well-known they are to the general public."

The post went on to argue with Silver's interpretation of the data — this wasn't just aggregation, it was actually a debate. We were arguing a point where we thought it would be strange not to include Silver's graphic. In addition, the graphic itself included a FiveThirtyEight watermark, so if someone had looked at it without reading any of the words, they would still know the source.

But the post didn't include a link. This was carelessness, not malice, but it's a violation of Vox's internal standards. Our policy requires a link back to the source along with attribution, and any failure to follow that policy is inexcusable. It's a betrayal of what makes the web positive-sum. Silver's right to be upset by it. He has my apologies.

Beyond today, though, we get a fair number of questions about our approach to aggregation generally, so I think it's worth laying out our thinking for how we aggregate, why we aggregate, and what we're trying to achieve.

Ye olde aggregation

Though it's often portrayed as an internet-age phenomenon, aggregation has been around a whole lot longer than Google.

Time magazine, for instance, began its life as an aggregation shop. It promised, on behalf of the busy American, to comb through "every magazine and newspaper of note in the world." If you read Alan Brinkley's biography of Henry Luce, you'll find a furor that feels very familiar: Time's practices, Brinkley writes, aroused "the increasing dismay of the journalistic community, which had ignored the borrowing when Time was obscure and unknown but which sometimes complained loudly once the magazine was a success."

Today, pretty much everyone aggregates, at least sometimes. When I was at the Washington Post, I remember Bob Woodward publishing an incredible piece of reporting on the effort to find, and kill, Osama bin Laden. Buried deep in the piece was a wonderful anecdote: when the Navy SEALs killed bin Laden, they found themselves scrambling to somehow measure his height. Eventually, one of them lay down next to the corpse to serve as a comparison. Obama later asked aides, "Could we not afford to buy a tape measure?"

The New York Times plucked out Obama's comment — a detail that was the fruit of an enormous amount of reporting — and blogged it, alongside the coda that Obama later presented Vice Admiral William McRaven with a plaque featuring a tape measure.

And speaking of the Washington Post, during my time there I helped to create Know More, a site dedicated to aggregating in a more ethical way. We wanted to make clear that even when something is aggregated well, there is often much more information at the source. So the idea was that each piece of content would come with a big "Know More" button that would lead people back to the original source to, well, learn more. I'm enormously proud of the project, and it's still going strong today.

The value of aggregation

All these publications — and pretty much every publication you can think of — aggregate for the same reason: because it's of value to their readers. Because the world is way too big for any of us to cover on our own.

Oftentimes aggregation happens less publicly, too: all reporters, all the time, are writing articles informed by other people's work, ideas, and theories. Almost all intellectual endeavors end up being a pastiche of good points made by people who came before you. If you're lucky, you can add a little bit to the whole and help those who come after.

But while aggregation has always been a clear service to readers, it can be enormously frustrating to writers. I've worked for months on a piece only to find the best bits summarized elsewhere, under a less responsible headline — and then watched the Facebook shares tick past the original.

Aggregation done correctly, though, offers value to the original source: it sends readers back to the first article and, because of the way Google's algorithm works, it increases that article's search ranking as well.

This informs Vox's policies around being aggregated. We want people to link back to our work, to discuss our work, to use our graphics and embed our videos. For that reason, our graphics come with a logo at the base, and our videos are festooned with branding.

Please embed me!

We want people to talk about our work, put it elsewhere, spread the Vox word. We're currently working on products that will make it even easier for other sites to use our work.

Silver also raises a question about a newer, and less traditional, experiment: VoxMaps. This is somewhat inspired by the success of Twitter accounts like Amazing Maps and UberFacts, both of which I enjoy but find frustratingly unsourced. So the idea was to create an account that would push out our coverage of graphics — both those we make and those we don't — with links back to the original posts, as that's where our analysis is, and that's where the sourcing information is.

Silver is frustrated that the account tweets out posts that use maps that sites other than Vox created, with links back to the Vox pieces that discuss them. I think we're driving readers to those sites, and recognition to those maps, and we are often thanked by the people who made the maps in the first place. I also think that including non-Vox graphics makes the account better for its followers and does more for other sites. But it's a new experiment, and maybe we haven't gotten the mix right. I very much welcome feedback on it.

But a final, broader point worth making is that there's a school of thought online that the only ethical approach is to remake, redo, or re-report data visualizations that you want to discuss. So if there's a great Gallup chart and you want to use the information, you should remake the chart and slap your own logo on there.

That approach makes me queasy — it seems like a way to hide the original intellectual authorship. Often what's hardest in this work isn't creating the chart but coming up with the idea for the chart, and so I want to make sure, where possible, that we're giving people credit for their ideas, even if it would be easy for us to remake their work. That doesn't mean we never make our own versions of charts others have thought of, but we try to have an editorially compelling reason to do so, and to credit generously when we do.

But this stuff is complex, and we don't always get it right. So if you ever feel Vox isn't using your work in the way you'd want, email me and let me know. Our intention is always to do things in a way that is positive-sum, and if you ever feel we're failing that ideal, we want to know, and we'll work with you to change it.