Why amateur wine scores are every bit as good as professionals’

If amateurs lacked wine expertise, we would expect to see little or no correlation with professionals. We saw just the opposite.

How do wine enthusiasts compare with the experts like Robert Parker and Jancis Robinson? Very well.
Javier Zarracina

Few consumer products offer as staggering a range of choice as wine. You can buy a bottle of Dark Horse Big Red Blend for $8. Or for around $500, you can get a 2012 bottle of Sloan Proprietary Red. Yet for each bottle, the same question applies: Is it any good?

For decades, Americans turned to professional critics like Robert Parker to help them make that determination. But the internet changed all that.

The rise of the wine-rating crowd

In 2004, Eric LeVine — then a group program manager at Microsoft — launched CellarTracker, a site where amateur wine enthusiasts can rate wines. Today, CellarTracker is the web’s most popular “community” or “crowdsourced” wine review website, containing 6.3 million reviews from 113,000-plus users for more than 2.2 million different wines.

Many professional critics, not surprisingly, have scoffed at the idea that mere amateurs understand, let alone have the ability to rate, wine. In a 2012 column for the website Wine Spectator, critic Matt Kramer described the wisdom of the crowd as a “pernicious delusion.” “One hundred people who don’t know much about, say, Auxey-Duresses,” he wrote, “adds up to 100 muddied, baffled and often duplicative conclusions.” Critic Steve Body concurred in a 2014 post titled “Crowd-Sourced Ratings and Why They Suck” on his website ThePourFool: “The readers and users of these sites are almost always slaves to their personal preferences and current trends.”

That’s the standard knock against amateur critics. Compared with the paid professionals — who very often evaluate wines blind — they are untrained, are subject to bias, and lack expertise.

But is that true? In 2016 (when this piece was first published), we decided to test the hypothesis by comparing community wine reviews for California wines from 2014 or earlier with those of professionals by running a simple correlation — a common data analysis tool.

For the professional scores, we purchased memberships to three sites: Wine Advocate (Robert Parker’s site), International Wine Cellar, and Jancis Robinson. We limited ourselves to these three because they had the most wines in common with CellarTracker. (Shortly after we obtained our data set, International Wine Cellar merged with Vinous and now goes by that name.)

Altogether, we obtained scores for 10,679 wines on Wine Advocate, 12,182 on International Wine Cellar, and 1,167 for Jancis Robinson and compared them with 51,689 mean scores for wines on CellarTracker. We then matched the wines and their scores between our select group of experts and the multitude of enthusiasts and analyzed them with a software program called Prism.
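The matching step described above can be pictured with a minimal sketch. It assumes wines are identified by a single label and that community ratings arrive as lists of individual scores. The wine names, the numbers, and the `match_scores` helper are hypothetical illustrations, not the data or the software used for this analysis.

```python
from statistics import mean

# Hypothetical toy data: individual community ratings per wine.
community_ratings = {
    "Wine A": [90, 92, 91],
    "Wine B": [85, 87],
    "Wine C": [52],
}

# Hypothetical critic scores. "Wine D" has no community ratings,
# and "Wine B" has no critic score, so neither can be matched.
critic_scores = {"Wine A": 92, "Wine C": 88, "Wine D": 95}


def match_scores(community, critic):
    """Return (wine, community mean, critic score) for wines rated by both."""
    shared = sorted(set(community) & set(critic))
    return [(wine, mean(community[wine]), critic[wine]) for wine in shared]


pairs = match_scores(community_ratings, critic_scores)
for wine, community_mean, critic_score in pairs:
    print(wine, community_mean, critic_score)
```

Only wines present in both sources survive the match, which is why the number of dots in each comparison is smaller than either source's total.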

Before we say anything more about the results, it’s worth considering what this sort of comparison can and cannot accomplish. In science, this kind of study is called “observational” — you may be able to tease out a relationship, but you won’t know why it’s there. For example, if it turns out the experts and enthusiasts differ significantly in their estimation of wine quality, this does not necessarily mean the experts possess a more refined or objective understanding of wine. It could just as easily be interpreted to mean that wine experts can’t relate to the kind of wine ordinary people find enjoyable.

But that’s not what we found. Nor did we find perfect agreement. We discovered something altogether more interesting.

So let’s start with the big question: How do the wine enthusiasts compare with the experts? Very well. Have a look:

[Scatter plot: Wine Advocate scores (x-axis) against mean CellarTracker scores (y-axis). Credit: Javier Zarracina/Vox]

There are 9,119 dots on this diagram, one for each wine rated by both CellarTracker and Wine Advocate. Each dot, furthermore, represents two scores: one by Wine Advocate and the other the mean score on CellarTracker. As you can see, it doesn’t look as though there are anywhere close to 9,119 dots in this diagram, because so many of them sit on top of one another. What really stands out is the big blob in the top right. This is a visual representation of CellarTracker and Wine Advocate agreeing. Most of the CellarTracker scores are very close to, and in many cases identical to, the scores on Wine Advocate.

Amateur and professional wine scores correlate very tightly

How similar? We ran a statistical tool called a Spearman correlation and got a figure of 0.576. A perfect correlation is 1. An utter non-correlation is 0. A score of 0.576 may not sound impressive at first, but the scale extends below 0, all the way down to -1 for a perfect negative correlation. A negative value is what you would see if you compared, say, shortness with the likelihood of playing professional basketball.
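For readers curious about the mechanics, a Spearman correlation can be sketched as an ordinary Pearson correlation computed on ranks rather than raw scores, with tied scores sharing the average of their ranks. The version below is a plain-Python illustration under that textbook definition. The function names and the sample scores are hypothetical, and this is not the Prism routine used for the figures in this piece.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for four wines from two sources.
critic = [88, 90, 90, 95]
crowd = [87, 91, 89, 94]
print(round(spearman(critic, crowd), 3))
```

Because only the ordering of the scores matters, this measure is not thrown off by one source using a 100-point scale and another a 20-point scale.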

The CellarTracker scores correlated with International Wine Cellar nearly as well, with a value of 0.555.

And the weakest correlation was with Jancis Robinson, at 0.424.

But to get an even better understanding of just how well CellarTracker relates to the experts, the best thing to do is leave CellarTracker out of it. It’s when you compare the experts with one another that things start to get much more interesting.

Among experts, the strongest Spearman correlation is between Wine Advocate and International Wine Cellar, which, at 0.568, is still lower than the 0.576 between Wine Advocate and CellarTracker, and it only gets worse from there.

The correlation between Jancis Robinson and Wine Advocate comes in at 0.208, and the one between Jancis Robinson and International Wine Cellar isn’t much better at 0.222. That’s why the blobs are more diffuse when experts are compared with each other than when they’re compared with CellarTracker. There is less agreement.

What does it all mean?

Amateurs appear more expert than the experts

It looks very much like the enthusiasts actually do a better job of agreeing with the experts than the experts do with each other. That might sound odd, but out of thousands of wines we analyzed, only a handful contradicted this pattern. Simply put, if you want to know what the experts think, the best place to look appears to be, of all places, CellarTracker.

Why do the scores correlate so well? The data doesn’t tell us that.

It’s possible all those enthusiasts on CellarTracker already knew what Robert Parker and other experts said about each wine and just parroted their scores. That said, when you consider that a 2007 Seghesio Family Vineyards Zinfandel Sonoma County was rated 1,406 times on CellarTracker, for an average score of 90.5, does it really seem likely that a) they had all read the expert scores, and b) they were consciously or unconsciously swayed as a result?

If so, how likely is that to play out over thousands and thousands of bottles?

It’s worth pointing out that our focus on California also comes with limitations. We did this because California presumably represents a more homogeneous group of wines than would a mix of regions. But we don’t know if the relationship we observed applies to other wine-producing countries or regions, like Spain, Chile, Burgundy, and Oregon. (That would require a larger study.)

And it’s not like the experts and the enthusiasts always agree.

If you return to the first figure, comparing CellarTracker and Wine Advocate, you’ll see numerous dots that aren’t part of the main cluster. For example, on the bottom row, there is a single dot just to the left of 90, at 88. If you look over to the y-axis, you can see that it lands on 52. The wine in question is the 1999 Testarossa Chardonnay Sleepy Hollow, and it appears to be a case of serious disagreement — a difference in score of 36 points. Wine Advocate gives the wine a score that, according to its own scoring system, is between “above average” and “very good,” while CellarTracker says it’s terrible.

Upon further inspection of the data, however, we noticed that only a single CellarTracker user scored 1999 Testarossa Chardonnay Sleepy Hollow, representing an extremely small sample size. In fact, correlations between amateur and expert only become stronger as the quantity of community reviewers increases. The larger the number of enthusiasts, the more the scores match those of the experts. Cases of strong disagreement, however interesting they may appear, are extremely infrequent.
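One simple way to check how much single-review outliers matter is to set a minimum number of community reviews and see how the gap with the critic changes. The sketch below uses the average absolute score difference as a stand-in measure, which is simpler than the correlations reported here. The rows and the `mean_abs_gap` helper are invented purely to show the mechanics and are not results from our data set.

```python
# Hypothetical rows: (number of community reviews, community mean, critic score).
wines = [
    (1, 52.0, 88),
    (3, 86.0, 90),
    (40, 90.5, 91),
    (250, 92.0, 92),
]


def mean_abs_gap(rows, min_reviews):
    """Average |community mean - critic score| over wines with enough reviews.

    Returns None when no wine meets the review threshold.
    """
    gaps = [
        abs(community_mean - critic_score)
        for n_reviews, community_mean, critic_score in rows
        if n_reviews >= min_reviews
    ]
    return sum(gaps) / len(gaps) if gaps else None


for threshold in (1, 3, 40):
    print(threshold, mean_abs_gap(wines, threshold))
```

In this toy example, raising the threshold drops the lone one-review wine first, which is the same kind of effect a single-reviewer score can have on a real comparison.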

The better the wine, the more experts agree with the amateurs

There is also a tendency for scores to converge as wines improve in quality. This is evident in the arrow shape of the clusters in figures comparing CellarTracker with Wine Advocate and CellarTracker with International Wine Cellar. (Notice how the cluster points up and to the right.) Average scores, furthermore, are high. On Wine Advocate, the average score was 89, on International Wine Cellar it was 91, and it was 17 out of 20 for Jancis Robinson. On CellarTracker, it was 89. This tells us that experts and enthusiasts alike don’t seem to be spending a great deal of time scoring mediocre wines.

As with New York City restaurants, there are many very good wines but few that achieve true greatness. This is something both expert and enthusiast agree on, as indicated by the pointiness of that arrow shape.

So what did we learn about the CellarTracker users? They drink good wine. And when you consider that, according to CellarTracker’s founder, Eric LeVine, the average user has rated 49 wines and there are 2,311 users who have rated more than 500 wines, this group sounds less and less like a rowdy horde of merlot swillers. It sounds like they take their wine seriously. When it comes to online comments sections or restaurant reviews on Yelp, the internet has a reputation for being overly representative of the ill-informed and overly opinionated. This doesn’t seem to be the case with CellarTracker.

Ultimately, we think our analysis is very supportive of community wine reviews. If non-professional wine enthusiasts truly were lacking in knowledge or expertise, we would expect to see little or no correlation with professionals. We saw just the opposite.

But the news isn’t all bad for the professionals. Their divergence in opinion could come down to differences in personal taste. Maybe Jancis Robinson correlates relatively poorly with International Wine Cellar because she has her own distinctive palate. Perhaps you’re better off finding a critic whose taste matches your own, since these nuances get lost in averages.

In the end, however, CellarTracker offers two compelling advantages. The first is price. Joining CellarTracker is free, whereas Wine Advocate costs $99 for one year and Jancis Robinson costs around $110. The second is breadth. For California alone, CellarTracker covers more wines than all the critics we examined combined.

So if you think it’s worth spending a hundred bucks to access the wine scores published by Wine Advocate, go for it. But you might just be better off putting those dollars toward actual wine. For about $60, you can get a bottle of 2011 Littorai Pinot Noir Savoy Vineyard, which got a 90 from International Wine Cellar and a 92 from Wine Advocate. CellarTracker gave it a 91.

Mark Schatzker is the author of The Dorito Effect: The Surprising New Truth About Food and Flavor. Richard Bazinet is a professor of nutritional sciences at the University of Toronto and the Canada research chair in brain lipid metabolism.
