One of the most bitter and contentious presidential elections in U.S. history is finally behind us. At times it seemed that American democracy itself was under threat, and the contest left voters on both sides of the political spectrum feeling jaded and demoralized. Certain features of this election made it particularly unsavory, not the least of which was an incumbent president who was unwilling or unable to accept the outcome, along with the cronies and lackeys who supported him in his delusion, and these will probably result in its being remembered as one of America’s darker moments. For me, it brought to mind the fictional election portrayed in the classic American western, The Man Who Shot Liberty Valance. In that movie, a young attorney from the east named Ranse Stoddard, played by James Stewart, moves into an unnamed western territory and leads a movement to turn that territory into a state. He almost immediately runs afoul of a local bully named Liberty Valance, well portrayed by Lee Marvin, who is at the head of a faction of cattle barons opposed to statehood. When Stoddard is elected as a delegate to the statehood convention, along with the local newspaper editor, who has published a story about some of Valance’s crimes, Valance beats the editor nearly to death, burns down the newspaper building, and challenges the attorney to a gunfight. When Stoddard reluctantly faces off against him in a showdown, Valance is shot to death, much to the surprise of the local townsfolk, not to mention Stoddard himself. Stoddard and his allies ultimately succeed in their drive for statehood, and he becomes a U.S. senator representing that state.
But aside from the ugly
controversies surrounding it, the election did bring some longstanding criticisms
about the voting process to light. The
Electoral College, in particular, in which each state is given a number of
votes equivalent to the sum of its senators and representatives in Congress,
came under renewed scrutiny. A perennial
complaint about the Electoral College is that it gives disproportionate power
to the less populous states. According
to the 2010 Census, for example, California gets one electoral vote for each
700,000 persons in the state, while Wyoming gets one electoral vote for each
200,000 persons in the state.  There have
been four elections in U.S. history in which the candidate who won the Electoral
College vote actually lost the popular vote, and two of these have
occurred in the last twenty years (Al Gore vs. George W. Bush in 2000, and
Hillary Clinton vs. Donald Trump in 2016).
Ironically, however, it can be proven mathematically that, under the
Electoral College voting system, a single voter in a more populous state has a
higher statistical probability of affecting the outcome of the election than a
voter in a less populous state. But in
any case, the fact that this method potentially produces outcomes that differ
from those which would occur with a simple count of the popular vote is
very galling to many.
There are even more serious issues that arise, however, when more than two candidates run in an election, issues that make even the outcome of a simple popular vote open to criticism. Third-party candidates, if they are sufficiently popular, potentially become “spoilers” in the election, drawing votes away from whichever of the two major-party candidates holds views most closely aligned with their own. Many Democrats worried that just such a thing might have happened in the most recent presidential election if Bernie Sanders had decided to run on a third-party ticket. While there have been many popular third-party candidates in American presidential races over the past century (George Wallace in 1968, John Anderson in 1980, and Ralph Nader in 2000, 2004, and 2008), the one who is generally remembered as possibly having changed the outcome of an election is Ross Perot, who ran as a third-party candidate against incumbent George H.W. Bush and Bill Clinton in 1992. Perot’s moderately conservative views were generally considered to be more aligned with Bush’s than Clinton’s, and therefore his strong showing (he received 19% of the popular vote) was thought by many to be responsible for President Bush’s failure to be re-elected. While this conclusion has been debated (in spite of his strong showing in the popular vote, Perot received no votes in the Electoral College), Perot’s performance demonstrated the impact that a strong third-party candidate could have in elections where the winner is simply the candidate with the most votes.
Some countries have tried to address the problem of selecting among multiple candidates by adopting more sophisticated voting methods, such as allowing voters to rank candidates from most preferred to least preferred. In a three-person contest, for example, a voter’s first choice could be given 2 points, the second choice 1 point, and the third choice no points, and the points would then be totaled across all voters to determine the winner. But even this method, as logical as it sounds, has been demonstrated to sometimes produce strange outcomes that seem to contradict the will of the majority. Consider, as a simple example, an election with three candidates, A, B, and C, and five voters. Suppose that three of the voters, a majority, rank the candidates as follows (in descending order): C-A-B. The other two voters rank the candidates: A-B-C. Assigning 2 points to each first-place vote, 1 point to each second-place vote, and no points to third-place votes, Candidate A gets 3 points for being the second choice of three voters (3×1) and 4 points for being the first choice of two voters (2×2), for a total of 7 points. Candidate B gets a total of 2 points (0 points from three voters and 1 point from each of two voters), and Candidate C gets a total of 6 points (2 points from each of three voters and 0 points from two voters), making Candidate A the winner, with the highest total of 7 points. But three of the voters, a majority, preferred Candidate C to Candidate A, which casts doubt on the reasonableness of selecting A as the winner. Such paradoxes are not uncommon with this method, and in fact Nobel Prize-winning economist Kenneth Arrow proved that no rank-order voting method of this sort can be devised that prevents such strange outcomes from ever occurring.
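For readers who like to check the arithmetic, the five-voter example above can be reproduced in a few lines of Python. This is only a sketch of the point-scoring scheme as described here; the function names borda_scores and pairwise_preference are my own labels, not anything taken from a voting library.

```python
from collections import defaultdict

def borda_scores(ballots):
    """Total the points for each candidate.

    Each ballot lists candidates from most to least preferred; with three
    candidates, first place earns 2 points, second 1, and third 0.
    """
    scores = defaultdict(int)
    for ranking in ballots:
        top_points = len(ranking) - 1
        for place, candidate in enumerate(ranking):
            scores[candidate] += top_points - place
    return dict(scores)

def pairwise_preference(ballots, x, y):
    """Count the voters who rank candidate x above candidate y."""
    return sum(1 for ranking in ballots if ranking.index(x) < ranking.index(y))

# Three voters rank C-A-B and two voters rank A-B-C, as in the example.
ballots = [["C", "A", "B"]] * 3 + [["A", "B", "C"]] * 2

scores = borda_scores(ballots)
winner = max(scores, key=scores.get)

print(scores)                                  # A: 7, B: 2, C: 6
print(winner)                                  # A
print(pairwise_preference(ballots, "C", "A"))  # 3 of the 5 voters prefer C to A
```

The last line is the paradox in miniature: the point totals crown A, even though a majority of the voters ranked C ahead of A.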
But two economists, Michel Balinski and Rida Laraki, in their 2007 book, Majority Judgment, make a compelling case that a slight variant of the rank-order method actually can produce consistently valid outcomes. Rather than relying on vote totals based on a preference ranking, the authors contend, a better method is to have each voter evaluate every candidate on the slate, with evaluations ranging from positive (approve) to negative (disapprove). The evaluations for each candidate are then stacked from most favorable to least favorable, and the median (middle) evaluation is assigned as a rating to that candidate. The candidate with the highest rating wins. Consider the same example as above, with Candidates A, B, and C, and suppose that a top-choice ranking from a voter is equivalent to an evaluation of “approve”, a bottom-choice ranking is equivalent to “disapprove”, and a second-choice ranking is considered “neutral” (i.e., neither “approve” nor “disapprove”). Candidate A then receives 3 “neutral” votes and 2 “approves”, giving it a median rating of “neutral”, since if we stacked these votes from most favorable to least favorable, the evaluation in the middle of the stack would be one of the “neutrals”. Similarly, Candidate B’s 3 votes of “disapprove” and 2 votes of “neutral” give it a median rating of “disapprove”, and Candidate C’s 3 votes of “approve” and 2 votes of “disapprove” give it a median rating of “approve”. Candidate C, then, has the highest rating among voters, with a median of “approve”, followed by Candidate A with “neutral” and Candidate B with “disapprove”. The selection of Candidate C seems a more logical outcome in this election, since the majority of voters preferred Candidate C over both Candidate A and Candidate B.
Figure: A Three-Way Election Outcome Using the Majority Judgment Method
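The median-rating calculation for the same five voters can be sketched in Python as follows. The mapping from rank to grade (first choice “approve”, second “neutral”, third “disapprove”) is the one assumed in the example above; the function name median_grade and the choice of the lower-middle grade when the number of voters is even are my own assumptions for this sketch.

```python
# Grades ordered from worst to best.
GRADES = ["disapprove", "neutral", "approve"]

# Rank position on a ballot (0 = first choice) translated into a grade.
GRADE_FOR_PLACE = {0: "approve", 1: "neutral", 2: "disapprove"}

def median_grade(grades):
    """Return the middle grade once the grades are stacked in order.

    With an even number of voters this picks the lower of the two middle
    grades (an assumed convention for this sketch).
    """
    stacked = sorted(grades, key=GRADES.index)
    return stacked[(len(stacked) - 1) // 2]

# Three voters rank C-A-B and two voters rank A-B-C.
ballots = [["C", "A", "B"]] * 3 + [["A", "B", "C"]] * 2

# Collect the grades each candidate receives.
received = {"A": [], "B": [], "C": []}
for ranking in ballots:
    for place, candidate in enumerate(ranking):
        received[candidate].append(GRADE_FOR_PLACE[place])

medians = {c: median_grade(g) for c, g in received.items()}
winner = max(medians, key=lambda c: GRADES.index(medians[c]))

print(medians)  # A: neutral, B: disapprove, C: approve
print(winner)   # C
```

With the same ballots that made A the winner on points, the median rating puts C on top.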
It is interesting to consider what would have happened if this method had been used in the most recent presidential election. Suppose that the method proposed by the authors of Majority Judgment was used with the following five ratings available to voters (from best to worst): “strongly approve”, “approve”, “neutral”, “disapprove”, and “strongly disapprove”. Given the extreme divisiveness that characterized this election, with voters for one of the candidates generally strongly detesting the other, it is not unlikely that the voters who selected Biden would have given him a “strongly approve” rating, while giving Trump a “strongly disapprove” rating, and Trump voters would have done the reverse: giving Trump a “strongly approve” rating and Biden a “strongly disapprove” rating. Since Biden was preferred by a majority of the voters, he would have had more “strongly approve” ratings than “strongly disapprove” ratings, and his median rating would therefore be “strongly approve”, while, for the same reason, Trump’s median rating would be “strongly disapprove”. These results, then, would have mirrored what actually happened in the Electoral College and the popular vote.
But now suppose that Bernie Sanders had decided to run as a third-party candidate. Under either the Electoral College system or the simple popular vote, the entry of Bernie Sanders into the race would almost certainly have siphoned off a significant number of votes from Joe Biden, and could very likely have handed the election victory to the incumbent, President Trump. However, the median rating approach would produce a distinctly different result. Suppose that those who supported Biden over Sanders gave Biden a “strongly approve” rating and Sanders an “approve” rating, while those who supported Sanders over Biden did the reverse. Assume that both groups, however, still gave Trump a “strongly disapprove” rating. Since Trump’s median rating would then still be “strongly disapprove”, he would again be the clear loser of the election. The ultimate contest, then, would be between Biden and Sanders. (Both of these candidates would probably now have a median rating of “approve”, suggesting a tie, but the authors of Majority Judgment provide a simple and elegant method for breaking ties. In this case, the method would have selected as winner the more popular of these two candidates.)
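To make the three-way scenario concrete, here is a rough Python sketch. The nine-voter electorate is entirely invented for illustration, and the tie-break shown (set aside one median grade from each tied candidate, recompute the median, and repeat until the candidates differ) is my understanding of the procedure the authors describe, so treat it as an assumption rather than a faithful reproduction of the book. It also assumes, as in the previous paragraph, that Trump voters rate both of the other candidates “strongly disapprove”.

```python
# Five grades, ordered from worst to best.
GRADES = ["strongly disapprove", "disapprove", "neutral",
          "approve", "strongly approve"]

def majority_value(grades):
    """Successive median grades (as positions in GRADES).

    The median is recorded and removed, then the median of what remains
    is recorded, and so on. Comparing two such sequences item by item
    breaks a tie between candidates with the same median.
    """
    remaining = sorted(GRADES.index(g) for g in grades)
    sequence = []
    while remaining:
        # Lower-middle grade when the count is even (assumed convention).
        sequence.append(remaining.pop((len(remaining) - 1) // 2))
    return sequence

SA, A, SD = "strongly approve", "approve", "strongly disapprove"

# Hypothetical electorate of 9 voters, invented purely for illustration.
blocs = [
    (3, {"Biden": SA, "Sanders": A, "Trump": SD}),   # Biden-first voters
    (2, {"Biden": A, "Sanders": SA, "Trump": SD}),   # Sanders-first voters
    (4, {"Biden": SD, "Sanders": SD, "Trump": SA}),  # Trump voters
]

grades = {name: [] for name in ("Biden", "Sanders", "Trump")}
for count, ratings in blocs:
    for name, grade in ratings.items():
        grades[name].extend([grade] * count)

values = {name: majority_value(g) for name, g in grades.items()}
medians = {name: GRADES[v[0]] for name, v in values.items()}
winner = max(values, key=values.get)  # lists compare item by item

print(medians)  # Biden: approve, Sanders: approve, Trump: strongly disapprove
print(winner)   # Biden
```

In this invented electorate the entry of Sanders does not hand the election to Trump: the two allied candidates tie at “approve”, and the tie-break then favors whichever of them has more support.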
The voting method
advocated in Majority Judgment is effective and suitable not only for
elections, but for competitive activities that involve multiple judges,
including sporting events such as the Olympics, and wine-tastings. And it is to one of these that I would now
like to turn, because as with our recent U.S. presidential election, the
outcome was controversial, but unlike the election, it is remembered as a great
moment in American history. It was the
famous “Judgment of Paris” wine competition of 1976, in which American wines
were pitted against French wines in a blind tasting.
A little background is
necessary in order to highlight the significance of this competition. Before 1976, French wines were generally
regarded as the best in the world. And
more than this, they were considered virtually unrivalled in their
quality. While some other European
nations could lay claim to particular wines of excellence (Spain had its
sherry, Portugal its port, Italy its Chianti, and Germany its sweet white wines, for example), the
idea that any country beyond Europe could produce wine of any type that could
even compare to France’s was unthinkable, even heretical, and this was
especially true of wines produced in America – particularly in California,
where many wines using grape varietals identical to France’s were produced. And wine producers in California at the time
even seemed to believe this themselves, as they often resorted to naming their
white wines “Chablis”, red wines “Bordeaux”, and sparkling wines “Champagne”,
which are all regions in France. The
French eventually raised a successful protest against this practice, as it was
clearly a case of false advertising: what we might today even call “identity
theft”. The practice really did seem to
be a tacit acknowledgement on the part of California wine-makers that their products
were inferior imitations of the French originals.
A British wine merchant named Steven Spurrier decided to put this belief about the inferiority of California wines to the test. (He was himself a believer, as he sold only French wines in his own shop.) He arranged a blind tasting in Paris, judged by several acclaimed experts, which would include a red wine competition (four French Bordeaux wines against six California Cabernet Sauvignons) and a white wine competition (four French white Burgundies against six California Chardonnays). The tasting occurred on May 24, 1976. There were eleven judges in all: eight French, one Swiss, one American, and Spurrier himself. When the tastings were completed, and the outcomes of the competitions were determined, it was revealed, much to the shock (if not horror) of the French judges, that a California wine had won first prize in both the red and white wine categories. (The event is entertainingly portrayed in the 2008 movie Bottle Shock.)
While news of this
competition and its outcome was downplayed in Europe, and particularly in
France, the impact of the event was momentous.
By winning first prize, California winemakers had demonstrated that they
could produce quality wines on a par with the vaunted French wines that were
supposedly incomparable in their excellence.
The event virtually opened the floodgates to a vibrant international
wine industry, as not just wineries in California, but others in both North and
South America, as well as Australia and New Zealand, not to mention Europe
itself, felt emboldened to challenge France’s winemaking dominance, in terms of
popularity, or quality, or both. It
seemed that just knowing that such a thing was possible – a non-French wine
winning first prize in a blind tasting – had a palpable impact on the
industry. In that first competition,
most of the American red wines that had been part of the tasting actually did
end up on the bottom of the ranking. But
when the Wine Spectator magazine hosted a France vs. U.S.A. tasting
competition just ten years later, five of the six American red wines entered in
the contest occupied the top five positions in the rankings. It is a testament to the power of belief, and
reminiscent of the famous story of Roger Bannister, who was the first person in
history to run the mile in less than four minutes, in 1954, something which
until then had been thought to be humanly impossible.  By an often-repeated account, within a year of his setting the record, 37
other runners also ran the mile in less than four minutes, and 300 runners
within a year after that.  By just demonstrating
that the feat could be accomplished, he made it far more achievable for
others. The 1976 “Judgment of Paris” and
its outcome were truly legendary, and their legacy can be seen today in just about
any store that sells wines, where there are aisles devoted to individual
regions, with California Cabernet Sauvignons, Sauvignon Blancs, and Chardonnays
proudly displayed, along with similarly-esteemed Argentinian Malbecs and red
blends, German Rieslings, and the now globally popular Australian brand with the
kangaroo on the label, featuring various varietals at a very affordable
price. French wines are still respected
of course, and still popular, but no winery outside of France, in any region of
the world, now feels compelled to try to pass off any of its products as a
“Bordeaux”, or “Chablis”, or “Champagne” in order to gain respectability, or
even popularity.
But here is where the
story takes a bit of a left turn. Did a
California wine really win 1st Prize in the 1976 Judgment of
Paris? I have to return here to the
authors of Majority Judgment, who had demonstrated that their method of
determining election outcomes, and outcomes of competitions involving several
judges, was superior to traditional methods, and devoid of all of the
shortcomings attributed to them. (Even
Kenneth Arrow, the economist who had demonstrated that all traditional methods
of rank-ordering candidates were flawed, endorsed the authors’ approach.) In their book, they turn their attention to
the Judgment of Paris, and in particular the red wine competition, and note
that the outcome was determined by taking a simple average of the
judges’ scores. But by using their
method, the authors contend that the American wine which supposedly won 1st
prize, the 1973 vintage Stag’s Leap Cabernet Sauvignon, actually should have
taken second place in the competition, with the real winner being the 1970
vintage French wine Chateau Mouton Rothschild, which had been given 2nd
prize in the official ranking. (Both the
official ranking and the authors’ ranking concur that four of the six American
wines entered in the competition occupied the four lowest positions, and that
the remaining one came in 5th place.)
This is a jarring
conclusion, and leads to a profoundly different outcome for this 1976
event. In fact, had this been the
outcome that was officially observed at the time, it might have diminished or
even eliminated the event’s historical significance. After all, in the red wine category, five of
the six American contestants ranked very poorly, or were mediocre at best.  Hence, a 2nd place showing for
Stag’s Leap might have been considered just an anomaly of no particular
consequence. For Americans, at least,
this might make the authors’ voting methodology appear much less attractive. (Some who dislike this outcome, and who are
familiar with the book Majority Judgment, might even be tempted to
observe that both of its authors were employed at a French university at the
time when their book was published.) But
are the authors correct, nonetheless?
I have always been
fascinated by the problem of how to properly rank-order candidates or
contestants based on a voting methodology, and have actively explored various
approaches to this problem. Several years
ago, I came upon an insight which led to the development of my own
methodology. The insight was this: In a competition that involves several
judges, there are actually two types of information that are revealed in the
judges’ scores. The first, of course, is
information about the things being judged, but the second is information about
the caliber of the judges themselves. If
the particular scores of an individual judge tend to correlate highly with the
average scores of the other judges, for example, then this is probably a good
indication that the judge knows what he or she is doing. For example, suppose three things are being
evaluated – call them A, B, and C – by several judges, and Judge #1 determines
that C is the best among the three, followed by B, followed by A. If the collective ratings of the other
judges, based on an average of their point scores, also put C at the top,
followed by B, and followed by A, then this suggests that Judge #1 has made a
competent evaluation.  But if the collective
ratings of the other judges do the reverse, putting A over B over C, then
this suggests that Judge #1 either lacks the ability to discriminate
effectively between the contestants, or has an aesthetic taste that runs
counter to the population as a whole, or both.
It is also possible, of course, that Judge #1 is uniquely and
exceptionally qualified to perform this role, and it is the rest of the judges that
are incompetent rubes, but the former interpretation is far more likely,
particularly if several judges are involved.
Hence, a judge whose individual scores are highly and positively
correlated with the average scores of the other judges should get a higher
weighting in the competition that is being evaluated, while any judge whose
individual scores are either uncorrelated with the average of the others, or,
worse, negatively correlated with the average, should be given a lower
weighting, or perhaps should be disqualified entirely. I have tested my method, using a technique
involving multiple simulations called Monte Carlo analysis, and have found
encouraging evidence that mine is superior to both the conventional method of
simply averaging judges' scores and that which is proposed by the authors of Majority
Judgment.
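The basic idea can be sketched in Python. This is only one possible weighting scheme, not necessarily the one used in my own analysis: each judge’s weight is the correlation between that judge’s scores and the average scores of all the other judges, with zero or negative correlations treated as a weight of zero (that is, disqualification). The scores below are made up for illustration, and the function names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a judge who gives every entry the same score tells us nothing
    return cov / sqrt(var_x * var_y)

def judge_weights(scores):
    """scores[j][i] is the score that judge j gave to entry i."""
    weights = []
    for j, own in enumerate(scores):
        others = [row for k, row in enumerate(scores) if k != j]
        avg_of_others = [sum(col) / len(col) for col in zip(*others)]
        # Assumed rule: negative correlation means the judge is disqualified.
        weights.append(max(0.0, pearson(own, avg_of_others)))
    return weights

def weighted_scores(scores, weights):
    """Weighted average score for each entry."""
    total = sum(weights)
    return [sum(w * row[i] for w, row in zip(weights, scores)) / total
            for i in range(len(scores[0]))]

# Made-up scores for entries A, B, and C from four hypothetical judges.
scores = [
    [10, 14, 17],  # Judge 1
    [11, 13, 16],  # Judge 2
    [9, 15, 18],   # Judge 3
    [17, 12, 10],  # Judge 4 runs counter to the rest of the panel
]

weights = judge_weights(scores)
final = weighted_scores(scores, weights)
print(weights)  # Judge 4 ends up with a weight of 0.0
print(final)    # weighted averages for A, B, and C
```

Other choices are possible, such as squaring the correlations or applying a minimum threshold, and the final ranking can depend on which of them is used.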
And when looking at how
each of the individual judges’ ratings compared with the others at the Paris
competition, one can make some interesting observations. For example, one of the French judges, Pierre
Tari, actually did exhibit a negative correlation between his wine ratings and
those of the rest of the judges, meaning that he tended to rate highly those
wines that the other judges were unimpressed with, and vice versa. In his case, then, it really does appear that
in spite of the fact that he actually owned a winery himself, as a wine connoisseur
he was apparently in the wrong profession.
(This may sound rather
harsh, but I must confess that my own experience as a wine connoisseur
parallels Monsieur Tari’s. Many years
ago I took a wine course, and it was customary to end each class with a blind
tasting of the wines that had been featured that day. I discovered that I invariably rated highly
those wines that were disliked by the rest of the class, while panning the
wines that were popular with everyone else.
It was very humbling at the time, but there was a consolation for
me. Because the wines that I preferred
also tended to be the less expensive ones, I have been able to enjoy my
favorite wines in the years since without it ever being a serious drain on my
budget.)
And three of the judges had ratings which, while positively correlated with those of the other judges, were only slightly so, suggesting that their ratings were not much better than randomly assigned scores. In their case, this would be evidence of a palate that was not very discriminating. One of these judges was the only American who participated, Patricia Gallagher; another was the British wine merchant who had proposed the contest, Steven Spurrier; and the third was Michel Dovaz, a Swiss wine instructor and author of books on wine. (Spurrier and Gallagher actually recused themselves from the competition and did not include their scores in the final tally, not because they felt that they were less than competent to judge, but because both had played an instrumental role in organizing the event.)
But there were four
judges, all French, who were standouts in a positive way, in that each of
their ratings correlated strongly with the averages of the other judges,
strongly suggesting that they had both a discriminating palate and a genuine
cultivated taste for fine wines. These
were Claude Dubois-Millot, a restaurant sales director who was actually
substituting for an absentee judge, economist and winery owner Aubert de
Villaine, restaurant owner Jean-Claude
Vrinat, and Pierre Brejoux, Inspector General of the Appellation d'Origine
Controlee Board, which oversees the production of the finest French wines. Of these four judges, two put the California Stag’s
Leap wine in a tie for second place, and two put it in a tie for third
place. Hence, while all of these judges
agreed that Stag’s Leap was among the top four of the wines in the competition,
they also were unanimous in determining that it was not the best of the
contestants. (By the way, Patricia
Gallagher, the only American judge, rated Stag’s Leap even lower, putting it in
a four-way tie for fourth place.)
The remaining three French
judges exhibited positive correlations between their individual scores and
those of the rest of the judges, but these were not very strong, indicating
that while they had a general ability to discriminate between good wines and
mediocre ones, it would perhaps be too flattering to refer to them as
“connoisseurs”. One of these, Odette
Kahn, upon discovering to her horror that she had assigned first place to the
American Stag’s Leap wine, demanded, unsuccessfully, to have her ballot
returned.
When I use my own method
of weighting the judges’ scores, I find that much depends on exactly how these
scores are weighted.  With some weighting
methods, my results support the contention that Stag’s Leap was the genuine winner,
but with others, they support the 2nd place showing which the authors of Majority
Judgment claim it actually deserved.  In any case, the fact that the four apparently most competent judges were in agreement that Stag’s Leap was not the best
wine in the competition casts serious doubt on the official outcome of the
contest.
So the actual outcome of
the famous “Judgment of Paris” may very well have been different than what is
recorded in the history books. It reminds
me, again, of that movie, The Man Who Shot Liberty Valance, and the
fateful gunfight in which the greenhorn, idealistic lawyer from the East Coast brought
down a much more skilled opponent. The
showdown is remembered as a great, seminal moment in the history of the
fledgling state, but during the movie (and, as a film critic would say at this
point to those who haven’t seen the movie, “Spoiler Alert!”) it is revealed to
a reporter many years later that it was not Ranse Stoddard’s bullet that
actually killed Liberty Valance.
If you ever have a chance
to come out to Washington, D.C., I recommend that you visit the Smithsonian
Institution’s National Museum of American History, which is located on
Constitution Avenue, NW, between 12th and 14th
Streets. In a permanent exhibition
located on the first floor of the East Wing entitled “Food: Transforming the
American Table”, you might still have an opportunity to see proudly on display
an actual bottle of the Judgment of Paris “winner”, 1973 Stag’s Leap Cabernet
Sauvignon, which was donated to the museum by the winery’s owner, William
Winiarski. I sometimes imagine myself standing there,
admiring the exhibit, while some group is standing next to me, listening to
someone describe the historic David vs. Goliath contest that allowed Stag’s
Leap to bring respectability to American wines, and usher in a new era for wine
in the entire world. If that ever
happens, I will simply smile silently, and nod in agreement, while remembering
the most famous line in that movie, The Man Who Shot Liberty Valance,
uttered by the reporter when he learns the truth about what actually happened
during the climactic showdown between Stoddard and Valance:
“When the legend becomes fact, print the legend.”
I know that in these days
of real and imagined incursions and accusations of “fake news”, this attitude
of mine might be controversial, but I do believe that there will always be a
place for myth in history, if the myth heals, unites, and inspires, rather than
injures, divides, and enervates. And so,
in these divisive times, I offer a toast to the American ideal of unity in
diversity, accomplishment in spite of adversity, and hope that in this ideal,
“legend” will always become “fact”, and vice versa. Or as the French would say, so simply and
elegantly:
Je lève mon verre à la liberté!