Who wrote the anonymous White House op-ed? A linguistic analysis.

A systematic, but not definitive, analysis of the language and what it might tell us about the author.

The White House. (Ei Kebir Lamrani / AFP via Getty Images)
Someone in Trump’s White House wrote the op-ed. Who could it be?
Did the word “lodestar” give away Vice President Mike Pence as the author of the anonymous op-ed in the New York Times? It’s an unusual word, to be sure. But language is about more than one person’s ill-fated studying for the GRE.

Political communication scholar Roderick Hart writes, “Words, that is, are not important in and of themselves. They are important because they point to the speaker’s feelings and the situations in which they find themselves. Words are shaped by cultural experiences, and they point back to those experiences. … Words also point to the epistemological assumptions people make.” Hart also points out that people — including politicians and speechwriters — vastly overestimate their ability to monitor the language they use.

This description is part of an explanation of the theory behind the content analysis software Diction. In the spirit of what Hart writes about language, I used this software to compare the anonymous op-ed to a number of speeches by people on the “suspect list.” The list of people is certainly incomplete and based on what I saw scanning through Twitter.

It’s important to note that speeches are written by speechwriters, and the degree of input from the person who is delivering the speech can vary. To the extent that the table below displays linguistic similarities between speeches and the op-ed, it might reflect that the piece was written by someone who has worked in those departments or agencies. Or it might mean nothing at all.

There’s no “aha” moment in the table below. In fact, I’ve included two speeches by Barack Obama (at the beginning and end of his presidency) to show how much difference there can be across communications from the same person. Scholars have used this technique to show how politicians’ rhetorical approaches have evolved over time.

Still, I offer the following in the spirit that if we are going to try to use language to think about who penned the anonymous piece, we should do so systematically, using established tools that are rooted in real theories of language and politics.

The table below features scores for speeches and other texts from individuals on the op-ed “suspect list,” using five linguistic features: activity, optimism, certainty, realism, and commonality. The closer a score is to the op-ed’s, the greater the linguistic similarity. The features are defined as follows (a more scholarly definition is also available):

  • “Certainty: Language indicating resoluteness, inflexibility, and completeness and a tendency to speak ex cathedra.
  • Activity: Language featuring movement, change, the implementation of ideas and the avoidance of inertia.
  • Optimism: Language endorsing some person, group, concept, or event, or highlighting their positive entailments.
  • Realism: Language describing tangible, immediate, recognizable matters that affect people’s everyday lives.
  • Commonality: Language highlighting the agreed-upon values of a group and rejecting idiosyncratic modes of engagement.”

Each of these scores is derived by taking the difference between words that belong in the category that indicates certainty, optimism, etc. and those that exemplify its opposite — ambivalence, pessimism, etc.
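To make the difference-based scoring concrete, here is a minimal sketch in Python. It is not Diction's actual implementation: the two word lists are tiny hypothetical stand-ins for the software's much larger dictionaries, and the function names are invented for illustration. It simply counts words in a category, counts words in its opposite, and subtracts.

```python
import re
from collections import Counter

# Hypothetical mini word lists for illustration only; the real Diction
# dictionaries are far larger and are not reproduced here.
CERTAINTY_WORDS = {"always", "must", "will", "every", "never"}
AMBIVALENCE_WORDS = {"perhaps", "might", "maybe", "could", "seems"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def difference_score(text, category, opposite):
    """Count words in the category minus words in its opposite."""
    counts = Counter(tokenize(text))
    in_category = sum(counts[word] for word in category)
    in_opposite = sum(counts[word] for word in opposite)
    return in_category - in_opposite


sample = "We must act, and we will always act. Perhaps it fails."
certainty = difference_score(sample, CERTAINTY_WORDS, AMBIVALENCE_WORDS)
print(certainty)  # three certainty words minus one ambivalence word
```

A raw difference like this depends on text length, so a real comparison across speeches and an op-ed would presumably need some normalization; that step is left out of the sketch.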

I did a quick Diction analysis using a bunch of speeches — one from Nikki Haley, one from James Mattis, one from Dan Coats (August 4, 2017), and two from Steven Mnuchin. I used several Sessions speeches from his time as attorney general, in a variety of contexts, and several from Mike Pence. The Pence sample includes his RNC speech, a speech to the NRA, and two State of the State addresses from when he was governor of Indiana. I used Ivanka Trump’s speech from the 2016 RNC. Finally, I included two speeches from Obama — his first inaugural, and his farewell address from January 2017.

I took the average if there was more than one speech from an individual. The choice of whom to include, how many speeches to use, and which ones was somewhat arbitrary. I did try to get an interesting cross-section of different political speeches.

Here are the three closest, with differences in parentheses.

Top 3 matches

Activity       Optimism         Certainty       Realism           Commonality
Obama (.15)    Haley (-.46)     Mattis (-.22)   Coats (-.1)       Haley (-.89)
Haley (-.67)   Sessions (.51)   Obama (.24)     Mnuchin (.125)    Pence (1.228)
Coats (-.7)    Kelly (-1.39)    Pence (-.8)     Sessions (2.62)   Coats (1.3)
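The averaging and ranking steps behind a table like this can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analysis: the speaker names and scores are made up, the sign convention (speaker average minus op-ed score) is a guess, and "closest" is taken to mean the smallest absolute difference.

```python
from statistics import mean

# Hypothetical scores on a single feature; these are NOT the article's data.
op_ed_score = 50.0
speech_scores = {
    "Speaker A": [49.0, 52.0],
    "Speaker B": [47.5],
    "Speaker C": [51.0, 53.0, 55.0],
    "Speaker D": [50.5, 49.0],
}


def closest_matches(op_ed, scores_by_speaker, n=3):
    """Average each speaker's scores, subtract the op-ed score,
    and return the n speakers with the smallest absolute difference."""
    diffs = {
        name: mean(values) - op_ed  # assumed sign: speaker minus op-ed
        for name, values in scores_by_speaker.items()
    }
    ranked = sorted(diffs.items(), key=lambda item: abs(item[1]))
    return ranked[:n]


top3 = closest_matches(op_ed_score, speech_scores)
for name, diff in top3:
    print(f"{name} ({diff:+.2f})")
```

Repeating this once per feature would give one column of the table per run. Note that a speaker with a single speech is compared on that one score alone, which is part of why the numbers of observations matter.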

There are several variables, obviously, and different numbers of observations for each individual. There’s some challenge in comparing speeches to an op-ed, but my assumption is that the different genres will provide some protection against scores that reflect conventions of certain genres, and instead pick up on tendencies in word use, thinking, and broad political approach. This approach supplements, and differs from, analyses that have creatively used Twitter to identify language similarities between tweets and the op-ed piece.

Coats and Haley appear three times each on the “leaderboard,” while Sessions and Pence come up twice, along with known “deep state” operative Barack Obama. So this isn’t very definitive, but it does give us a more general sense of which top-level officials the op-ed sounds most like, drawing on a deeper sense of language than we can get without computer assistance.

It’s worth noting that while the averaged scores for Pence and Sessions were not especially close to the op-ed’s, some of their individual speeches come pretty close. Sessions gave a speech to the Des Moines Rotary Club that’s very close to the op-ed on three of the five dimensions (optimism, certainty, and commonality) and another in July 2017 to the National District Attorneys Association that’s close on two dimensions. Pence’s 2016 State of the State was very close for activity and certainty. (But these differences were still larger than the Obama ones shown in the table!)

So we have evidence that some individuals are closer linguistically to the mystery op-ed writer than others. But we also have evidence of conventions of genres; it’s not surprising that op-eds would be more similar to ceremonial speeches like Haley’s United Nations address or the Obama selections. Linguistic analysis is informative, but we still need to proceed with caution, especially since no one individual emerges as the obvious “frontrunner.”

At a time when the paper of record is publishing an anonymous declaration of “resistance” from within the White House, we desperately need to defend information, transparency, and calm, systematic thinking. Whether the piece was written by a major figure or a lower-level official remains unclear, and there are compelling reasons on both sides of that argument. We will probably find out who wrote the piece sooner rather than later, and that person’s position and past will inform how we understand what’s been said. But ultimately, there may not be anything as important as the fact that such a piece was written in the first place.