In 1996, Microsoft unleashed Clippit, better known as Clippy, on users of Microsoft Office. The legendarily irritating mascot-helper spent the following years hovering around the edges of documents, blinking dumbly under his lascivious eyebrows and blurting out, “It looks like you’re writing a letter,” until he was sidelined by the company in 2001, officially recognized as a mistake. Clippy’s problems were manifold. He announced his presence, via a personified avatar, to tell us something that we already knew (or that should have been obvious in the first place) and then proudly offered us little in the way of actual help. He sat and watched us and learned nothing, and repeated himself. He said too much and did too little.
Nevertheless, over 20 years later, the spawn of Clippy are hiding everywhere, guessing what we’re trying to do and offering to help. But Clippy’s successors are doing their best to avoid his mistakes. Most of the time they are faceless, and if they speak, they do so in a disembodied but humanlike voice. They tend to wait to be asked for help, rather than telling us what they think they know unprompted. And when they do offer help, they tend to be more subtle, more accurate or both. They have perhaps more in common with Clippy’s unassuming partners, like Spelling and Grammar Check or AutoCorrect, which spoke through red underlines or small actions carried out on reasonable assumptions (who would intentionally type “teh”?). These tools have followed us and our clumsy fingers to our new smartphones, where they have become both more assertive and more useful, correcting us and only occasionally requiring us to correct them back, and learning all the while.
What does the tech industry want to assist us with now? Email. If you use Gmail, you’ve probably interacted with either Smart Reply or Smart Compose, whether or not you know them by name. Google introduced Smart Reply in 2015, and Smart Compose began rolling out this year. Both, in execution, are self-explanatory. Smart Reply suggests canned responses to inbound emails, based on the company’s best guess at what most emailers might be about to type. The suggestions are short, peppy and often adequate, at least as a start. Sometimes their tone prompts unhappy realizations about what Gmail sees in us. The frequency with which they use exclamation marks emphasizes just how peculiar the language of professional email communication has become (“Sounds great!” “Very cool!” “Love it!”). Smart Compose, in contrast, offers word and phrase suggestions, based on similar judgments, as the user types in real time. You write “Take a look,” and ghostly text might appear to its right: “and let me know what you think.” Its assumptions are more personalized, and they feel that way because it is constantly, visibly, guessing what you’re thinking.
Smart Compose and Smart Reply are, at their core, artificial-intelligence technologies: They are programmed to perform tasks, but also to adapt. To start, Smart Reply was trained on publicly available bodies of email text. (Among the most widely used for such projects is the cache of some 500,000 emails collected during the discovery phase of the Enron investigation.) “What makes machine learning different from regular programming is you look at corpuses of data to make guesses about things,” says Paul Lambert, a product manager for Gmail. “You create a model.”
Once that model was trained to deal with some of the more obvious idiosyncrasies of email communications — corporate disclaimers and phrases like “Sent from Outlook” — Google began training it on anonymized text from actual Gmail users. Phrases that appear frequently enough come under consideration for inclusion in Smart Reply. This, too, requires cleanup. Early testers reported seeing “I love you” as a suggested response to work emails.
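The selection step described above, in which frequent phrases become candidates and unsuitable ones are cleaned out, can be pictured with a small sketch. It is an illustration only, not Google's method: the threshold, the blocklist and the function names are all hypothetical, and the article says nothing about how the real cleanup is done.

```python
from collections import Counter

# Hypothetical values chosen for illustration; the article gives no thresholds.
MIN_COUNT = 2
BLOCKLIST = {"i love you"}  # phrases judged unsuitable for work email


def normalize(text):
    """Trim whitespace, drop trailing end punctuation, and lowercase."""
    return text.strip().rstrip("!.?").lower()


def candidate_phrases(replies, min_count=MIN_COUNT, blocklist=BLOCKLIST):
    """Return phrases seen at least min_count times, minus blocked ones."""
    counts = Counter(normalize(r) for r in replies)
    return sorted(
        phrase
        for phrase, n in counts.items()
        if n >= min_count and phrase not in blocklist
    )


replies = ["Sounds great!", "sounds great", "I love you", "I love you.",
           "Very cool!", "Love it!", "love it"]
print(candidate_phrases(replies))  # ['love it', 'sounds great']
```

Here "I love you" is common enough to qualify but is filtered out by the blocklist, and "Very cool!" appears only once, so it never becomes a candidate.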
Armed with this catalog of phrases — currently more than 20,000, according to the company — the model can then start incorporating more contextual clues: What was the subject of the email? Is the email asking a question? Is it expressing a happy sentiment, or is it offering condolences? Phrases are scored based on their utility — how much typing they save, basically — as well as the A.I.’s confidence in the prediction.
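One way to picture that scoring is as expected typing saved: the number of characters a suggestion would spare the user, weighted by how confident the model is in it. The sketch below assumes exactly that, and the article does not say how the two factors are actually combined. The Suggestion class, the confidence figures and the 0.5 cutoff are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    text: str          # the completion offered to the user
    confidence: float  # hypothetical model probability, 0.0 to 1.0


def score(s):
    """Expected characters saved: typing saved weighted by confidence."""
    return len(s.text) * s.confidence


def best_suggestion(candidates, min_confidence=0.5):
    """Pick the top-scoring candidate, or None if none is confident enough."""
    eligible = [c for c in candidates if c.confidence >= min_confidence]
    return max(eligible, key=score, default=None)


candidates = [
    Suggestion("and let me know what you think.", 0.6),
    Suggestion("at this.", 0.9),
    Suggestion("and tell me whether the quarterly numbers look right.", 0.1),
]
print(best_suggestion(candidates).text)  # and let me know what you think.
```

The longest candidate would save the most typing, but its low confidence rules it out. The short one is the safest guess but saves little, so the middle option wins.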
Both features then take into account how people use them. If, for example, Smart Compose suggests a certain completion, and enough users take it, that one will be more likely to appear in the future. If a canned reply is never used, this is a signal that it should be purged; if it is frequently used, it will show up more often. This could, in theory, create feedback loops: common phrases becoming more common as they’re offered back to users, winning a sort of election for the best way to say “O.K.” with polite verbosity, and even training users, A.I.-like, to use them elsewhere. Such a dynamic would take root only where a behavior is already substantially automated — typed, at work, more as a learned performance than as an expression of will, or even an idea. Smart Compose is, in other words, good at isolating the ways we’ve already been programmed — by work, by social convention, by communication tools — and taking them off our hands.
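The usage signal can be sketched in a few lines: count how often each reply is offered and how often it is taken, drop the ones nobody ever uses, and rank the rest by how often they are accepted. This is a toy model under stated assumptions, not a description of Gmail's system. The ReplyStats class and the three-offer cutoff are hypothetical.

```python
from collections import Counter

# Hypothetical cutoff: how many times a reply is offered before it can be purged.
MIN_OFFERS_BEFORE_PURGE = 3


class ReplyStats:
    def __init__(self):
        self.offered = Counter()
        self.accepted = Counter()

    def record(self, phrase, was_accepted):
        """Log one time a phrase was shown, and whether the user took it."""
        self.offered[phrase] += 1
        if was_accepted:
            self.accepted[phrase] += 1

    def acceptance_rate(self, phrase):
        shown = self.offered[phrase]
        return self.accepted[phrase] / shown if shown else 0.0

    def surviving(self):
        """Drop never-used phrases; rank the rest by acceptance rate."""
        kept = [
            p for p in self.offered
            if not (self.offered[p] >= MIN_OFFERS_BEFORE_PURGE
                    and self.accepted[p] == 0)
        ]
        return sorted(kept, key=self.acceptance_rate, reverse=True)


stats = ReplyStats()
log = [("Sounds great!", True), ("Sounds great!", True), ("Sounds great!", False),
       ("Very cool!", True), ("Very cool!", False),
       ("Noted.", False), ("Noted.", False), ("Noted.", False)]
for phrase, was_accepted in log:
    stats.record(phrase, was_accepted)
print(stats.surviving())  # ['Sounds great!', 'Very cool!']
```

If the ranking decides which replies are shown next, the phrases at the top collect still more acceptances, which is the feedback loop in miniature.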
Using these features is a bit like minding a machine that is trying to learn how to do what you do for a living. And even if it’s the part of the job you wish you didn’t have to do, it still prompts uncomfortable thoughts of replacement — or, if not replacement, then something close to it. It is not remotely implausible that in the near future, a tremendous amount of communication could be conducted in tandem with an A.I.
But constant sweeping changes in office communication — from speaking and writing to phones and printing to emailing and instant messaging — do not tell a tidy tale of increased efficiency or decreased workload, even as they represent progress. Already, an undefined but undeniable portion of workplace email amounts to human self-automation: an uncanny form of communication where clichés aren’t shunned so much as recognized for their usefulness; where a tone of polite enthusiasm is taken to its exclamatory extreme to mask any ambivalence you may have about, say, “circling back later.” One can visualize in the near future hundred-email chains between colleagues unfurling from a single human starting point, composed of nothing but routinized replies.
Depending on what your current inbox looks like, this might not require much imagination at all. A study conducted in 2016 by researchers at Carleton University’s Sprott School of Business in Canada tried to understand the role email had come to play in the modern office. They surveyed “highly educated baby boomer or Gen X” subjects who were mostly “managers or professionals” working in office jobs and found that they spend on average a full third of their workweeks “processing” email. Whatever their titles, they are — like many office workers — to a large extent professional emailers. Even if their roles are otherwise highly specialized, in this significant way they are not. They are their own assistants.
In 1930, John Maynard Keynes wrote that, thanks to new efficiencies, workers of the future could expect “three-hour shifts or a 15-hour week.” He guessed that this would happen within a century. Automation and the abundance it produced have indeed led to countless economic changes, but they did not negate or replace the entire order. Asked for evidence of the success of this newest tool, Google says that Smart Compose is already “saving people a billion characters of typing each week.” This statistic supports one half of what Keynes might have predicted at the dawn of automated communication — the abundance and the glut — but is tellingly silent on the other half, the same half he couldn’t quite see the first time. Self-automation can free us only to the extent that it actually belongs to us. We can be sure of only one thing that will result from automating email: It will create more of it.