Category: Writing

Humane Ingenuity 46: Can Engineered Writing Ever Be Great?

A patent drawing of an automated typewriting machine.

As we await the next generation of engineered writing, of tools like ChatGPT that are based on large language models (LLMs), it is worth pondering whether they will ever create truly great and unique prose, rather than the plausible-sounding mimicry they are currently known for.

By preprocessing countless words and the statistical relationships between them from millions of texts, an LLM creates a multidimensional topology, a complex array of hills and valleys. Into this landscape a human prompt sets in motion a narrative snowball, which rolls according to the model’s internal physics, gathering words along the way. The aggregated mass of words is what appears sequentially on the screen.

This is an impressive feat. But it has several major problems if you are concerned about writing well. First, a simple LLM has the same issue a pool table has: the ball will always follow the same path across the surface, in a predictable route, given its initial direction, thrust, and spin. Without additional interventions, an LLM will select the most likely next word given the words that precede it, based on its predetermined internal calculus. This is, of course, a recipe for unvaried familiarity, as the angle of the human prompt, like the pool cue, can overdetermine the flow that ensues.

To counteract this criticism and achieve some level of variation while maintaining comprehensibility, ChatGPT and other LLM-based tools turn up the “temperature,” an internal variable, increasing it from 0, which produces perfect fidelity to the physics, i.e., always selecting the most likely next word, to something more like 0.8, which slightly weakens the gravitational pull in its textspace, so that less common words will be chosen more frequently. This, in turn, bends the overall path of words in new directions. The intentional warping of the topological surface via the temperature dial enables LLMs to spit out different texts based on the same prompt, effectively giving the snowball constant tugs in more random directions than the perfect slalom course determined by the iron laws of physics. Turn the temperature up further and even wilder things can happen.
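The temperature dial can be sketched in a few lines of Python. This is a toy illustration, not how ChatGPT or any particular tool is implemented: the three candidate words and their scores are invented, and the function names `next_word_probabilities` and `pick_next_word` are hypothetical. It assumes the common approach of dividing each word's raw score by the temperature before converting the scores to probabilities, and it treats a temperature of 0 as always taking the top-scoring word.

```python
import math
import random

# Invented raw scores for three candidate next words (higher = more likely).
TOY_SCORES = {"snow": 3.0, "ball": 2.0, "avalanche": 0.5}


def next_word_probabilities(scores, temperature):
    """Convert raw scores to probabilities, flattened or sharpened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; 0 is handled by pick_next_word")
    # Dividing by the temperature shrinks the gaps between scores when it is high
    # and widens them when it is low.
    scaled = {word: score / temperature for word, score in scores.items()}
    top = max(scaled.values())  # subtract the maximum for numerical stability
    weights = {word: math.exp(value - top) for word, value in scaled.items()}
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


def pick_next_word(scores, temperature, rng=random):
    """Pick one word: always the top word at temperature 0, a weighted draw otherwise."""
    if temperature == 0:
        return max(scores, key=scores.get)
    probabilities = next_word_probabilities(scores, temperature)
    words = list(probabilities)
    return rng.choices(words, weights=[probabilities[w] for w in words], k=1)[0]


if __name__ == "__main__":
    print(pick_next_word(TOY_SCORES, 0))  # always the top-scoring word
    for temperature in (0.8, 1.5):
        probabilities = next_word_probabilities(TOY_SCORES, temperature)
        print(temperature, {w: round(p, 3) for w, p in probabilities.items()})
```

In this sketch, raising the temperature shifts probability away from the top-scoring word and toward the rarer ones, which is the "weakened gravitational pull" described above. The extra variety comes from a weighted random draw, not from any intention about where the sentence should go.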

Yet writing well isn’t about using less frequent words or having more frequent tangents. Great writing forges alternative pathways with intentionality. Styles and directions are not shifted randomly, but as needed to strengthen one’s case or to jolt the reader after a span of more mundane prose. For instance, my writing style for this newsletter, although less serious and less formal than my academic writing style, nevertheless is prone to use the phrase “for instance” and the word “nevertheless.” My sentences tend to be longer than those you might encounter in more casual writing, and I generally avoid starting a sentence with “Anyway,” or ending a sentence with an exclamation point. But sometimes, to underscore my argument, I do use an exclamation point!

Anyway, dialing up the temperature creates variability, leading to different responses to the same prompt; an improvement. But this hack is only on the output side of the LLM; by the time the snowball is rolling around, those hills and valleys are already firmly sculpted by the preprocessing of a distinct slate of texts. In other words, the input of the LLM has already been determined. With many of the LLM-based tools we are encountering today, those corpora are incredibly large and omnivorous. ChatGPT is an indiscriminate generalist in what it has read, because it wants to be able to write on virtually any topic.

Here again, however, there is an obvious issue. Good writing isn’t just the selection and ordering of words, the output; good writing is the product of good reading. Writers aren’t indiscriminate generalists, but tend to be rather choosy and personal about what they read. As humans they also have a fairly limited reading capacity, which means that their styles are highly influenced by idiosyncratic reading histories, by their whim. Good readers can often discern which writers a writer has read, as little stylistic quirks pop up here and there — a recognizable artisanal blend, mixed with some individually developed ingredients. It is hard to see how great writing can come from a model that is a generalist, or from a prompt asking for “a story in the style of” just one writer, or even from an LLM trained on a discerning, highbrow corpus, although each of those might have interesting, skillful outputs.

If we want our LLMs to be truly variable and creative, we would have to train the models not on a mass of texts or even the texts of a set of “good writers” (if we could even agree on who those are!), but on a limited, odd array of texts one human being has ingested over their lifetime, which they think about in relationship to their experience of life itself, and which they process and transform over time. And this begins to sound a lot like a story in the style of Jorge Luis Borges, in which a machine seeks to become a writer to impress human beings, and so it asks someone to assemble a library of great works, and the machine waits patiently for years while its human assistant, engrossed by what they are reading, piles up books next to a comfortable chair.



Humane Ingenuity 45: What AI Tells Us About Art


(“The Library of the Distant Future,” as envisioned by Midjourney, when I was let into the beta in March 2022.)

Before one can become a Cassandra or Pollyanna about the uses or abuses of impressive text-to-image AI tools like DALL•E and Midjourney, it is worth stepping back and reflecting about the fundamental nature of this new technology. What is it actually designed to do?

Just as text generators like GPT-3 are engineered to provide highly plausible sequential arrangements of words, these AI image generators are designed to meet our expectations, visually. This agreeableness is right there in the math, in the way these tools distill millions of images into a multidimensional array of the proximities of various styles and shapes. They angle to be familiar, and from what we have seen so far, they are succeeding.

Note that this familiarity and agreeableness doesn’t mean they won’t surprise us from time to time, or even delight us. Clever questioners have coaxed unusual outcomes out of these tools with creative incantations. But even those images in some way meet our expectations, as they must; they are internally structured, like a golden retriever, to be pleasing to the incantor. I will gladly admit that Midjourney’s rendering of my request to conjure a “Library of the Distant Future” elicited an audible “wow” when it appeared. But then again, it also competently echoed the science fiction book covers of my childhood.

One can’t be churlish about this. I applaud the creativity that has gone into the design of these new tools, and they can be great fun. But they also helpfully highlight, by contrast, the nature of truly creative art.

The best art isn’t about pleasing or meeting expectations. Instead, it often confronts us with nuance, contradictions, and complexity. It has layers that reveal themselves over time. True art is resistant to easy consumption, and rewards repeated encounters. Accomplished paintings challenge easy or unitary interpretations, like Mona Lisa’s smile. The best books are worth reading multiple times, as we discover new elements and are affected differently each time we flip their pages.

As this new field of “AI art” develops, we should push for a higher-order Turing test: Are we inclined to view or read the outputs of these tools more than once, to ponder their deeper significance? Or, no matter how remarkable they may be, despite immediate, uncanny evocations of delight or humor or dread, do these images still exhaust their artistic reserves rapidly? If so, what does that tell us?

Complexity can be added to the machine; technologists are surely working on it. But the fundamental urge to meet expectations forms a major developmental barrier.

An AI text generator very well might spin a decent tale about a monomaniacal hunt for a white whale, perhaps even with copious Biblical references, given the right additional nudges, but would that work ever have the strange richness produced by a human writer familiar with the actual manual labor of whaling, and who is able to find layers of meaning in those seemingly mundane processes? An AI music generator very well might create the chord progression and melody of a decent song, with lyrics about envy and marriage, but can it record the heartbreaking plaintiveness of “Jolene” without the human experience of Dolly Parton?

The desire of AI tools to meet expectations, to align with genres and familiar usage as their machine-learning array informs pixels and characters, is in tension with the human ability to coax new perspectives and meaning from the unusual, unique lives we each live. Dolly Parton and Herman Melville worked within genres, and had their own arrays of common references, but they also exploded them in ways that could not be anticipated. That is a different sort of delight, and art.



Authority and Usage and Emoji

Maybe it’s a subconscious effect of my return to the blog, but I’ve found myself reading more essays recently, and so I found myself returning to the nonfiction work of David Foster Wallace.[1] Despite the seeming topical randomness of his essays (John McCain’s 2000 presidential campaign, the tennis player Tracy Austin, a Maine lobster fest), there is a thematic consistency in DFW’s work, which revolves around the tension between authority and democracy, between high-culture intellectualism and overthinking on one side and low-culture entertainment and lack of self-reflection on the other. That is, his essays are about America and Americans.[2]

Nowhere is this truer than in “Authority and American Usage,” his monumental review of Bryan A. Garner’s A Dictionary of Modern American Usage.[3] DFW uses this review of a single book to recount and assess the much longer debate between prescriptive language mavens who sternly offer correct English usage, and the more permissive, descriptive scholars who eschew hard usage rules for the lived experience of language. That is, authority and democracy.

The genius of Garner, in DFW’s view, is that he is an authority on American English who recognizes and even applauds regional and communal variations, without wagging his finger, but also without becoming all loosey-goosey and anything goes. Garner manages to have his cake and eat it too: he recognizes, with the democrats, that English (and language in general) is fluid and evolves and simply can’t be fixed in some calcified Edwardian form, but that it is also helpful to have rules and some knowledge of those rules so that you can express yourself with precision and persuade others. Even democratic descriptivists should want some regularity and authoritative usage because we all speak and write in a social context, and those we speak with and write to, whether we like it or not, pick up on subtle cues in usage to interpret and judge our intent and status within the community. Garner’s fusion of democracy and authority is immensely appealing to DFW; it’s like he’s figured out how to square the circle.

But Garner’s synthesis only works if the actual communication of your well-chosen words is true to what you had mentally decided to use, and here is where the seemingly odd inclusion of emoji in the title of this post comes into play.[4] Emoji upset Garner’s delicate balance and upend DFW’s intense desire to communicate precisely, because they are rendered very differently on different digital platforms. Emoji entail losing control of the very important human capability to choose the exact form and meaning of our words. (The variation in emoji glyphs also contributes to the difficulty of archiving current human expression, but that is the subject of another post.) See, for example, the astonishing variety of the “astonished face” emoji across multiple platforms:

(Image: the “astonished face” emoji as rendered on several platforms.)

This is, unfortunately but unsurprisingly, an artifact of the legal status of emoji, which, unlike regular old English words, apparently (or potentially) can be copyrighted in specific renderings. So lawsuit-averse giant tech companies have resorted to their own artistic execution of each emoji concept, and these renderings can have substantially different meanings, often rather distant from authorial intent. As legal and emoji scholar Eric Goldman summarizes, “Senders and recipients on different platforms are likely to see different implementations and decode the symbols differently in ways that lead to misunderstandings.” Think about someone selecting the fairly faithful second emoji from the left, above (from Apple), and texting it to someone who sees it rendered as the X-eyed middle glyph (from Facebook; Goldman, deadpan: “a depiction typically associated with death”), or the third from the left (from Google, who knows).

In short, emoji are a portent of a day when the old debate about authority vs. democracy in English usage is a quaint artifact of the twentieth century, because our digital communications have another layer of abstraction that makes it even more difficult to express ourselves clearly. There is no doubt that David Foster Wallace would have dropped many foul-mouthed emoji at that possibility.

  1. Since this post is, in part, about the subtleties and importance of word choice, we might quibble here with the term “essays” for DFW’s nonfiction work. Although it is indeed the term stenciled on the cover of his nonfiction books, what is contained therein is more like a menagerie of what might be best, albeit simplistically, called writing, including steroidal book reviews, random journalistic junkets, and non-random literary slam-downs.
  2. Were DFW still with us and reading blogs, which is, let’s admit it, a laugh-out-loud impossibility, he would likely object to this simplification of his essays that in many cases present themselves more like thick description married with extended—Stretch-Armstrong-level extended—philosophical tangents. He would be doubly annoyed with my needling of this point in a footnote, which is a crass and transparent and frankly lame mimicry of DFW himself, although I hope he would have awarded consolation points for the mobius-strip referentiality here. And objectively, the style of DFW’s writing, both his fiction and nonfiction, combined snoot-grade polysyllabic dictionary-grabbers with unexpected but also well-timed f-bombs, and this fusion has always been something of a tell.
  3. The original title of DFW’s Garner review was “Tense Present: Democracy, English and Wars over Usage,” which is, let’s face it, more clever.
  4. N.B. I use emoji as both the singular and plural form, à la sushi, although this is debated and is a perfect case study in authoritarian vs. democratic English usage. Robinson Meyer talks to the prescriptive language experts and Googles the democratic use of emoji vs. emojis in a remarkably DFW-esque piece in The Atlantic.

The Blessay

Sorry, I don’t have a better name for it, but I feel it needs a succinct name so we can identify and discuss it. It’s not a tossed-off short blog post. It’s not a long, involved essay. It’s somewhere in-between: it’s a blessay.

The blessay is a manifestation of the convergence of journalism and scholarship in mid-length forms online. (For those keeping track at home, this is #7 on my list of ways that journalism and the humanities are merging in digital media.) You’ve seen it on The Atlantic’s website, on smart blogs like BLDGBLOG and Snarkmarket, and on sites that aggregate high-quality longform web writing.

Some characteristics of the blessay:

1) Mid-length: more ambitious than a blog post, less comprehensive than an academic article. Written to the length that is necessary, but no more. If we need to put a number on it, generally 1,000-3,000 words.

2) Informed by academic knowledge and analysis, but doesn’t rub your nose in it.

3) Uses the apparatus of the web more than the apparatus of the journal, e.g., links rather than footnotes. Where helpful, uses supplementary evidence from images, audio, and video—elements that are often missing or flattened in print.

4) Expresses expertise but also curiosity. Conclusive but also suggestive.

5) Written for both specialists and an intelligent general audience. Avoids academic jargon—not to be populist, but rather out of a feeling that avoiding jargon is part of writing well.

6) Wants to be Instapapered and Read Later.

7) Eschews simplistic formulations superficially borrowed from academic fields like history (no “The Puritans were like Wikipedians”).

I suspect readers of this blog know the genre I’m talking about. Am I missing other key characteristics of the blessay? What are some exemplary instances?

UPDATE: Unsurprising griping about the name on Twitter. Please: give me a better name, one that isn’t confused with other genres. Other suggestions: Giovanni Tiso: “essay” (confusing, but gets rid of the hated “bl”); Suzanne Fischer likes Anne Trubek’s suggestion of “intellectual journalism” (seems to favor the journalism side to me). As I’ve said in this space before, writing is writing; I’d love to call this genre just “the essay” or, yes, “writing,” but I wrote this post because I believe if we go that route the salient characteristics of the genre will be lost in a night in which all cows are black.

UPDATE 2: Much headway being made on Twitter in response to this post. Yoni Appelbaum puts his finger on it: “It’s not journalism. It’s not blogging. It’s practicing the art of the essay in the digital space.” That’s right. Thus Yoni’s suggestion for a name: “Simplest is sometimes best. These are Digital Essays – composed, distributed, and tailored for the format.” Anne Trubek and Tim Carmody worked to define the audience. Anne spoke of readers of the print Atlantic, the New Yorker, and other middlebrow gatherings, and authors like Trilling. Tim responded: “The audience for this is similar: para-academic, post-collegiate white-collar workers and artists, with occasional breakthroughs either all the way to a ‘high academic’ or to a ‘mass culture’ audience.”

UPDATE 3: Back to the name: Some perhaps better suggestions are surfacing. Sarah Werner mentioned a word I often use in this space for the genre: “pieces.” Anne Trubek gives it that classic modifier: “thought pieces.” Kari Kraus reminds me that MediaCommons uses “middle-state,” which has some charms, but is a bit opaque.

UPDATE 4: So of course Stephen Fry would beat me to the coinage of “blessay” (thanks, Dragonweb). Again, the point of this exercise is less about the name than about a set of traits. A blessay—or whatever we want to call it—isn’t just a long blog post or a short academic article posted online. It has certain stylistic elements. And it doesn’t rule out other kinds of intelligent online writing.