Blog – Dan Cohen

Humane Ingenuity 46: Can Engineered Writing Ever Be Great?

A patent drawing of an automated typewriting machine.

As we await the next generation of engineered writing, of tools like ChatGPT that are based on large language models (LLMs), it is worth pondering whether they will ever create truly great and unique prose, rather than the plausible-sounding mimicry they are currently known for.

By preprocessing countless words and the statistical relationships between them from million of texts, an LLM creates a multidimensional topology, a complex array of hills and valleys. Into this landscape a human prompt sets in motion a narrative snowball, which rolls according to the model’s internal physics, gathering words along the way. The aggregated mass of words is what appears sequentially on the screen.

This is an impressive feat. But it has several major problems if you are concerned about writing well. First, a simple LLM has the same issue a pool table has: the ball will always follow the same path across the surface, in a predictable route, given its initial direction, thrust, and spin. Without additional interventions, an LLM will select the most common word that follows the prior word, based on its predetermined internal calculus. This is, of course, a recipe for unvaried familiarity, as the angle of the human prompt, like the pool cue, can overdetermine the flow that ensues.

To counteract this criticism and achieve some level of variation while maintaining comprehensibility, ChatGPT and other LLM-based tools turn up the “temperature,” an internal variable, increasing it from 0, which produces perfect fidelity to the physics, i.e., always selecting the most likely next word, to something more like 0.8, which slightly weakens the gravitational pull in its textspace, so that less common words will be chosen more frequently. This, in turn, bends the overall path of words in new directions. The intentional warping of the topological surface via the temperature dial enables LLMs to spit out different texts based on the same prompt, effectively giving the snowball constant tugs in more random directions than the perfect slalom course determined by the iron laws of physics. Turn the temperature up further and even wilder things can happen.

Yet writing well isn’t about using less frequent words or having more frequent tangents. Great writing forges alternative pathways with intentionality. Styles and directions are not shifted randomly, but as needed to strengthen one’s case or to jolt the reader after a span of more mundane prose. For instance, my writing style for this newsletter, although less serious and less formal than my academic writing style, nevertheless is prone to use the phrase “for instance” and the word “nevertheless.” My sentences tend to be longer than those you might encounter in more casual writing, and I generally avoid starting a sentence with “Anyway,” or ending a sentence with an exclamation point. But sometimes, to underscore my argument, I do use an exclamation point!

Anyway, dialing up the temperature creates variability, leading to different responses to the same prompt; an improvement. But this hack is only on the output side of the LLM; by the time the snowball is rolling around, those hills and valleys are already firmly sculpted by the preprocessing of a distinct slate of texts. In other words, the input of the LLM has already been determined. With many of the LLM-based tools we are encountering today, those corpora are incredibly large and omnivorous. ChatGPT is an indiscriminate generalist in what it has read, because it wants to be able to write on virtually any topic.

Here again, however, there is an obvious issue. Good writing isn’t just the selection and ordering of words, the output; good writing is the product of good reading. Writers aren’t indiscriminate generalists, but tend to be rather choosy and personal about what they read. As humans they also have a fairly limited reading capacity, which means that their styles are highly influenced by idiosyncratic reading histories, by their whim. Good readers can often discern which writers a writer has read, as little stylistic quirks pop up here and there — a recognizable artisanal blend, mixed with some individually developed ingredients. It is hard to see how great writing can come from a model that is a generalist, or from a prompt asking for “a story in the style of” just one writer, or even from an LLM trained on a discerning, highbrow corpus, although each of those might have interesting, skillful outputs.

If we want our LLMs to be truly variable and creative, we would have to train the models not on a mass of texts or even the texts of a set of “good writers” (if we could even agree on who those are!), but on a limited, odd array of texts one human being has ingested over their lifetime, which they think about in relationship to their experience of life itself, and which they process and transform over time. And this begins to sound a lot like a story in the style of Jorge Luis Borges, in which a machine seeks to become a writer to impress human beings, and so it asks someone to assemble a library of great works, and the machine waits patiently for years while its human assistant, engrossed by what they are reading, piles up books next to a comfortable chair.

Subscribe to the Humane Ingenuity newsletter:

February 27, 2023

Humane Ingenuity 45: What AI Tells Us About Art

(“The Library of the Distant Future,” as envisioned by Midjourney, when I was let into the beta in March 2022.)

Before one can become a Cassandra or Pollyanna about the uses or abuses of impressive text-to-image AI tools like DALL•E and Midjourney, it is worth stepping back and reflecting about the fundamental nature of this new technology. What is it actually designed to do?

Just as text generators like GPT-3 are engineered to provide highly plausible sequential arrangements of words, these AI image generators are designed to meet our expectations, visually. This agreeableness is right there in the math, in the way these tools distill millions of images into a multidimensional array of the proximities of various styles and shapes. They angle to be familiar, and from what we have seen so far, they are succeeding.

Note that this familiarity and agreeableness doesn’t mean they won’t surprise us from time to time, or even delight us. Clever questioners have coaxed unusual outcomes out of these tools with creative incantations. But even those images in some way meet our expectations, as they must; they are internally structured, like a golden retriever, to be pleasing to the incantor. I will gladly admit that Midjourney’s rendering of my request to conjure a “Library of the Distant Future” elicited an audible “wow” when it appeared. But then again, it also competently echoed the science fiction book covers of my childhood.

One can’t be churlish about this. I applaud the creativity that has gone into the design of these new tools, and they can be great fun. But they also helpfully highlight, by contrast, the nature of truly creative art.

The best art isn’t about pleasing or meeting expectations. Instead, it often confronts us with nuance, contradictions, and complexity. It has layers that reveal themselves over time. True art is resistant to easy consumption, and rewards repeated encounters. Accomplished paintings challenge easy or unitary interpretations, like Mona Lisa’s smile. The best books are worth reading multiple times, as we discover new elements and are affected differently each time we flip their pages.

As this new field of “AI art” develops, we should push for a higher-order Turing test: Are we inclined to view or read their outputs more than once, to ponder their deeper significance? Or, no matter how remarkable they may be, despite immediate, uncanny evocations of delight or humor or dread, do these images still exhaust their artistic reserves rapidly? If so, what does that tell us?

Complexity can be added to the machine; technologists are surely working on it. But the fundamental urge to meet expectations forms a major developmental barrier.

An AI text generator very well might spin a decent tale about a monomaniacal hunt for a white whale, perhaps even with copious Biblical references, given the right additional nudges, but would that work ever have the strange richness produced by a human writer familiar with the actual manual labor of whaling, and who is able to find layers of meaning in those seemingly mundane processes? An AI music generator very well might create the chord progression and melody of a decent song, with lyrics about envy and marriage, but can it record the heartbreaking plaintiveness of “Jolene” without the human experience of Dolly Parton?

The desire of AI tools to meet expectations, to align with genres and familiar usage as their machine-learning array informs pixels and characters, is in tension with the human ability to coax new perspectives and meaning from the unusual, unique lives we each live. Dolly Parton and Herman Melville worked within genres, and had their own arrays of common references, but they also exploded them in ways that could not be anticipated. That is a different sort of delight, and art.

Subscribe to the Humane Ingenuity newsletter:

October 17, 2022

Humane Ingenuity 44: Bookwork and Cloud Labs

Screen Shot 2022-03-09 at 11.32.11 AM.png

(Sara Gothard, Library of Babel, 2017, in the Jamaica Plain branch of the Boston Public Library, part of their recently digitized art collection.)

We have become familiar with how technology, media, commerce, and forms of human expression are deeply intertwined. Streaming music services and apps like TikTok, and the models behind them, encourage the production of shorter songs that begin with the catchiest riff in the track, so as to maximize quick streams and thus revenue; similarly, when radio airplay was the primary way to push the sales of singles, it helped if the lyrics of a potential hit began with the title of the song, sung before the dial could be turned (just ask the incomparable Nile Rodgers). Despite the protestations of the New York Times art critic, NFTs clearly encourage a specific kind of art, namely one with slight variations on a theme rather than artistic diversity — even on serious subject matters — since the centrality of crypto exchange prods artists to think about the community of owners rather than other audiences for their art, like art critics or the general public.

There is no model consciously shaping the form of this newsletter, or some of the other free newsletters that I read. Obviously that’s not the case for Substackers and big media newsletters. Yet there does seem to be a fairly common format in the genre of the newsletter — one of assemblage and pastiche with commentary, shown here.

For that reason I found Whitney Trettien’s wonderful new book Cut/Copy/Paste to be perfect reading for those of us trying to understand and design the new forms of writing, like newsletters, that have surfaced in digital media. Trettien’s book is a diverse collection of assemblages, including religious books that have been spliced together, reimagined works of poetry, and scrapbooks.

Screen Shot 2022-03-16 at 3.11.39 PM.png

Largely from the seventeenth century, these works show the remarkable fluidity of the codex (and the concept of a library) in ways that parallel our current attempts to make use of pixels on screens to transmit knowledge and opinion. The ways that fragments of older works can be spliced and diced together, and how bookmakers could experiment with forms for different audiences, shows how fertile creativity and craft can accompany and improve new media. It should come as no surprise that there are resonant echos in more modern formats like zines and the digital newsletter.

Cut/Copy/Paste is available in an open access web version, with some nice affordances such as full-color, large images and links to digital libraries and archives, but the print version comes in such a delightful size that it’s worth adding to your personal assembly of books.

Two hundred years ago, long before cryptocurrencies, the blockchain, and NFTs, William Blake maintained a (non-distributed) ledger of people who had bought his illustrated Book of Job:

(William Blake, “Subscribers to & Purchasers of the Book of Illustrations of the History of Job,” Beinecke Library, Yale University, 1823-1826.)

Blake’s ledger recorded more success than The Whitworth Gallery’s digital ledger of its minted William Blake NFTs, which in the last year has seen only 10 sales out 52 available nifties. You can get yours for only ~£2000 (varies wildly with the wildly varying price of Tezos cryptocurrency), with proceeds going toward funding social projects in Manchester.

I’ve been intrigued recently by remotely controlled scientific instruments and the emerging idea of the automated lab. It’s not just the whiz-bang nature of such a thing, which seems inevitable given the direction of technology, but the changes it presents to the nature of scientific research. Gaze upon Carnegie Mellon University’s Cloud Lab Project:

During grad school I lived with chemistry doctoral students, who would often carry around multiple timers that would alert them to when they needed to go back to the lab to stir something or add a solution or test a result. (Sometimes it was during the middle of the night.) New machines take care of all of that, including measurements, titrations, the timing of heating and cooling, and many forms of analysis. And more intriguingly, these robotic devices can be placed anywhere and accessed online; all you need to do to run an experiment is write a bit of code. So much for those timers and the campus chem lab.

Once you divorce place, instrumentation, technical lab knowledge, and researchers, the decentralized, automated model reveals some great improvements. In the same way that the digitization of books and resources such as HathiTrust have allowed researchers who are at institutions with smaller libraries, or independent scholars without a research library at all, to do the kind of work that formerly only the privileged could do, the Cloud Lab potentially broadens access to research that usually required a nearby, often very expensive facility. (There are, of course, still huge costs here, and access is limited to those who can pay the costs, thus my emphasis on broadened rather than open access. But still, you can see the latent potential for even wider access in the future.)

Second, because experiments have now been reduced to bits of code — recipes —they can be more easily replicated by others. So we have a fascinating combination of both the beginnings of the democratization of the lab (you don’t need to be at a university with a fancy, pricey lab to do the work), and the ability to re-run experiments at will (you can replicate an experiment with the right recipe and a click).

This combination of enlarged access and increased accountability seems rare in today’s technological environment, and worth highlighting. Too often adding access entails the creeping lack of accountability, putting democratization in tension with responsibility. (Just witness what has happened on the web and in social media.) So I am eagerly following where this lab experiment heads once it scales (I hope) beyond wealthy universities.

Here’s much more from CMU’s team on the concept and function of the Cloud Lab.

Ah, the 1980s, when computer magazines had joyous, goofy illustrations in them.

Screen Shot 2022-03-14 at 2.16.38 PM.png

(Peter Bentley, illustration for “The Art of the FOR…NEXT Loop, Input Magazine, 1984, Vol. 1, No. 1, p. 16.)

Subscribe to the Humane Ingenuity newsletter:

March 16, 2022

Humane Ingenuity 43: Your Own Personal Paul McCartney

Whenever I check out a library book that has been underlined or annotated, I think about the two anonymous students who aggressively marked up Widener Library’s copy of Rollo May’s Man’s Search for Himself:

Screen Shot 2022-01-21 at 10.58.44 AM.png

I hope these two students did in fact meet at some point, although they may have been separated by decades. It would make for a good short story or film (or U2 song).

I also happen to love this passage from Rollo May’s book, which is incredibly relevant to the Humane Ingenuity newsletter.

After some much-needed idleness, being rather than doing, I am back to affirm our relatedness and provide a new year of creative expression. Thanks as always for your readership.

Joel Willick, an engineering student at Northeastern University, has created a delightful robot named Bob ROS, an excellent play on the late, great Bob Ross of PBS’s cult hit “The Joy of Painting.” Bob ROS (Robotic Operating System) analyzes paintings and then tries its best to make a rough approximation.

Screen Shot 2022-01-21 at 2.19.52 PM.png

Bob ROS is not a great painter. It often makes unexpected, almost whimsical errors. But Willick highlights how that is a key part of its charm:

I think Bob ROS is a good influence on the robot’s design philosophy because the mistakes that the robot makes are part of the art that it creates. The art is not just the things it intentionally does, but the things it unintentionally does. And I think embracing the mistakes of robots is just as important as embracing the mistakes of humans.

Yes. Bob Ross’s (and Bob ROS’s) “happy accidents” provide insight into incorporating artificial intelligence into the creative arts. From the conventional prose of GPT-3 to the familiar images of Wombo, AI is getting very good at mimicking genres of human expression. But it still has its weird glitches, and the human brain, in its constant search for expected conformity in textual and visual fields, wants to put those errors in place, to make them make sense. Our need for coherence transforms the artificial into something recognizable and perhaps even wonderful.

It is in this interplay between human interpretation and computational output, both the “normal” and the “odd,” where something fascinating may happen — the possibility or spark of a new story idea or way of painting. In this scenario, AI might be your future creative partner, a digital Paul McCartney who has memorized and digested thousands of songs and genres and internalized their patterns, and can tirelessly riff on the clichéd and the catchy, until some unexpected new fragment emerges for you to develop.

After reading Humane Ingenuity #42, which featured his art and experimentation with NFTs, photograper Noah Kalina gave me a call, which was a kind gesture. (For the record, I don’t like tagging people on social media, which feels like a rude attention-getting ploy, but in this case I should have alerted Noah to my piece, which he found on his own; I extended a mea culpa on the phone.)

Noah is a disarmingly nice and thoughtful person, and we had a genuinely fun conversation. Some of you may consider this an oxymoron, but he is a thinking person’s NFT representative, and I wish — as, I believe, does he — that NFTs had more boosters who were less boisterous.

(Noah Kalina, “Diagonal 1, 20150828,” from the latest edition of his excellent newsletter, which you should subscribe to right now.)

Listening to Noah recount how he went from skeptic to convert — although still with some hesitancy and a frank recognition of the concerns of NFTs’ opponents — I could understand where he was coming from. I won’t relay the specifics of our conversation, but beyond Noah’s own experience, it is clear that some NFTs (like the ones on Lumberland and its neighbors, not those bored apes) exist not only because of tech bro utopianism, but because of numerous institutional and market failures.

We live in a time when it is hard for creative people to get paid a living wage for their work, and that is a tragedy. From music to photography, the pathways to sustainable careers are increasingly and depressingly murky, and the digital realm has largely provided pennies where there used to be dollars. Massive centralized platforms scoop up the majority of the loot. There is also a very unclear preservation path for much digital art, which I’ve explored in prior issues of this newsletter, and years ago in a chapter of Digital History.

I still remain concerned and skeptical about many elements of NFTs, probably because I’m approaching them as a historian and librarian, rather than a photographer or artist. I worry about the claims of decentralized permanence in the blockchain, across not just years but decades and centuries. I wonder about friction with existing conventions around copyright, possession, and accessibility. They may be laggards with technology, but libraries and museums have financial, labor, and social structures that are just as important to the business of maintaining texts and images for the long run.

Even if we solely focus on them as artistic investments, it seems like a major problem that NFTs are denominated in a currency which itself is also highly variable, rather than in a fiat currency. Ether, the cryptocurrency that you can use to buy Noah’s art, has dropped in value by over 50% in the last two months. Imagine that all paintings dropped in value by half, regardless of the artist or individual work of art! That’s off-putting to the broader participation of art lovers and investors, and probably unsettling for even the most committed NFTer.

Anyway, I enjoyed talking with Noah and appreciated his helpful and generous perspective. It would be good to have some further conversations between institutions and artists to see where the former could help the latter. For instance, instead of crypto-based art registration and preservation, could there be some version of Perma.cc for this purpose? That is, a library- and museum-run decentralized permanent record system that artists like Noah could use, without the troubling casino chips of cryptocurrency, and with a better and more robust preservation path for the images themselves?

Screen Shot 2021-12-08 at 12.47.37 PM.png

Kaigai, Tennen, Tennen hyakkaku Volume 1 / [Kaigai Tennen] Volume 1, (Kyōto: Yamada Unsōdō, [Meiji 33-34 [1900-1901]). (Preserved and digitized by the British Library, public domain.)

Subscribe to the Humane Ingenuity newsletter:

January 24, 2022

Humane Ingenuity 42: Not So NFT

(Noah Kalina, Lumberland / 20180716)

Noah Kalina is a gifted photographer who has a commercial practice and also works as an artist. He is probably best known for his Everyday project, in which he has been taking a photograph of himself each day for the last two decades. I am more interested in his nature photography, which is uniformly gorgeous. Noah lives in Lumberland, in upstate New York, and his photos across the seasons — of a single tree or river bend — are evocative and engrossing.

I want to buy a print of one of these photographs, but I can’t, for reasons you can probably imagine, since it is 2021: these remarkable images are only available as NFTs. Thus far, as I write this newsletter, Noah has sold 16 Lumberland NFTs, for a total of 13 ETH (Ether cryptocurrency), which is about $55,000.

Good for him! I want to see Noah’s art supported, and if I can’t throw old-timey U.S. dollars at him in exchange for physical media, I’m glad that he is auctioning off certified links to JPEGs for something equally ethereal. May he convert his ETH to USD ASAP.

But this feeling is bittersweet. Is this how we are going to support the arts and culture in the future? Are books, for instance, going to have associated NFTs? (Seriously, don’t look now.)

Noah’s extraordinary photography is not even in same ballpark as most NFTs, which tend toward disposable doodles and garish digital art. And yet…they are now in the same cinematic universe, with the same cartoonish twists and turns. One of the Lumberland NFTs, which Noah sold just last week for 0.408 ETH ($1,729), was put back on the market for a quick flip. First, it was listed by its owner for the juvenile price of 420.69 ETH (a cool $1,778,845), before it was lowered to 10 ETH ($42,284).

Regardless of artistic merit, because the underlying technology of NFTs is so aggressively decentralized and opposed to traditional forms of institutional, legal, and social forms of trust and value, to succeed they must rely instead on the cohesion that comes from an imagined community (of Bored Apes or VeeFriends), but since such communities often have weak ties — weakened further by online anonymity — they are currently only viable when supercharged by a speculative financial mania.

Noah Kalina may take beautiful photographs, but this is not a pretty picture.

[Further reading: Robin Sloan’s recently published jeremiad, “Notes on Web3,” provides a fuller humanistic rebuke to this creeping financialization of everything, and the creepy notion that all transactions will live on forever in a consumption ledger.]

The world without us: a map of the world with just green spaces and water, by Jonty Wareing:

Screen Shot 2021-11-19 at 1.07.41 PM.png

(The map defaults to London, but you can go anywhere. Above, of course, is Boston.)

Last week in our library, Charlotte Wiman, a Northeastern grad student in paleohydrology, presented some fascinating research about the future of the Mississippi River on a quickly warming planet. She projected forward by looking backward, specifically by finding detailed descriptions of the river and its morphology in old books.

(Plate from Harold Fisk, Geological Investigation of the Alluvial Valley of the Lower Mississippi River, 1944.)

Taking measurements from the maps, cross sections, and diagrams within these books, Wiman and three colleagues were able to generate a hydrological model going back centuries, to a time in the middle ages when last there was a warming trend in the Americas. They then reversed the timeline of this model to see what the Mississippi will look like centuries in the future. Their unsettling conclusion: The mighty Mississippi will be much less mighty, with vastly increased evaporation along its entire pathway.

[Charlotte Wiman, Brynnydd Hamilton, Sylvia G. Dee, Samuel E. Muñoz, “Reduced Lower Mississippi River Discharge During the Medieval Era,” Geophysical Research Letters, 19 January 2021.]

Screen Shot 2021-11-19 at 1.35.23 PM.png

Previously covered in Humane Ingenuity: the potent combination of human expertise and AI processing. A lingering question: how much “human” is needed? In a new paper on the identification of galaxy types, “Practical Galaxy Morphology Tools from Deep Supervised Representation Learning,” Mike Walmsley, Anna M. M. Scaife, et al. find that you don’t need much. Given a relatively small number of human-categorized shapes — just around 10 examples — machine learning tools can extract similarly shaped clusters from nearly a million examples with near 100% accuracy.

They have even built a little interface so you can find galaxy shapes yourself.

Meanwhile, back here on Earth: “For legible pages from World War I handwritten diaries held at the State Library of Victoria, AI services are able to correctly transcribe them at a level between 10% to 49% accuracy.” Not great! Understanding century-old cursive handwriting may end up being one of the hardest problems in AI/ML.

(Sofia Karim, Lita’s House – Gallows (ফাঁসির মঞ্চ) / I (detail), 2020, photographic drawing, from the new Infinitude exhibit at Northeastern University.)

Subscribe to the Humane Ingenuity newsletter:

November 22, 2021

Humane Ingenuity 41: Zen and the Art of Winemaking

Here are sixteen “sketches of a 3D printer by Leonardo da Vinci,” as envisioned by AI using those words as a prompt:

By Rivers Have Wings and John David Pressman, using a CLIP-guided diffusion.

Multimedia essays from the Plant Humanities Lab were recently posted and are worth a look. These pieces use Juncture, a new open-source tool developed by JSTOR Labs that provides a scholarly version of the scrollingtelling that has become common in more mainstream media since the New York Times published “Snow Fall: The Avalanche at Tunnel Creek.” Interactive maps, archival images, and text are blended into a compelling narrative about specific plants, and behind the scenes the PHL connects to repositories of data, art, books, and articles about those plants, including WikiData, JSTOR, Artstor, and the Biodiversity Heritage Library.

Screen Shot 2021-10-08 at 3.32.05 PM.png

All of this tech is in the service of rich narratives about the origins and cultural roles of plant life. Despite the variety, there are common themes: the loss of biodiversity; how commerce and cultural exchange have been global for millennia, not just decades; how imperialism made those interactions extractive and deeply troubling; and yet also how indigenous cultures and profound local knowledge have managed to shape all of our lives, food, art, and beliefs.

A good place to start is with the strange history of the banana, as illustrated by Ashley Buchanan in one of her Juncture-powered PHL pieces.

Screen Shot 2021-10-12 at 1.13.24 PM.png

The alphabet of banana varieties

Wine is one of the oldest plant derivatives. A year ago, at the lowest point of the pandemic, our family decided to try our hand at making wine. We had several hundred pounds of grapes shipped to us from the West Coast. What followed was great fun; I heartily recommend the entire experience, from pressing the grapes to bottling.

There were a lot of steps in between, but there is really not that much to making wine — is it, after all, just the fermentation of grape juice. However, small chemical traits and decisions can change the profile of a wine dramatically. Little additions the size of a thimble in a large barrel might radically alter the ultimate composition.

In addition to the reds we made from Zinfandel grapes from Sonoma, we made three batches of Chardonnay from grapes grown near the Columbia Gorge in Washington, and we altered each batch with minute variations: one using a common American yeast, one using a rare yeast isolated from the Rhône Valley in France, and a third in which we added a secondary process, called malolactic fermentation, using bacteria from Oregon. Remarkably, even though 99% of the liquid is the same in each batch, they taste subtly but noticeably distinct. Much of this comes from the differing balance of three acids — tartaric, malic, and lactic. Malolactic fermentation, as the name suggests, transforms malic acid into lactic acid, and the taste from a sharper profile into a rounder, milkier one. (Yes, lactic as in milk.)

As with all of our contemporary pursuits, apps and computational methods are available to guide you. You can fill digital notebooks with measurements and analyses of the acids, sugars, and other chemical properties of the juice. If you really want to nerd out, you can download gigantic spreadsheets that allow you to tweak those properties through extensive calculations.

Screen Shot 2021-10-21 at 3.53.47 PM.png

Pixar, the legendary animation studio, says that their process of rendering computer graphics is about “turning math into emotions”; similarly, winemaking, among some practitioners, is the process of turning math into taste.

But winemaking is instructive in our technological age because it pushes back against these algorithms. It may be chemical, but it also feels alchemical. The long time horizons of winemaking — it takes a year or more to assess how your handiwork came out — delays and clouds your knowledge, and renders the math uncertain. Winemaking requires decades of trial and error to become truly proficient, and one must accept that not everything can be completely controlled.

Relax and turn off that computer. In vino, mysterium.

Screen Shot 2021-10-21 at 4.13.06 PM.png

Another pathway to zen: place a raindrop anywhere in the United States and watch where it ends up, and how it gets there through streams and rivers.

Screen Shot 2021-10-14 at 11.52.26 AM.png

Screen Shot 2021-10-14 at 11.51.19 AM.png

Ted Underwood on neural models of language:

The immediate value of these models is often not to mimic individual language understanding, but to represent specific cultural practices (like styles or expository templates) so they can be studied and creatively remixed. This may be disappointing for disciplines that aspire to model general intelligence. But for historians and artists, cultural specificity is not disappointing. Intelligence only starts to interest us after it mixes with time to become a biased, limited pattern of collective life. Models of culture are exactly what we need.

This is exactly right, and well put. In a prior issue of this newsletter, I’ve called this experimenting with “stereotypical narrative genres,” and as a historian, it’s useful. The computer, by providing limitless examples of a “style,” can help us see the contours of that style, its peculiarities of language use and in turn its cultural context and connections.

[“Mapping the Latent Spaces of Culture: To Understand Why Neural Language Models Are Dangerous (And Fascinating), We Need to Approach Them as Models of Culture,” by Ted Underwood.]

Alvina Lai on the video game Spiritfarer, which has an unusual character…

…the Collector, a well-dressed, finicky walrus who goes by the name of Susan. When the player first meets Susan, she describes her distaste for the collection of “junk.” Nonetheless, Susan is the in-game collections achievement tracker, and will reward the player for finding objects throughout the game. While Susan is a minor character in-game, “collection management” is a real-world profession integral to museums, libraries and other curatorial institutions of all themes and sizes. This post will discuss Susan’s role and her views on collecting, and compare Susan and her ideas to real-world institutional collection management practices. In comparing real-world collection practices with those in Spiritfarer, I hope to show how Spiritfarer’s “reconstructive” storytelling and its collection mechanics can help shed light on the memory function of collection management policies and practices of libraries and museums.

Subscribe to the Humane Ingenuity newsletter:

October 22, 2021

Humane Ingenuity 40: In Sight

I’m back from a summer hiatus — perhaps not into the carefree fall I (and you) had hoped for. But with students streaming once again into my library, the beginning of this academic year still has that rejuvenating anticipation of new experiences and encounters — a prompt for all of us to shake out of our complacency, to open ourselves once again to new ways of seeing.

Seeing as an underexplored, strange experience animates the art of James Turrell. Our family made one of our pilgrimages to Mass MoCA to see the new Turrell exhibit “Into the Light,” which I recommend if you can make the journey to the far northwestern corner of Massachusetts. The exhibit restages some of his classic approaches to abstract lightwork, including a room where a floating pink cube is actually, somehow, an inset into a curved wall, and darkened spaces with just enough reflected light to confuse and, ultimately, enthrall.

Turrell’s art can be read on many levels, and I am too amateur an art critic to give it that that proper multilevel reading, but my shorthand for what he is trying to do — beyond art and architecture’s traditional interest in color, form, space, and the interactions thereof, and the presentation of some deeply engaging, often transcendent experiences, like his now ubiquitous Skyspaces — is the disaggregation of seeing itself.

Two years ago on an episode of the What’s New podcast, I interviewed Ennio Mingolla, the head of the Computational Vision Laboratory at Northeastern University, and Ennio briskly shook up my ill-conceived, almost comically oversimplified notions about seeing. Human sight is not even close to a representation of the world around us, with the eye like the megapixel sensor at the heart of a digital camera. Instead, it is an aggregation of many distinct skills we have accumulated over the course of evolution, such as the ability to separate objects from backgrounds, the sense of when an object is coming toward us or moving away, and the talent of discerning colors at the periphery or in the center of our field of vision. Together, through a mysterious process in the brain, these elements are nearly instantly synthesized into something comprehensible, appearing as ho-hum as a hotel lobby painting.

Turrell rips that visual complacency apart, presenting to the eye profoundly abnormal situations that confront us with the wonder of vision itself. In his most powerful work in the Mass MoCA exhibit, shown above, you are placed into a cavernous room with no defined edges and an ever-shifting “screen” of color. In this environment, your brain cannot perform its tricks: it is unclear how far away the walls and screen are, or even if they firmly exist, and your peripheral vision and focal vision seemingly reverse their roles.

Because your regular assembly of sight has been scattered — the magic of the synthesis dispelled — you “see” new things. Subtle transitions between the screen tones make the room feel like it’s in a hazy cloud; the color you see on the backs of your eyelids when you blink changes repeatedly and begins a conversation with your open-eyed view; and when a strobe light startlingly comes on, for the first time you see…well, I don’t want to ruin the whole thing for you. Go see it. (And if you do, get reservations weeks in advance; they only let a dozen people in at a time, which makes it even more special.)

The larger lesson of Turrell’s art is that our apparently obvious views are complicated composites that should be challenged and deconstructed. Do not be lulled by the faux Rothko and mellow Musak in the hotel lobby. As this season of the Humane Ingenuity newsletter begins, I invite you to put yourself in the headspace of a first-year college student, curious and skeptical, averting your eyes from our monochromatic media landscape as you seek a more subtle and colorful world.

[For some of my previous writing on Mass MoCA exhibits, see “The Artistic and the Digital” (2007); “Sol LeWitt and the Soul of Creative and Intellectual Work” (2008); “For What It’s Worth: A Review of the Wu-Tang Clan’s ‘Once Upon a Time in Shaolin’” (2016)]

A visualization of the logos of 5,837 metal bands, grouped by theme, similarity, font, and “13 Dimensions of Doom”:

Screen Shot 2021-09-08 at 9.04.24 AM.png

Extra headbanging points awarded for the creative use of:

Screen Shot 2021-09-08 at 9.13.18 AM.png

An editor’s note about my media production: Over the summer, and as I have to remind myself to do every few years, I once again consolidated what I write (and broadcast and post) on my online home for the last twenty years, dancohen.org. I am still planning to use Buttondown to send this newsletter to those who like to receive it by email; I like supporting small developer shops, and Justin does a great job with the mechanics of newslettering. But I’m moving finished issues back over to my own domain, so they can commingle and be archived with my other work rather than living elsewhere.

For those new to Humane Ingenuity, that means that you can now access back issues on my own domain, and that’s also where you can subscribe (as always, for free) to the newsletter.

September 10, 2021

Humane Ingenuity 39: A Circle of Keytars

(Alice Baber, Noble Numbers, 1964-1965, acrylic on canvas, Smithsonian American Art Museum.)

The Leventhal Map & Education Center has a new tool called Moviemaps that allows you to pair an explanatory video with a map (or maps) in a separate window, and that can contain hidden triggers within the video that zoom and pan the adjacent map to highlight certain elements. At the same time, viewers can diverge at any time from the guided script to explore the map on their own. It’s a good model and a clever use of IIIF (the International Image Interoperability Framework).

Here’s a fun, brief introduction by Garrett Dash Nelson, with zooming and snapping hand motions reminiscent of Minority Report:

And here’s a fuller example on the history of trolleys in Greater Boston:

Note that this will work for other digital images, not just maps. I can imagine many uses for education and explanatory journalism.

Speaking of journalism: If news is the first rough draft of history, then we are in clear danger of losing that draft, and with it, considerable knowledge. This fact is made clear in a new report from the Donald W. Reynolds Journalism Institute at the University of Missouri.

There currently is no clear pathway from the systems holding born-digital news content today to some version of publicly accessible archives of the future. It does seem apparent from this study that, despite many risks and challenges, contemporary news content is still valued enough to not be deleted. The window of opportunity for long-term preservation is still open for a great deal of born-digital news content, but that opportunity will not last indefinitely.

As Richard Ovendon makes clear in his outstanding recent work, Burning the Books: A History of the Deliberate Destruction of Knowledge, we can lose precious texts through conscious neglect, not just events like the devastating fire that destroyed the University of Cape Town’s library last month.

The report underscores how news now largely originates in complex and precarious back-end platforms—often a “headless CMS,” or a content management system that reporters and editors work in—and from which a newspaper’s website or app is rendered on the fly, with completely separate code.

The report also makes clear that we can’t simply rely on the Internet Archive to save the news for the future. IA’s web crawler can only save publicly accessible URLs and even then only every so often, and given how much more is in the headless CMS and how many news orgs increasingly gate their content, so much is already being, or will be, lost.

I like the conclusion of the report, which is similar to what the Perma project has done in the legal realm with digital content that is linked to from court decisions: proactively create an alliance between news organizations and libraries, which can act as trusted long-term repositories for the full content of the CMS.

My household’s favorite song from the Eurovision contest this year was Iceland’s Daði og Gagnamagnið’s “10 Years.”

Their performance featured an iconic moment when they combined their three arc-shaped keytars into a joyous synth circle:

A short documentary about how they crafted the keytars made me like the band even more, revealing a dual passion for high-tech electro pop and low-tech woodworking methods.

(Roger Brown, Natural Bridge, 1971, oil on canvas, Smithsonian American Art Museum.)

May 25, 2021

Humane Ingenuity 38: The Vigoda Verification

Sixty years ago, illustrator Arthur Radebaugh drew scenes from the future — that is, our present — including, quite presciently, remote education and work, self-driving cars, and an “electronic home library.” His Sunday strip that ran in newspapers, including the Chicago Tribune, was called “Closer Than We Think.”

(Recliners + the scrolling text of a book on the ceiling, yes please.)

Like the colors he used in the strips, all of this was bright and optimistic, and, of course, only half of the story. Within these frames, Radebaugh could not tackle, say, the complexities and drawbacks that would inevitably accompany a “one-world job market” or “push-button education.”

Viewing Radebaugh’s work today is jarring, and oddly, it now seems monochromatic. With bestsellers such as Kim Stanley Robinson’s The Ministry for the Future, this past year has been a reminder that at its best, science fiction can act both as entertainment and as a kind of cognitive behavioral therapy, letting us imagine and visualize the future in advance, and thus, hopefully, temper its stress and impact. The problem for the SF author, as always, is how to balance wide-eyed amazement with more realistic engagement with the implications of what is to come.

—-

In the latest issue of the Journal of Digital Media Management, Dony West, Caitlin Denny, and Rebecca Ruud, archivists at Paramount Pictures, have an interesting case study: “Integrating Artificial Intelligence Metadata Within Paramount’s Digital Asset Management System.” They ran AI tools over a hundred years of film stills from the legendary movie studio.

After reading this article, I hereby declare that we replace the Turing Test with the Vigoda Verification, in which the quality of an AI system is a measure of its ability to identify Abe Vigoda in a large corpus of images:

By using separate fields, the team can compare the AI-powered metadata against the metadata provided manually. A great example of the difference in keyword results can be seen by doing an ‘all content’ search and comparing numbers of images tagged with Abe Vigoda, the beloved actor from ‘The Godfather’ franchise.
The manually tagged metadata resulted in 119 images of the actor; in contrast, the AI-powered metadata fetched only 22 images from the same set of photographs.

Not great, but the authors note that this is just a start and the direction is promising:

Most of the new and correct ‘Celebrity’ tags created by the AI platform were of crew members on set and political figures at events — people that the librarians would not normally tag in the ‘Keywords’ field. For example, the AI tagged New York City Mayor Abraham Beame on the set of ‘Three Days of the Condor’ (1975) — someone who had not previously been manually tagged. The AI celebrity detection was also able to tag the same celebrity across their lifetime, recognising Gloria Swanson from her 1924 role in ‘Manhandled’ as well as in the 1950 film ‘Sunset Boulevard’.

Still:

There are lesser known talent or non-starring actors that rely on manual tagging based on the team’s personal entertainment and film knowledge…A Stills Archive team favourite ‘Keywords’ tag is of Catherine Coulson, most recognisable as the Log Lady in David Lynch’s ‘Twin Peaks’ television series, showing up as First Assistant Cameraperson on ‘Star Trek II:The Wrath of Khan’. This is someone who would not normally be listed in the set of talent to tag, but the team’s personal knowledge was used to identify and tag her. She was not tagged by the AI platform.

The Coulson Conundrum leads us once again to a common theme of this newsletter: We need to imagine a healthier, more productive collaboration between human experts and raw AI power. These processes and systems are yet to be developed, but seem like a key project for the 2020s.

Global Hip Hop Studies is a new academic journal, and I found one of its first articles utterly fascinating. Ethiraj Grabriel Dattatreyan and Jaspal Naveel Singh present an ethnographic study of a small rap studio in Delhi, India, in a way that borders on the cinematic. The drama follows the collision of local culture and a group of friends with YouTube and GarageBand and the world far beyond Delhi, with unexpected results.

A decade ago, Singh brought some do-it-yourself recording technology to the south side of Delhi, and introduced it to a cluster of young MCs who were working together on dance moves and rap styles. The impact of music tech and the internet — how a genre and culture that began in the Bronx made its way around the world, and through videos and MP3s posted online, to this corner of a huge city in India — is amazing to watch:

Sonal, a Sikh b-boy, who had travelled on the metro for over an hour from his working-class neighbourhood in West Delhi to participate in the day’s session, stood in the narrow area between the bed and the recording equipment. As he sat, he quietly and intently watched as Singh demonstrated how the recording and music production technology that he brought with him from Germany worked. A group of young men was on the veranda just outside the apartment, where Singh had placed a small cot and a couple of plastic chairs. They were huddled around a smart phone listening to a new track on YouTube that one of them wanted to share with the others; a Nigerian hip hop-inspired pop song recorded by an underground artist from Lagos.
All of these young men had been b-boying for several years prior to Singh’s and my arrival in Delhi to conduct research on the local hip hop scene, a scene each of us had got wind of through underground hip hop networks in our respective national contexts – the United States and Germany. Both of us were curious about how these young men living on the margins of Delhi’s explosive growth and development in the last ten years had found hip hop and had each, respectively, travelled to Delhi to do ethnographic research in the scene. As we got to know them, it became evident that the infrastructural imaginaries, made possible as a result of 3G and 4G network expansion in India, allowed these young men who live in the marginal habitations of Delhi to access and make b-boying their own. By watching YouTube videos and connecting with b-boys from all over the world on social media, they learned the latest takes on classic dance moves that originated in the South Bronx over five decades ago. Videos of b-boys from Seoul, Marseille, New York and Los Angeles taught them how to top rock, baby spin and airflare.

The concept of infrastructural imaginaries is new to me, but once you hear it and think about it, it sticks.

Sonal, by the way, is a pseudonym, and he went on to become one of the most famous rappers in India. And while that sounds like a great ending to the drama detailed in the article, the authors note the sad downside to what happened. The DIY tech that transfixed the young MCs in South Delhi ultimately ended up pushing them away from each other. What began as some teenage friends huddled around a smartphone, watching hip hop videos from America, France, Korea, and Nigeria, and then dancing those moves and rapping those verses together across the streets of their city, is lost as the ability to record and market themselves individually inevitably takes over.

A reminder that what technology and the internet can bring together, it can just as easily pull apart.

(E. Gabriel Dattatreyan and Jaspal Naveel Singh, “Ciphers, ‘hoods and digital DIY studios in India: Negotiating aspirational individuality and hip hop collectivity.” Global Hip Hop Studies, 1(1), pp. 25-45.)

May 5, 2021

Humane Ingenuity 37: Data and the Humanities

If there’s one thing we’ve learned about the many datasets we’ve wrestled with this year, it’s that all the data — every single point — is the result of human decision-making.

These essential words are the lede of a great reflection by Erin Kissane, a co-founder of the COVID Tracking Project and CTP’s managing editor. The project is a terrific case study in humane ingenuity, because what seemed like a straightforward data and technology project — tracking COVID cases across the United States — was in fact primarily animated by skills from the humanities and deeply imbued with a humane spirit.

As majors have sharply declined over the last decade, a thousand verbose defenses of the humanities have been published. But as all writers should know, it’s better to show than to tell, and CTP did a damn good job embodying key methods and ethical choices from the humanities. The project also made the implicit point that data work doesn’t belong, by default or fiat, to STEM fields.

CTP put those values into practice through the foregrounding of uncertainty, context, and care. Although the project compiled reams of numbers, they refused to let those numbers drift off into pure quantitative metrics, and they always noted the potential fallibility of each digit. Human error, at the state or local level or within the project itself, the peculiarities of health reports or highly variable definitions, were all measured and analyzed by CTP staff and volunteers. The data was not just accumulated into a spreadsheet; it was tightly coupled with careful interpretation, glosses, and a close reading of primary sources.

This lack of clarity was present in most of the metrics we collected, and meant that we spent hundreds, maybe thousands, of person-hours reading footnotes in obscure state PDFs and watching press conferences to try to catch any turns of phrase that would tell us what — and who — was really represented in a given figure. Definitional problems substantial enough to shape whole narratives about the pandemic haunted our work all year, and we tried to communicate both the answers we found and the uncertainty we encountered.

Kissane’s conclusion points toward an alternative digitial world that has the humanities at its core:

I suspect that a disciplined commitment to messy truths over smooth narratives would also breathe life into technology, journalism, and public health efforts that too frequently paper over the complex, many-voiced nature of the world.

“Generative Unfoldings” is a new exhibit of fourteen software artworks that adopt a humane perspective — sometimes serious, sometimes humorous.

Philipp Schmitt’s “Curse of Dimensionality” creates, on the fly in your web browser, a pairing of an abstract idea for a figure in a science or philosophy journal, and then illustrates it with random but plausible bits of visualization:

Many of the images Schmitt’s code produces remind me of Chad Hagen’s “Nonsensical Infographics,” a similar kind of critique through design:

Sprawling. Fast-moving. Ephemeral. The US Post operated a gossamer network, capable of rapidly spinning out new tendrils to distant places and then melting away at a moment’s notice.

My former colleague Cameron Blevins has a new book out from Oxford University Press, Paper Trails: The US Post and the Making of the American West. As with the COVID Tracking Project, Paper Trails shows the potency of uncovering or producing a trustworthy, unique dataset. In this case, the data comes from a set of tables in old print volumes, which ended up on a CD-ROM, and then were ported to Dataverse/Github. The migration of this data into a modern format is a great story in itself, as Blevins details in a “data biography”:

Richard W. Helbock, a postal historian and philatelist, published United States Post Offices, an 8-volume series aimed at fellow stamp collectors as “the first attempt to publish a complete listing of all the United States post offices which have ever operated in the nation.” I discovered Helbock’s work in 2013, two years after he passed away. Thankfully, Catherine Clark was still selling her late husband’s work online and I was able to purchase a CD-ROM of the data.

And the outcome is comprehensive and compelling:

US Post Offices is a spatial-historical dataset containing records for 166,140 post offices that operated in the United States between 1639 and 2000. The dataset provides a year-by-year snapshot of the national postal system over multiple centuries, making it one of the most fine-grained and expansive datasets currently available for studying the historical geography of the United States.

Blevins’ data highlights, perhaps better than any other evidence, how the westward expansion of the United States was strongly tied to state power rather than individual or local activity by European settlers, as it was the Post Office infrastructure (linked, of course, to the military and other levers of the state) that enabled the kind of communication network and support lines that eventually led to the seizing of native lands. (Just look at those tendrils shooting west from the Mississippi.)

Blevins created a great companion website for the book with Yan Wu and Steven Braun, who was our data visualization specialist at the Northeastern University Library.

There are many things you can do with this data, all of which is now downloadable.

Related: Justin Gage’s Native American Networks:

A video recording of the panel I mentioned in HI36, on a new platform for digitizing archives and what it might mean for researchers, libraries, and archivists, present and future, is now available. I enjoyed participating in this lively discussion.

Finally, something for newsletter readers who are in the middle of the Venn diagram of cats and information design: Ziyi Zhao, “Understanding Cat Behavior: Using Notational Systems to Represent the Relationship of Cats’ Postures and Facial Expression.”

(via my library colleague Sarah Sweeney)

April 7, 2021