Humane Ingenuity 8: Ebooks: It’s Complicated

René Descartes designed a deck of playing cards that also functioned as flash cards to learn geometry and mechanics. (King of Clubs from The use of the geometrical playing-cards, as also A discourse of the mechanick powers. By Monsi. Des-Cartes. Translated from his own manuscript copy. Printed and sold by J. Moxon at the Atlas in Warwick Lane, London. Via the Beinecke Library, from which you can download the entire deck.)


In this issue, I want to open a conversation about a technology of our age that hasn’t quite worked out the way we all had hoped—and by we, I mean those of us who care about the composition and transmission of ideas, which I believe includes everyone on this list. 

Twenty years ago, literary critic Sven Birkerts reviewed the new technology of ebooks and e-readers for the short-lived internet magazine Feed. They sent him a Rocket eBook and a SoftBook, and he duly turned them on and settled into his comfy chair. What followed, however, was anything but comfy:

If there is a hell on earth for people with certain technological antipathies, then I was roasting there last Saturday afternoon when I found myself trapped in some demonic Ourobouros circuit system wherein the snake was not only devouring its own tail, but was also sucking from me the faith that anything would ever make sense again.

Reader, it was not a positive review. But surely, the two decades that separate the present day from Birkerts’ 1999 e-reader fiasco have provided us with vastly improved technology and a much healthier ecosystem for digital books?

Alas, we all know the answer to that. Nearly a decade after Birkerts’ Feed review of those early e-readers, Amazon released the Kindle, which was more polished than the Rocket eBook and SoftBook (both of which met the ignominious end of being purchased in a fire sale by the parent company of TV Guide), and a dozen years later, we are where we are: the Kindle’s hardware and software are serviceable but not delightful, and the ebook market is a mess that is dominated by that very same company. As Dan Frommer put it in “How Amazon blew it with the Kindle”:

It’s not that the Kindle is bad — it’s not bad, it’s fine. And it’s not that on paper, it’s a failure or flop — Amazon thoroughly dominates the ebook and reader markets, however niche they have become… It’s that the Kindle isn’t nearly the product or platform it could have been, and hasn’t profoundly furthered the concept of reading or books. It’s boring and has no soul. And readers — and books — deserve better.

Amen. Contrast this with other technologies in wide use today, from the laptop to the smartphone, where there were early, clear visions of what they might be and how they might function, ideals toward which companies like Apple kept refining their products. Think about the Dynabook aspirational concept from 1972 or the Knowledge Navigator from 1987. Meanwhile, ebooks and e-readers have more or less ignored potentially helpful book futurism.

There have been countless good examples, exciting visions, of what might have been. For instance, in 2007 the French publisher Editis created a video showing a rather nice end-to-end system: a reader goes into a local bookstore, gets advice on what to read from the proprietor, pulls a print book off the shelf, and holds a reading device above it (a device that looks a lot like Microsoft’s forthcoming book-like, foldable, dual-screen Neo), and a nice digital version of the book, in color, transfers onto his e-reader.

Ebooks and print books, living in perfect harmony, while maintaining a diversified and easy-to-understand ecosystem of culture. Instead what we have is a discordant hodgepodge of various technologies, business models, and confused readers who often can’t get that end-to-end system to work for them.

I was in a presentation by someone from Apple in 2011 that forecast the exact moment when ebook sales would surpass print book sales (I believe it was sometime in 2016 according to his Keynote slides); that, of course, never happened. Ebooks ended up plateauing at about a third of the market. (As I have written elsewhere, ebooks have made more serious inroads in academic, rather than public, libraries.)

It is worth asking why ebooks and e-readers like the Kindle treaded water after swimming a couple of laps. I’m not sure I can fully diagnose what happened (I would love to hear your thoughts), but I think there are many elements, all of which interact as part of the book production and consumption ecosystem. Certainly a good portion of the explanation has to do with the still-delightful artifact of the print book and our physical interactions with it. As Birkerts identified twenty years ago:

With the e-books, focus is removed to the section isolated on the screen and perhaps to the few residues remaining from the pages immediately preceding. The Alzheimer’s effect, one might call it. Or more benignly, the cannabis effect. Which is why Alice in Wonderland, that ur-text of the mind-expanded ’60s, makes such a perfect demo-model. For Alice too, proceeds by erasing the past at every moment, subsuming it entirely in every new adventure that develops. It has the logic of a dream – it is a dream – and so does this peculiarly linear reading mode, more than one would wish.

No context, then, and no sense of depth. I suddenly understood how important – psychologically – is our feeling of entering and working our way through a book. Reading as journey, reading as palpable accomplishment – let’s not underestimate these. The sensation of depth is secured, in some part at least, by the turning of real pages: the motion, slight though it is, helps to create immersion in a way that thumb clicks never can. When my wife tells me, “I’m in the middle of the new Barbara Kingsolver,” she means it literally as well as figuratively.

But I think there are other reasons that the technology of the ebook never lived up to grander expectations, reasons that have less to do with the reader’s interaction with the ebook—let’s call that the demand side—and more to do with the supply side, the way that ebooks are provided to readers through markets and platforms. This is often the opposite of delightful.

That rough underbelly has been exposed recently with the brouhaha about Macmillan preventing libraries from purchasing new ebooks for the first eight weeks after their release, with the exception of a single copy for each library system. (I suspect that this single copy, which perhaps seemed to Macmillan as a considerate bone-throw to the libraries, actually made it feel worse to librarians, for reasons I will leave to behavioral economists and the poor schmuck at the New York Public Library who has to explain to a patron why NYPL apparently bought a single copy of a popular new ebook in a city of millions.)

From the supply side, and especially from the perspective of libraries, the ebook marketplace is simply ridiculous. As briefly as I can put it, here’s how my library goes about purchasing an ebook:

First, we need to look at the various ebook vendors to see who can provide access to the desired book. Then we need to weigh access and rights models, which vary wildly, as well as price, which can also vary wildly, all while thinking about our long-term budget/access/preservation matrix.

But wait, there’s more. Much more. We generally encounter four different acquisition models (my thanks to Janet Morrow of our library for this outline): 1) outright purchase, just like a print book, easy peasy, generally costs a lot even though it’s just bits (we pay an average of over $40 per book this way), which gives us perpetual access with the least digital rights management (DRM) on the ebooks, which has an impact on sustainable access over time; 2) subscription access: you need to keep paying each year to get access, and the provider can pull titles on you at any time, plus you also get lots of DRM, but there’s a low cost per title (~$1 a book per year); 3) demand-driven/patron-driven acquisition: you don’t get the actual ebook, just a bibliographic record for your library’s online system, until someone chooses to download a book, or reads some chunk of it online, which then costs you, say ~$5; 4) evidence-based acquisitions, in which we pay a set cost for unlimited access to a set of titles for a year and then at the end of the year we can use our deposit to buy some of the titles (<$1/book/year for the set, and then ~$60/book for those we purchase).

As I hope you can tell, this way lies madness. Just from a library workflow and budgetary perspective, this is insanely difficult to get right, or even to make a decision about in the first place, never mind the different ebook interfaces, download mechanisms, storage, DRM locks, and other elements that the library patron interacts with once the library has made the purchase/rental/subscription. Because of the devilish complexity of the supply side of the ebook market, we recently participated in a pilot with a software company to develop a NASA-grade dashboard for making ebook decisions. Should I, as the university librarian, spend $1, $5, or $40? Should I buy a small number of favored books outright, or subscribe to a much wider range of books but not really own them, or or or…
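To make that budget arithmetic concrete, here is a minimal sketch, in Python, comparing the four models above under invented usage assumptions (collection size, time horizon, a 10% annual trigger rate for demand-driven access, a 2% annual purchase rate for evidence-based deals); the prices are the rough figures quoted above, not anyone’s actual vendor terms.

```python
# Toy comparison of the four ebook acquisition models described above.
# All prices and usage rates are hypothetical, loosely taken from the text.

def outright_purchase(titles, years=1, price=40.0):
    """Perpetual access, minimal DRM, high upfront cost."""
    return titles * price

def subscription(titles, years, price_per_title_year=1.0):
    """Low annual cost, but access ends when the payments stop."""
    return titles * price_per_title_year * years

def demand_driven(titles, years, trigger_rate=0.10, cost_per_trigger=5.0):
    """Pay only when a patron downloads or reads enough of a title."""
    return titles * trigger_rate * years * cost_per_trigger

def evidence_based(titles, years, set_cost_per_title_year=1.0,
                   purchase_rate=0.02, purchase_price=60.0):
    """A deposit for a year of unlimited access, then buy a few titles outright."""
    access = titles * set_cost_per_title_year * years
    purchases = titles * purchase_rate * years * purchase_price
    return access + purchases

titles, years = 10_000, 5
models = {
    "outright purchase": outright_purchase(titles),
    "subscription": subscription(titles, years),
    "demand-driven": demand_driven(titles, years),
    "evidence-based": evidence_based(titles, years),
}
for name, cost in sorted(models.items(), key=lambda kv: kv[1]):
    print(f"{name:>18}: ${cost:>12,.0f} over {years} years for {titles:,} titles")
```

Even this cartoon version makes the trade-off visible: the models that look cheap stay cheap only if usage stays low and you are willing never to own the books.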

Thankfully I have a crack staff, including Janet, who handles this complexity, but I ask you to think about this: how do the maddeningly discordant ebook market and its business models skew our collection—what we choose to acquire and provide access to? And forget Macmillan’s eight weeks: what does this byzantine system mean for the availability, discoverability, and preservation of ebooks in 10, 50, or 100 years?

We should be able to do better. That’s why it’s good to see the Digital Public Library of America (where I used to be executive director) establishing a nonprofit, library-run ebook marketplace, with more consistent supply-side terms and technical mechanisms. Other solutions need to be proposed and tried. Please. We can’t have a decent future for ebooks unless we start imagining alternatives.



Onto a sunnier library topic: We are beginning a renovation of our main library at Northeastern University, Snell Library, and have been talking with architects (some of them very well-known), and I’ve found the discussions utterly invigorating. I would like to find some way to blog or newsletter about the process we will go through over the next few years, and to think aloud about the (re)design and (future) function of the library. I’m not sure if that should occur in this space or elsewhere, although the thought of launching another outlet fills me with dread. Let me know if this topic would interest you, and if I should include it here.



On the latest What’s New podcast from the Northeastern University Library, my guest is Louise Skinnari, one of the physicists who works on the Large Hadron Collider. She takes us inside CERN and the LHC, explains the elementary particles and what they have found by smashing them at the speed of light, and yours truly says “wow” a lot. Because there’s so much wow there: the search for the Higgs boson, the fact that the LHC produces 40 terabytes of data per second, the elusiveness of dark matter. Louise is brilliant and is helping to upgrade the LHC with a new sensor that sounds like it’s straight out of Star Trek: the Compact Muon Solenoid. She is also investigating the nature of the top quark, which has some unusual and magical properties. Tune in and subscribe to the podcast!

Humane Ingenuity 7: Getting Weird with Technology to Find Our Humanity

One of the best ways that we can react to new technology, to sense its contours and capabilities, and also, perhaps slyly, to assert our superiority over it, is to get weird with it. There is a lot of heavy thinking right now about the distinctions between artificial intelligence and human intelligence, but sometimes we just need to lighten up, to remember that human beings are oddballs and so we can’t help but use technology in ways that cannot be anticipated. And this weirdness, this propensity to play with technology, and to twist it to human, and humane, ends, should be taken seriously.

In 1993, the artist Spencer Finch, fresh out of RISD, started playing around with a Mac, a VCR, and a Radio Shack haul of other technology, including a directional radio wave transmitter, and he came up with “Blue (One Second Brainwave Transmitted to the Star Rigel).” When I first saw it in 2007 at Mass MoCA, it made me smile.

Finch’s gloriously weird conceit in “Blue” is that he would sit in a comfy chair watching a continuous loop of the ocean wave in the opening credits of the TV show “Hawaii Five-0,” and his brain waves from that stimulus would be picked up by a headset, which would be processed by the Mac, amplified, and sent out the window to Rigel, the bluest star in the night’s sky. In addition to the whimsical and smart conceit, “Blue” also had the best deadpan wall text I’ve ever seen in a museum: “Finch’s wave is expected to arrive at its destination in the year 2956.”


In a recent post on “Gonzo Data Science,” Andrew Piper of the .txtLAB at McGill University prompts us to do just this: get weirder with emerging technologies:

I wish data science, and its outposts in the humanities, would get more experimental. By this I mean more scientifically rigorous (let’s test some hypotheses!), but also weirder, as in the Jimi Hendrix kind of experimentation…There’s just not enough creativity behind the endeavour. I don’t mean the “I discovered a new algorithm” kind of creativity. I mean the “I created a new imaginary world that shows us something important about who we are” kind.

Andrew notes that this doesn’t need to be Hunter S. Thompson-level gonzo, but definitely more playful and wide-ranging than current practice, to explore boundaries and possibilities. Let’s get in that lab and start mixing some things up.


Over the last year, Lori Emerson of the Media Archaeology Lab at the University of Colorado Boulder has published two very good articles on the history and use of the Lab, which houses dozens of old computers and media devices, and allows for such playful experimentation. As she recounts in “Excavating, Archiving, Making Media Inscriptions // In and Beyond the Media Archaeology Lab,” Lori began her academic career studying poetry, but then moved into “trying to understand the inner workings of obsolete computers.” In this article, she successfully unpacks why that is less of a jump than one might expect, and what she has learned about how human expression is connected to platforms for writing and reading—and how much can be gained by toying with these platforms.

She also includes some great examples of humans creatively weirding out on digital media, going back decades. This example from the new media artist Norman White, working on a pre-web network called the Electronic Art Exchange Program, in 1985, particularly caught my eye:

White’s “Hearsay,” on the other hand, was an event based on the children’s game of “telephone” whereby a message is whispered from person to person and arrives back at its origin, usually hilariously garbled. [Poet Robert] Zend’s text was sent around the world in 24 hours, roughly following the sun, via I.P. Sharpe Associates network. Each of the eight participating centers was charged with translating the message into a different language before sending it on. The final version, translated into English, arrived in Toronto as a fascinating example of a literary experiment with semantic and media noise:

THE DANCERS HAVE BEEN ORDERED TO DANCE, AND BURNING TORCHES WERE PLACED ON THE WALLS.

THE NOISY PARTY BECAME QUIET.

A ROASTING PIG TURNED OVER ON AN OPEN FLAME…

(Note the similarity of this final output with some recent AI-generated fiction using GPT-2; I’ll return to that in a future HI.)

In “Media Archaeology Lab as Platform for Undoing and Reimagining Media History,” Lori provides a longer history of, and justification for, the Media Archaeology Lab. 

While I am attempting to illustrate the remarkable scope of the MAL’s collection, I am also trying to show how anomalies in the collection quietly show how media history, especially the history of computing, is anything but a neat progression of devices simply improving upon and building upon what came before; instead, we can understand the waxing and waning of devices more in terms of a phylogenetic tree whereby devices change over time, split into separate branches, hybridize, or are terminated. Importantly, none of these actions (altering, splitting, hybridizing, or terminating) implies a process of technological improvement and thus, rather than stand as a paean to a notion of linear technological history and progress, the MAL acts as a platform for undoing and then reimagining what media history is or could be by way of these anomalies.

Yes! Lori then goes deep on the Canon Cat, a weird and wonderful computer from Jef Raskin, whom I covered in HI2.


Mark Sample was the playful Spencer Finch of Twitter bots, back when Twitter was fun. He made 50 bots of all shapes and sizes, but now has soured on the whole enterprise and has written a postmortem, “Things Are Broken More Than Once and Won’t Be Fixed,” about the demise of one of his more creative bots, @shark_girls. The female sharks in question are two great white sharks, Mary Lee and Katharine, actual sharks with GPS devices on them, paired with non-actual shark musings to accompany their ocean wanderings. (My assumption is that “Mary Lee and Katharine” are also not their actual shark names.)

I thought, wouldn’t it be cool to give these sharks personalities and generate creative tweets that seemed to come directly from the sharks. So that’s what I did. I narrativized the raw data from these two great white sharks, Mary Lee and Katharine. Mary Lee tweets poetry, and shows where she was in the ocean when she “wrote” it. Katharine tweets prose, as if from a travel journal, and likewise includes a time, date, and location stamp.

That is, until Twitter and Google put the kibosh on @shark_girls.


From new technology to that endlessly fascinating, multifaceted older technology of the book: Sarah Werner has started a great new newsletter that I think many of you would like: Early Printed Fun. Fun indeed—and also thoughtful about the medium of the codex and its incredible variety.


The Harvard Library has launched a new portal for their digitized collections, including a special section of thousands of daguerreotypes. I don’t know about you, but aside from the gilded frames, many mid-nineteenth-century daguerreotypes look more contemporary to me than mid-twentieth-century photographs. There’s a surprising intensity and depth to them.

Catharine Norton Sinclair

Edwin Adams


This week’s What’s New podcast from the Northeastern University Library covers the use of sensing devices to aid in behavioral therapy. I talk with researcher Matthew Goodwin of the Computational Behavioral Science Laboratory about his studies of children with severe autism. Matthew is trying to create unobtrusive sensors, backed by machine learning analytics, to provide advance warning to caregivers of these children when they are heading into difficult emotional periods, such as harmful behavior toward themselves or others, so the caregivers can guide the children into safer physical and mental spaces.

This is a complex topic, and Matthew’s lab is trying to be sensitive not to overdo it with technological solutions or invasions of privacy. But as he highlights, there are no effective treatments for severe autism, and it is enormously stressful for parents and guardians to monitor children for dangerous episodes, so caregivers are hugely appreciative of this kind of behavioral advance warning system. (Listen|Subscribe)


Last week in HI6 I discussed the Digital Library Federation’s annual forum, but sent out the newsletter before the panel that I was on. That panel, “The Story Disrupted: Memory Institutions and Born-Digital Collecting,” was based on a great new article by Carol Mandel that covers the long history of collecting and preserving “talking things”—artifacts that capture and represent human history and culture—and how that important process has been completely upended by digital media.

I commented on this disruption from my multiple perspectives as a historian, librarian, and college administrator, and on how each of those roles entails a different approach to born-digital collecting: determining what kinds of artifacts we should collect (historian), how we should collect those “talking things” (librarian/archivist), and how to manage, staff, and pay for this work (administrator).

My fellow panelists included Chela Weber, Trevor Owens, and Siobhan Hagan, who all had helpful insights. Siobhan talked about the movement in public libraries (including hers in DC) to have Memory Labs, in which the public can digitize and preserve their own artifacts.


(Spencer Finch, “Trying To Remember the Color of the Sky on That September Morning,” 2,983 squares of blue to memorialize those who died on 9/11 at the National September 11 Memorial Museum in NYC. Photo by Augie Ray, CC-BY-NC-2.0.)

Humane Ingenuity 6: Walden Eddies + the DLF Cinematic Universe

I am very fortunate to live a short drive from Walden Pond, of Henry David Thoreau fame. With the hordes of summer tourists finally thinning out, and with the leaves changing with the arrival of fall, it’s a good time to stroll around the pond, which we did last weekend.

Those who haven’t taken the walking path around Walden Pond before are generally surprised by several things: 1) it’s rather small; 2) train tracks run right next to the walking path on one side of the pond; and especially 3) Thoreau’s cabin is not that far off the road, and within trivial walking distance of the center of Concord. If Thoreau were alive today, he could, on a whim, go grab some nice warm coffee and a book at a really good book store, and be back in the woods in time to light a fire for dinner.

Those amenities, of course, did not exist in the middle of the nineteenth century when Thoreau took his leave from society, but still, he was only a short stroll from other houses in the area and a mere mile and a half from his family home. New visitors to Walden realize that his off-the-grid life was a little more like grid-adjacent.

It struck me on this recent visit, however, that Thoreau’s perhaps not-so-radical move presents something of a model for us as we struggle with our current media environment. Maybe moving just a bit off to the side, removed but not totally ascetic, is a helpful way to approach our troubled relationship with digital media and technology.

Indeed, the “Republic of Newsletters” is just a bit off to the side, existing in niche digital eddies rather than vast digital rivers, using the old-fashioned wonder of email and even the web, but not being so world wide. Sometimes to find your humanity you must step outside of the mass of society and its current, unchallenged habits. But don’t go too far. You should still be able to get a warm cup of coffee and a good book.


The Federation

I’m writing to you from Tampa, where the heat and humidity makes me want to dive into the bracing chill of Walden Pond. The Digital Library Federation is having its annual forum here, and while DLF sounds like it could be a cool Star Trek thing, in actuality its spirit is closer to Thoreau than one might imagine.

The hundreds of practitioners who attend DLF every year—librarians, archivists, museum professionals, software developers, and researchers—have increasingly taken on the responsibility of thinking about how to be deliberate and considerate with our use of digital media and technology—the practice of humane ingenuity. The kinds of questions that are asked here are ones we could easily tailor to other areas of our lives: What are the kinds of human expression we should highlight and preserve, and how can we ensure diverse voices in that record? How can we present images in ways that are sensitive to how different kinds of viewers might see them and use them? How can digital tools help rather than hinder our explorations of our shared culture?

The Federation is now 25 years old. Twenty-five years ago there were virtually no digital libraries; now there are countless ones, and some, like the Digital Public Library of America, have tens of millions of items from thousands of cultural heritage organizations. The next 25 years seem likely to be less about rapid build-out and more about the hard work of conscientious maintenance and correcting the problems now clearly inherent in poorly designed digital platforms. There are exciting methods emerging to take advantage of new computational techniques, but DLFers are dedicated to implementing them in ways that prevent social problems from arising in the first place.

It’s great to see some HI readers and old friends here at DLF. Josh Hadro of the IIIF Consortium (part of the DLF Cinematic Universe), helpfully provided some additional examples of the use of AI/ML on digital collections. The Center for Open Data in the Humanities in Japan is using machine learning to extract facial expressions and types of characters in Japanese manuscripts

Insects

Samurai

Josh and I also wistfully remembered the great potential of the NYC Space/Time Directory at NYPL, pieces of which could perhaps be revived and implemented in other contexts…

Amy Rudersdorf of AVP and Juliet Hardesty of Indiana University presented some exciting work on MGMs—metadata generation mechanisms (also part of the DLF Cinematic Universe). MGMs can include machine learning services as well as human expertise, and critically, they can be strung together in a flexible way so that you can achieve the best accuracy from the right combination of tools and human assessment. Rather than using one mechanism, or choosing between computational and human methods, MGMs such as natural language processing, facial recognition, automated transcription, OCR, and human inputs can all be employed in a single connected thread. The project Amy and Juliet outlined, the Audiovisual Metadata Platform (AMP), seems like a thoughtful and promising implementation of AI/ML to make difficult-to-index forms of human expression—such as a concert or a street protest—more widely discoverable. I will be following this project closely.
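To show what “strung together in a flexible way” might mean in practice, here is a minimal sketch of the chaining idea; the step names and data shapes are my own invention, not AMP’s actual interfaces. Each MGM, whether a machine service or a human reviewer, is just a step that takes the item plus the metadata accumulated so far and returns enriched metadata, so steps can be composed in any order.

```python
# Sketch of chaining metadata generation mechanisms (MGMs).
# Step names and data shapes are illustrative, not AMP's real interfaces.
from typing import Callable, Dict, List

Metadata = Dict[str, object]
MGM = Callable[[str, Metadata], Metadata]

def automated_transcription(item: str, meta: Metadata) -> Metadata:
    # Placeholder: call a speech-to-text service here.
    return {**meta, "transcript": f"(transcript of {item})"}

def named_entities(item: str, meta: Metadata) -> Metadata:
    # Placeholder: run an NLP entity extractor over the upstream transcript.
    transcript = str(meta.get("transcript", ""))
    return {**meta, "entities": [w for w in transcript.split() if w.istitle()]}

def human_review(item: str, meta: Metadata) -> Metadata:
    # A person confirms or corrects the machine output before anything publishes.
    return {**meta, "reviewed_by_human": True}

def run_pipeline(item: str, steps: List[MGM]) -> Metadata:
    meta: Metadata = {}
    for step in steps:
        meta = step(item, meta)
    return meta

print(run_pipeline("street_protest_recording.wav",
                   [automated_transcription, named_entities, human_review]))
```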

Finally, Sandy Hervieux and Amanda Wheatley are editing a new volume on artificial intelligence in the library: The Rise of AI: Implications and Applications of Artificial Intelligence in Academic Libraries. They are looking for authors to contribute chapters. Maybe that’s you?

Humane Ingenuity 5: Libraries Contain Multitudes

More on the Use of AI/ML in Cultural Heritage Institutions

The piece by Clifford Lynch that I mentioned previously in HI3: AI in the Archives has now been published: “Machine Learning, Archives and Special Collections: A high level view.” Excerpt:

Some applications where machine learning has led to breakthroughs that are highly relevant to memory organizations include translation from one language to another; transcription from printed or handwritten text to computer representation (sometimes called optical character recognition); conversion of spoken words to text; classification of images by their content (for example, finding images containing dogs, or enumerating all the objects that the software can recognize within an image); and, as a specific and important special case of image identification, human facial recognition. Advances in all of these areas are being driven and guided by the government or commercial sectors, which are infinitely better funded than cultural memory; for example, many nation-states and major corporations are intensively interested in facial recognition. The key strategy for the cultural memory sector will be to exploit these advantages, adapting and tuning the technologies around the margins for its own needs.

That last bit feels a bit more haunted this week with what’s going on in Hong Kong. Do read Clifford’s piece, and think about how we can return basic research in areas like facial recognition to entities that have missions divergent from those of governments and large companies.

Northern Illinois University has an incredible collection of 55,000 dime novels, that cheap and popular form of fiction that flourished in the United States in the late nineteenth century. As disposable forms of literature, many dime novels didn’t survive, and those that did are poorly catalogued, since proper subject cataloguing would require a librarian to read each novel from cover to cover to grasp its full content and subject matter.

NIU is exploring using text mining and machine learning to generate good-enough subject headings and search tools for their collection, and an article from earlier this year outlines the process. (Alas, the article is gated; for those outside of academia, you can try Unpaywall to locate an open access version.) Matthew Short’s “Text Mining and Subject Analysis for Fiction; or, Using Machine Learning and Information Extraction to Assign Subject Headings to Dime Novels” is written in a laudably plainspoken way, and reaches some conclusions about a middle way between automated processes and human expertise:

The middle ground between fully-automated keyword extraction and full-level subject analysis might simply be to supply catalogers with a list of keywords to aid them in their work. From such a list, catalogers may be able to infer what those words suggest about the novel.

Yes! There’s a lot of work to be done on this kind of machine learning + human expert collaboration. Matthew has some good examples of how to aggregate unusual keywords into different top-level dime-novel genres, like seafaring, Westerns, and romance.

Next week I’ll be at the Digital Library Federation’s annual forum and will try to newsletter from there. There’s a session on “Implementing Machine-Aided Indexing in a Large Digital Library System” that should provide further grist for this mill.


The Cleveland Museum of Art recently launched a new site for digitized works of art from their collection, with some of them in 3D, including this wonderful 500-year-old piggy bank from Java:


Libraries Contain Multitudes

Several HI subscribers pointed me to Alia Wong’s piece in The Atlantic “College Students Just Want Normal Libraries,” on how students want “normal” things like printed books rather than new tech (or “glitz,” in Alia’s more loaded term) in college libraries, and how it seems to contradict my piece earlier this year in The Atlantic, “The Books of College Libraries Are Turning Into Wallpaper.”

As those same correspondents also discerned, some of the apparent contradiction seems to be due to the disconnect between student self-reporting and actual library indicators; much of the evidence Alia points to comes from surveys of what students say they want, while I tried to highlight the unsettling hard data that book circulations in research libraries are declining precipitously and ceaselessly, with students and faculty checking out far fewer books than they used to. (There may not even be that much of a disconnect on books, as you can see in one of the surveys Alia highlights from 2015.)

Anyway, Alia’s piece is worth the read and I do not include it here for extended criticism. She makes many good points, and the allocation of space within libraries is a complicated issue that all librarians have been wrestling with, as I tried to note in my own piece. Alia and I actually agree on much, including, as Alia writes, the significant need for “a quiet place to study or collaborate on a group project” and that “many students say they like relying on librarians to help them track down hard-to-find texts or navigate scholarly journal databases.” Yes!

Where I do want to lodge an objection, however, is with the notion that I’ve been pushing back against in this newsletter: the too frequent, and easy to fall into, trope of a binary opposition between traditional forms of knowledge and contemporary technology. Or as Roy Rosenzweig and I put it in Digital History, the stark polarization of technoskepticism versus cyberenthusiasm is extremely unhelpful, and we should instead seek a middle way in which we maximize the advantages of technology and minimize its drawbacks. This requires a commingling of old and new that is less about glitz and more about how the old and new can best contribute, together, to our human understanding and expression.

Because so many of us care so much about the library as an institution, it has become an especially convenient space to project, in a binary way, the “normal” or “traditional” versus the “futuristic.” Most universities aren’t building glitzy new libraries, but are instead trying as best they can to allocate limited space for multiple purposes. The solutions to those complex equations will vary by community, and even in self-reported student surveys of what students want out of their library (and our library surveys thousands of students every two years to assess the needs and desires Alia covers), there’s a wide diversity of opinion.

Let’s not fall into the trap of thinking that all students want roughly the same thing, or define “normal” for all libraries; some students want and in fact need tech, while others want quiet space for reading, and many of them move from quiet spaces to tech spaces during the course of a single day. Our library has a room for 3D printers and an AR/VR lab; combined, that “glitz” takes up about 1000 square feet in a library that has well over 100,000 square feet of study space.

The library can and should accommodate multiple forms of knowledge-seeking—and better yet, and most critically for the continued vibrancy of the institution, forge connections between the old and new.

(More on this theme: Last week I was on The Agenda on TV Ontario to talk about my piece in The Atlantic and to discuss those complicated questions about the state of reading and the use of books. Christine McWebb and Randy Boyagoda joined me on the program and had many good comments about how and when to encourage students to engage with books. Watch: “The Lost Art of Reading.”)


On this week’s What’s New podcast from the Northeastern University Library, my guest is Nada Sanders, Distinguished Professor of Supply Chain Management at the D’Amore McKim School of Business, and author of the recently published book The Humachine: Humankind, Machines, and the Future of Enterprise. The conversation covers the impact of automation and AI on factories and businesses, and how greater efficiency from those increasingly computer-driven enterprises is causing huge problems for workers and small businesses. Tune in.

Humane Ingenuity 4: Modeling Humane Features + Teaching a Robot to Crack a Whip

[3D-printed zoetrope from The National Science and Media Museum’s Wonderlab, via Sheryl Jenkins]

Modeling Humane Features

If it wasn’t a famous catchphrase coined by a young social media billionaire, and etched onto the walls of Facebook’s headquarters, the most likely place for “Move fast and break things” to appear would be on a mall-store t-shirt for faux skate punks. But even though Facebook has purportedly moved on from Mark Zuckerberg’s easily mocked motto, it is an ideology that remains widely in use, and that also engenders easy counter-takes, with some variation on “Move slowly and fix things.”

In a newsletter that seeks the contours of humane technology, moving slowly and fixing things seems like an obvious starting point, but it’s not enough and has the potential to be its own vapid motto unless we do a better job spelling out what it actually means. We need some examples, some modeling of how one goes about creating digital media that is Not Facebook. And for that we should revisit my emphasis in HI2 on considering the psychology of humans before creating digital technology, and then again, constantly, during its implementation and growth.

In the signature block of this newsletter, you may have noticed that my preferred social media platform is Micro.blog, which I decamped to from Twitter last year in an attempt to cleanse my mind and streamline my online presence. (I am, however, still unclean; I have chosen to syndicate my Micro.blog posts to Twitter, since I still have many friends over there who wish to hear from me.) A big part of what I like about Micro.blog is that moving deliberately and considering things is deeply valued by Manton Reece, the architect of the platform, as well as his colleagues Jean MacDonald and Jonathan Hays.

Manton is admirably transparent and thoughtful about how he is designing and iterating on the platform (so much so that he’s writing a book about it, which surely makes him the most academic of app-makers). Even before Micro.blog’s launch, Manton thought through, and just as important, stuck to, key features—or non-features—such as no follower or like counts, anywhere, ever. (On Micro.blog you can see who people follow, to find others to follow yourself, but not how many people follow you or anyone else.) As Micro.blog has developed, Manton has also consulted with the burgeoning community about new features, to ensure that what he’s doing won’t disrupt the considerate, but still engaging, vibe of the place.

I have much more to say about Micro.blog, including some thoughts about its size, sustainability plan, and approach to hosting and personal data, but for this issue of HI I want to focus on a good case study of ethical technological implementation—a happy accident that turned into a valued (non-)feature. Last year an unintended update to the Micro.blog code suddenly made it impossible for users to see if someone was following them by clicking on their follows list. This seemed odd and in need of a fix, until everyone swiftly realized what a relaxing plus this is in today’s social media. The long thread of this revelation reads like a morality play; here’s a brief excerpt:

@jack: Either no one follows me here or the following list doesn’t include the user looking at the list. This has the effect of hiding whether or not someone follows me. If that’s the case, it’s genius.

@macgenie: @jack I had not realized that we don’t show whether someone is following you when you look at their Following list! That makes sense. / @manton

@jack: @macgenie I think it’s terrific. Avoids the awkwardness around any implied obligation for mutual follows :).

@manton: @macgenie @jack So… this is actually a bug that I introduced last night. But now I’m wondering if it’s a good thing! It’s confusing right now but maybe an opportunity to rethink this feature.

@jamesgowans: @manton I like it. It relieves the worry about whether someone follows you or doesn’t (a useless popularity metric), and places more value in replies/conversations. It’s sort of “all-in” on the no follower count philosophy.

@ayjay: @manton As the theologians say, O felix insectum!

Two weeks ago Rob Walker had a piece revisiting “Move fast and break things” and urging Silicon Valley to have an overriding focus on designing apps with bad actors in mind, for the worst of humanity who might seek to exploit an app for their own gain and to disrupt society. This should undoubtedly be an important aspect of digital design, especially so given what has happened online in the last few years.

But while architects should surely design an office building to thwart arsonists, they need to spend even more time designing spaces for those who are trying to work peacefully and productively within the building. Ultimately, we have to design platforms to withstand not only the worst of us, but the worst aspects of the rest of us. In thinking about seemingly small elements like whether you can see if someone is following you, Micro.blog’s Manton, Jean, and Jonathan get this critical point.


Teaching a Robot to Crack a Whip

In Robin Sloan’s novel Sourdough, the protagonist tries to teach a robot arm how to delicately and effectively crack an egg for cooking. This week on the What’s New podcast from the Northeastern University Library, I spoke to Dagmar Sternad from the Action Lab, who is, among other things, teaching a robot how to crack a whip. (She said that’s a very hard problem, especially to hit a particular point in space with the tip of the whip, but that egg cracking is a very, very hard problem.)

The Action Lab studies the complete range and technique of human motion very closely using the same technology Hollywood uses to create CGI characters—my favorite Action Lab case study looks at how ballet dancers in a duet transmit information through their fingertips to their partners—and then they encode that knowledge into digital and then robotic surrogates through biomimetics.

[Dagmar Sternad with two ballet dancers and a robot arm]

What I also found interesting about Dagmar’s work, and very germane to what I’m trying to do in this newsletter, is the bidirectionality of learning between humans and machines. You will be glad to hear that the Action Lab is not seeking to create whip-cracking robots of doom. But by observing humans doing intricate tasks, and then replicating those actions in the computer and with machines, they can better understand what is going on—and can even alter and simplify the motions so that they remain effective but require less expertise and motion. In turn, they can transfer these lessons back to physical human behaviors. (This is a variation on Sloan’s “flip-flop”: “the process of pushing a work of art or craft from the physical world to the digital world and back.”) In short, they study ballet dancers and engineer robots so that they can find ways to help the elderly walk better and avoid falls.

This was a fun conversation—give it a listen or subscribe to the podcast through the What’s New site.


Follow-up on AI/ML in Libraries, Archives, and Museums

My thanks to HI readers who sent me recent discussions about the use of artificial intelligence/machine learning in libraries, archives, and museums, in response to HI3:

  • Museums + AI, New York workshop notes, from Mia Ridge.
  • The latest volume of Research Library Issues from the Association of Research Libraries is on the “Ethics of Artificial Intelligence”—good timing!

“After decades of worries that the popularity of science and technology paradigms threaten humanistic learning and scholarship, it is now becoming evident that unique opportunities are emerging to demonstrate why humanistic expertise and informed considerations of the human condition are essential to the very future of humanity in a technological age.” —Sylvester Johnson

Humane Ingenuity 3: AI in the Archives

Welcome back to Humane Ingenuity. It’s been gratifying to see this newsletter quickly pick up an audience, and to get some initial feedback from readers like you that can help shape forthcoming issues. Just hit reply to this message if you want to respond or throw a link or article or book my way. (Today’s issue, in fact, includes a nod to a forthcoming essay from one of HI’s early subscribers.)

OK, onto today’s theme: what can artificial intelligence do in a helpful way in archives and special collections? And what does this case study tell us more generally about an ethical and culturally useful interaction between AI and human beings?


Crowdsourcing to Cloudsourcing?

Over a decade ago, during the heady days of “Web 2.0,” with its emphasis on a more interactive, dynamic web through user sharing and tags, a cohort of American museums developed a platform for the general public to describe art collections using more common, vernacular words than the terms used by art historians and curators. The hope was that this folksonomy, that riffy portmanteau on the official taxonomy of the museum, would provide new pathways into art collections and better search tools for those who were not museum professionals.

The project created a platform, Steve, that had some intriguing results when museums like the Indianapolis Museum of Art and the Metropolitan Museum of Art pushed web surfers to add their own tags to thousands of artworks.

Some paintings received dozens of descriptive tags, with many of them straying far from the rigorous controlled vocabularies and formal metadata schema we normally see in the library and museum world. (If this is your first time hearing about the Steve.museum initiative, I recommend starting with Jennifer Trant’s 2006 concept paper in New Review of Hypermedia and Multimedia, “Exploring the Potential for Social Tagging and Folksonomy in Art Museums: Proof of Concept” (PDF).)

(Official museum description, top; most popular public tags added through Steve, bottom)

Want to find all of the paintings with sharks or black dresses or ice or wheat? You could do that with Steve, but not with the conventional museum search tools. Descriptors like “Post-Impressionism” and “genre painting” were nowhere to be seen. As Susan Chun of the Met shrewdly noted about what the project revealed: “There’s a huge semantic gap between museums and the public.”

I’ve been thinking about this semantic gap recently with respect to new AI tools that have the potential to automatically generate descriptions and tags—at least in a rough way, like the visitors to the Met—for the collections in museums, libraries, and archives. Cloud services from Google, Microsoft, Amazon, and others can do large-scale and fine-grained image analysis now. But how good are these services, and are the descriptions and tags they provide closer to crowdsourced words and phrases or professional metadata? Will they simply help us find all of the paintings with sharks—not necessarily a bad thing, as the public has clearly shown an interest in such things—or can they—should they—supplement or even replace the activity of librarians, archivists, and curators? Or is the semantic gap too great and the labor issues too problematic?


Bringing the Morgue to Life

Pilot projects using machine vision to interpret the contents of historical photography collections are already underway. Perhaps most notably, the New York Times and Google inked a deal last year to have Google’s AI and machine learning technology provide better search tools for their gigantic photo morgue. The Times has uploaded digital scans of their photos to Google’s cloud storage—crucially, the fronts and backs of the photos, since the backs have considerable additional contextual data—and then Google uses their Cloud Vision API and Cloud Natural Language API to extract information about each photo.
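Based on Google’s public description of this setup, a minimal sketch of the two-step workflow might look like the following; the file name is hypothetical, you would need your own Google Cloud credentials plus the google-cloud-vision and google-cloud-language client libraries, and this is emphatically not the Times’s actual pipeline.

```python
# Sketch: transcribe the back of a photo, then pull entities out of that text.
# Hypothetical file name; requires google-cloud-vision, google-cloud-language,
# and Google Cloud credentials. Not the NYT/Google production pipeline.
from google.cloud import vision, language_v1

def transcribe_photo_back(path: str) -> str:
    client = vision.ImageAnnotatorClient()
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.document_text_detection(image=image)
    return response.full_text_annotation.text

def extract_entities(text: str):
    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT)
    response = client.analyze_entities(document=document)
    return [(entity.name, entity.type_.name) for entity in response.entities]

back_text = transcribe_photo_back("photo_back_scan.tif")  # hypothetical scan
for name, kind in extract_entities(back_text):
    print(f"{kind:>12}: {name}")  # e.g. people, places, dates found on the back
```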

Unfortunately, we don’t have access to the data being produced from the Times/Google collaboration. Currently it is being used behind the scenes to locate photos for some nice slide shows on the history of New York City. But we can assume that the Times gets considerable benefit from the computational processing that is happening. As Google’s marketing team emphasizes, “the digital text transcription isn’t perfect, but it’s faster and more cost effective than alternatives for processing millions of images.”

Unspoken here are the “alternatives,” by which they clearly mean processes involving humans and human labor. These new AI/ML techniques may not be perfect (yet? no, likely ever; see “Post-Impressionism”), but they have a strong claim to membership in the “perfect is the enemy of the good” school of cultural heritage processing. There’s a not-so-subtle nudge from Google: Hey archivists, you wanna process millions of photos by hand, with dozens of people you have to hire writing descriptions of what’s in the photos and transcribing the backs of them too? Or do you want it done tomorrow by our giant computers?

As Clifford Lynch writes in a forthcoming essay (which I will link to once it’s published), “[Machine learning applications] will substantially improve the ability to process and provide access to digital collections, which has historically been greatly limited by a shortage of expert human labor. But this will be at the cost of accepting quality and consistency that will often fall short of what human experts have been able to offer when available.”

This problem is very much in the forefront of my mind, as the Northeastern University Library recently acquired the morgue of the Boston Globe, which contains the original prints of over one million photographs that were used in the paper, as well as over five million negatives, most of which have never been seen beyond the photo editing room at the Globe. It’s incredibly exciting to think about digitizing, making searchable, and presenting this material in new ways—we have the potential to be a publicly accessible version of the NYT/Google collaboration.

But we also face the difficult issue of enormous scale. It’s a vast collection. Along with the rest of the morgue, which includes thousands of boxes of clippings, topical files, and much else, we have filled a significant portion of the archives and special collections storage in Snell Library.

Fortunately the negative sleeves have some helpful descriptive metadata, which could be transcribed by humans fairly readily and applied to the 20-40 photos in each envelope. But what’s really going on in each negative, in detail? Who or what appears? That’s a hard and expensive problem. (Insert GIF of Larry Page slowly turning toward the camera and laughing like a Bond villain.)

I have started to test several machine vision APIs, and it’s interesting to note the different approaches and strengths of each service. Here’s a scan of a negative from our Globe archive, of a protest during the upheaval of Boston’s school desegregation and busing era, uploaded to Google’s Cloud Vision API (top) and Microsoft’s Computer Vision API (bottom).

I’ll return to these tests in a future issue of HI, as I am still experimenting, but some impressive things are happening here, with multiple tabs showing different analyzes of the photograph. Both services are able to identify common entities and objects and any text that appears. They also do a good job analyzing the photo as a set of pixels—its tones and composition, which can be helpful if, say, we don’t want to spend time examining a washed out or blurry shot.

There are also creepy things happening here, as each service has a special set of tools around faces that appear. As Lynch notes, “The appropriateness of applying facial recognition [to library materials] will be a subject of major debate in coming years; this is already a very real issue in the social media context, and it will spread to archives and special collections.”

Google, in a way that is especially eyebrow-raising, also connects its Cloud Vision API to its web database, and so was able to figure out the historical context of this test photograph rather precisely (shown in the screen shot, above). Microsoft synthesizes all of the objects it can identify into a pretty good stab at a caption: “A man holding a sign walking in a street.” For those who want to make their collections roughly searchable (and just as important, provide accessibility for those with visual impairments through good-enough textual equivalents for images), a quick caption service like this is attractive. And it assigns that caption a probability: a very confident score, in this case, of 96.9%.

In the spirit of Humane Ingenuity, we should recognize that this is not an either/or situation, a binary choice between human specialists and vision bots. We can easily imagine, for instance, a “human in the loop” scenario in which the automata provide educated guesses and professionals in libraries, archives, and museums provide knowledgeable assessment and make the final choices about which descriptions to use and display to the public. Humans can also choose to strip the data of facial recognition or other forms of identity and personal information, based on ethics, not machine logic.
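As a minimal sketch of what that human-in-the-loop step could look like (the field names and threshold are invented for illustration, not any vendor’s schema): strip the identity data outright, keep only confident labels as suggestions, and route everything to an archivist for the final call.

```python
# Sketch of a human-in-the-loop filter for machine-generated descriptions.
# Field names and the confidence threshold are invented for illustration.

IDENTITY_FIELDS = {"face_annotations", "web_entities", "celebrity_matches"}

def prepare_for_review(machine_output: dict, min_score: float = 0.80) -> dict:
    """Drop identity data and keep confident labels as suggestions only."""
    cleaned = {k: v for k, v in machine_output.items() if k not in IDENTITY_FIELDS}
    suggestions = [label["description"] for label in cleaned.get("labels", [])
                   if label["score"] >= min_score]
    return {
        "suggested_terms": suggestions,   # the archivist may accept, edit, or reject
        "caption_draft": cleaned.get("caption", ""),
        "needs_human_review": True,       # nothing publishes without sign-off
    }

raw = {
    "labels": [{"description": "protest", "score": 0.93},
               {"description": "snow", "score": 0.41}],
    "caption": "A man holding a sign walking in a street.",
    "face_annotations": ["...bounding boxes and likelihoods..."],
}
print(prepare_for_review(raw))
```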

In short, if we are going to let AI loose in the archive, we need to start working on processes and technological/human integrations that are the most advantageous combinations of bots and brains. And this holds true outside of the archive as well.


Neural Neighbors

The numerical scores produced by the computer can also relate objects in vast collections in ways that help humans do visual rather than textual searches. Doug Duhaime, Monica Ong Reed, and Peter Leonard of Yale’s Digital Humanities Lab ran 27,000 images from the nineteenth-century Meserve-Kunhardt Collection (one of the largest collections of 19th-century photography) at the Beinecke Rare Book and Manuscript Library through a neural network, which then clustered the images into categories through 2,048 ways of “seeing.” The math and database could then be used to produce a map of the collection as a set of continents and archipelagos of similar and distant photographs.

Through this new interface, you can zoom into one section (at about 100x, below) to find all of the photographs of boxers in a cluster.

The Neural Neighbors project reveals the similarity scores in another prototype search tool. As Peter Leonard noted in a presentation (PDF) of this work at a meeting of the Coalition for Networked Information, much of the action is happening in the penultimate stage of the process, when those scores emerge. And that is also where an archivist can step in and take the reins back from the computer, to transform the mathematics into more rigorous categories of description, or to dream up a new user interface for accessing the collection.
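Here is a minimal sketch of the general technique as I understand it, not the Yale lab’s actual code: take the 2,048-dimensional activations from the penultimate layer of a pretrained network as each photograph’s “fingerprint,” find nearest neighbors in that space, and squash the whole thing down to two dimensions to draw the map. The file names are hypothetical, and it assumes torch, torchvision (0.13 or later), scikit-learn, and Pillow.

```python
# Sketch: 2,048-d features from a pretrained CNN, nearest neighbors, and a 2-D map.
# An approximation of the general technique, not the Yale DHLab's actual code.
import torch, torchvision
from torchvision import transforms
from sklearn.neighbors import NearestNeighbors
from sklearn.manifold import TSNE
from PIL import Image

model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
model.fc = torch.nn.Identity()   # drop the classifier; keep the 2,048-d features
model.eval()

prep = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(paths):
    with torch.no_grad():
        batch = torch.stack([prep(Image.open(p).convert("RGB")) for p in paths])
        return model(batch).numpy()          # shape: (len(paths), 2048)

paths = ["photo_001.jpg", "photo_002.jpg", "photo_003.jpg"]   # hypothetical scans
features = embed(paths)

# "Neural neighbors": the closest images in feature space by cosine distance.
nn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(features)
distances, indices = nn.kneighbors(features)

# The map view: project 2,048 dimensions down to 2 for plotting the archipelago.
coords = TSNE(n_components=2, perplexity=2, init="random").fit_transform(features)
```

The intervention point Leonard describes sits right where `distances` comes out: those scores can be kept, thresholded, or relabeled before anyone builds an interface on top of them.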

(See also: John Resig’s pioneering work in this area: “Ukiyo-e.org: Aggregating and Analyzing Digitized Japanese Woodblock Prints.”)


Some Fun with Image Pathways

Liz Gallagher, a data scientist at the Wellcome Trust, recently used methods similar to those of the Neural Neighbors project on the Wellcome’s large collection of digitized works related to health and medical history.

Then, using Google’s X Degrees of Separation code, she created pathways of hidden connections between dissimilar images, tracing a set of hops between more closely related images to get there one step at a time, from left to right in each row.

Each adjacent image is fairly similar according to the computer, but the ends of the rows are not. And some of the hops, especially in the middle, are fairly amusing and even a bit poetic?
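The pathway idea itself can be sketched in a few lines; this is my own toy version of the concept, not Google’s X Degrees of Separation code, and the image names and random features are stand-ins for real embeddings. Build a graph connecting each image to its nearest neighbors in feature space, then ask for the shortest path between two distant images; the intermediate nodes are the hops.

```python
# Sketch: image "pathways" as shortest paths through a nearest-neighbor graph.
# A toy version of the concept; names and features are invented stand-ins.
import networkx as nx
import numpy as np
from sklearn.neighbors import NearestNeighbors

names = ["anatomy_plate", "surgical_kit", "apothecary_jar",
         "herbal_print", "garden_etching"]
features = np.random.rand(len(names), 2048)   # stand-in for real CNN embeddings

nn = NearestNeighbors(n_neighbors=3, metric="cosine").fit(features)
distances, indices = nn.kneighbors(features)

graph = nx.Graph()
for i, name in enumerate(names):
    for d, j in zip(distances[i][1:], indices[i][1:]):   # skip the self-match
        graph.add_edge(name, names[j], weight=float(d))

# The "hops" between two dissimilar images are the intermediate nodes.
print(nx.shortest_path(graph, "anatomy_plate", "garden_etching", weight="weight"))
```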

Humane Ingenuity 2: The Soul of a New Machine + Auditing Algorithms

Today Apple will release new iPhones and other gizmos and services, and as they do every year, the tech pundits will ask: “Does this live up to the expectations and vision of Steve Jobs?” I, on the other hand, will ask: “Does this live up to the expectations and vision of Jef Raskin?” Apple likes to imagine itself as the humane tech company, with its emphasis on privacy and a superior user experience, but the origins of that humaneness—if it still exists beyond marketing—can be traced less to the ruthless Jobs than to the gentler Raskin. Jobs may have famously compared a computer to a “bicycle for the mind,” but Raskin articulated more genuinely a desire that computers be humane and helpful instruments.

To be clear, Raskin, like Jobs, wanted to sell millions of personal computers, but only Raskin worried aloud about what would happen if that seemingly ridiculous goal was achieved: “Will the average person’s circle of acquaintances grow? Will we be better informed? Will a use of these computers as an entertainment medium become their primary value? Will they foster self-education? Is the designer of a personal computer system doing good or evil?” It is remarkable to read these words in an internal computer design document from 1980, but such reflections were common in Raskin’s writing, and clearly more heartfelt than Google’s public, thin, and short-lived “Don’t be evil” motto.

Jobs may have dabbled in calligraphy and obsessed over design, but Raskin was the polymath who truly lived at the intersection of the liberal arts and technology. In addition to physics, math, and computer science, Raskin studied philosophy, music (which he also composed and performed at a professional level), and visual arts (he was also an accomplished artist). He clearly read a lot, which was reflected in his clear and often mirthful writing style, flecked with nerdy guffaws. (The end of one of his long Apple memos: “Summery: That means fair, warm weather, just after spring.”) He wrote a book on user interface design called The Humane Interface and sought to build a new computing system called The Humane Environment. For the purposes of this newsletter, and for some ongoing conversations I would like to have with you about the ethical dimension of technological creation, he is one important touchstone.

(Jef Raskin with a model of the Canon Cat, photo by Aza Raskin)

Raskin was one of the earliest Apple employees, hired to direct their publications and documentation, and is widely known for leading the early Macintosh project, before Jobs swooped in and recast it in his (and Xerox PARC’s) image. But before that happened, Raskin, as the consummate documenter, got to lay out the founding principles of the Mac. This set of documents became known—in a quasi-religious way—as the Book of Macintosh.

That “book” (really, a collection of documents) is now in the Special Collections at Stanford University, and they have made some of these documents available online, should you wish to read them at the next Apple high holiday. Within these pages, you can witness Raskin pondering computational devices and user experiences that would become gospel within Apple. “This should be a completely self-teaching system.” “If this is to be truly a product for the home, shouldn’t we offer it in various colors?” “Telecommunications will become a key part of every computer market segment.” “The computer must be in one lump.” (Note to Apple: Raskin would have hated the proliferation of dongles.)

In one especially cogent document, Raskin summarized the philosophy of the Mac: “Design Considerations for an Anthropophilic Computer,” an oddly latinate title considering that Raskin dropped the second F in his first name because he considered it superfluous. Raskin imagines what the computing of the future should look like, once it moves beyond the hobbyists of the 1970s and into the mainstream:

This is an outline for a computer designed for the Person In The Street (or, to abbreviate: the PITS); one that will be truly pleasant to use, that will require the user to do nothing that will threaten his or her perverse delight in being able to say: “I don’t know the first thing about computers,” and one which will be profitable to sell, service and provide software for.

You might think that any number of computers have been designed with these criteria in mind, but not so. Any system which requires a user to ever see the interior, for any reason, does not meet these specifications. There must not be additional ROMS, RAMS, boards or accessories except those that can be understood by the PITS as a separate appliance. For example, an auxiliary printer can be sold, but a parallel interface cannot. As a rule of thumb, if an item does not stand on a table by itself, and if it does not have its own case, or if it does not look like a complete consumer item in and of itself, then it is taboo.

If the computer must be opened for any reason other than repair (for which our prospective user must be assumed incompetent) even at the dealer’s, then it does not meet our requirements.

Seeing the guts is taboo. Things in sockets is taboo (unless to make servicing cheaper without imposing too large an initial cost). Billions of keys on the keyboard is taboo. Computerese is taboo. Large manuals, or many of them (large manuals are a sure sign of bad design) is taboo. Self-instructional programs are NOT taboo.

There must not be a plethora of configurations. It is better to offer a variety of case colors than to have variable amounts of memory. It is better to manufacture versions in Early American, Contemporary, and Louis XIV than to have any external wires beyond a power cord.

And you get ten points if you can eliminate the power cord.

As I’ve argued elsewhere, I think the iPad, not the Mac, came closest to what Raskin was dreaming of here, although I suspect that as a text-lover, and given his other writing on user interfaces, he would have preferred an iPad that was oriented more toward writing and communication than consumption. But Raskin’s deep sense of how most people don’t have time to fidget with software or hardware—who just want the damn computer to work, in an understandable and consistent way—was ahead of its time. Most people are busy and tired and don’t want to be hobbyists with their digital devices.

Unfortunately, latent in Raskin’s understanding is a dark, upside-down world, the flip side to designing computer environments for the PITS. When you grasp that people don’t have time to fiddle with bits, when you start focusing on the software of the mind—the psychology of the user—rather than the hardware of the computer, the temptation emerges to design platforms where ease of use serves to lock people in, or to run social experiments on them. Those who are busy and tired and don’t have time to tinker—that is, most of us—may also prefer using Facebook to maintaining a personal blog or website. And that’s one way our computers became more misanthropic than anthropophilic.


What can we do when these platforms turn against us after drawing us in? On the opening podcast of the third season of What’s New, I talk to Christo Wilson, who is part of a team at Northeastern University that “audits” the algorithms within the black boxes of Facebook, Google, Amazon, and other monolithic internet services that dominate our world. There has been some very good writing recently about how these algorithms have gone awry—I recommend Cathy O’Neil’s Weapons of Math Destruction and Safiya Noble’s Algorithms of Oppression—and Christo and his colleagues have established rigorous methods for testing these services from the outside to identify their attributes and flaws. They are also able to provide you, the user of Facebook, Google, and Amazon, with an understanding of exactly which of your personal attributes these services use to customize your online environment (and track you). Scary but important work. Tune in.
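
The auditing methodology is conceptually simple even when the logistics are hard: issue the same query from profiles that differ in a single controlled attribute, then measure how much the returned rankings diverge. Here is a toy illustration of the measurement step only; the profiles and result lists are invented placeholders, not the Northeastern team's data or methods.

```python
def jaccard(a, b):
    """Share of results the two profiles have in common, ignoring order."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Placeholder result lists for the same query issued from two otherwise
# identical profiles that differ in a single attribute (say, location).
results_profile_a = ["item1", "item2", "item3", "item4", "item5"]
results_profile_b = ["item2", "item6", "item1", "item7", "item5"]

overlap = jaccard(results_profile_a, results_profile_b)
print(f"Overlap: {overlap:.2f}  (1.0 = no personalization detected, 0.0 = entirely different results)")
```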

Humane Ingenuity 1: The Big Reveal

An increasing array of cutting-edge, often computationally intensive methods can now reveal formerly hidden texts, images, and material culture from centuries ago, and make those documents available for search, discovery, and analysis. Note how in the following four case studies, the emphasis is on the human; the futuristic technology is remarkable, but it is squarely focused on helping us understand human culture better.


Gothic Lasers

If you look very closely, you can see that the stone ribs in these two vaults in Wells Cathedral are slightly different, even though they were supposed to be identical. Alexandrina Buchanan and Nicholas Webb noticed this too and wanted to know what it said about the creativity and input of the craftsmen into the design: how much latitude did they have to vary elements from the architectural plans, when were those decisions made, and by whom? Before construction or during it, or even on the spur of the moment, as the ribs were carved and converged on the ceiling? How can we recapture a decent sense of how people worked and thought from inert physical objects? What was the balance between the pursuit of idealized forms, and practical, seat-of-the-pants tinkering?

In “Creativity in Three Dimensions: An Investigation of the Presbytery Aisles of Wells Cathedral,” they decided to find out by measuring each piece of stone much more carefully than can be done with the human eye. Prior scholarship on the cathedral—and on the question of the creative latitude and ability of medieval stone craftsmen—had relied on 2-D drawings, which were not granular enough to reveal how each piece of the cathedral was shaped by hand to fit, or to slightly shape-shift, into the final pattern. High-resolution 3-D laser scans revealed much more about the cathedral, and about those who constructed it, because individual decisions and the sequence in which they were made became far clearer.
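
To give a flavor of what measuring beyond the human eye can look like in practice (a generic illustration, not Buchanan and Webb's actual workflow), one common approach is to take two scanned point clouds, say a rib as designed and a rib as carved, and compute for every point on one the distance to the nearest point on the other; the pattern of those deviations shows where, and by how much, the carving departed from the plan.

```python
import numpy as np
from scipy.spatial import cKDTree

def rib_profile(n=2000, wobble=0.0, seed=0):
    """Stand-in for a laser-scanned rib: points along a 3-D arc, optionally perturbed."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0, np.pi / 2, n)
    arc = np.column_stack([np.cos(t), np.sin(t), 0.05 * t])   # an idealized curve
    return arc + wobble * rng.normal(size=arc.shape)

scan_a = rib_profile()                      # "as designed"
scan_b = rib_profile(wobble=0.002, seed=1)  # "as carved", with small deviations

# For each scanned point on rib B, distance to the closest point on rib A.
distances, _ = cKDTree(scan_a).query(scan_b)
print(f"mean deviation: {distances.mean():.4f}, max deviation: {distances.max():.4f}")
```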

Although the article gets technical at moments (both with respect to the 3-D laser and computer modeling process, and with respect to medieval philosophy and architectural terms), it’s worth reading to see how Buchanan and Webb reach their affirming, humanistic conclusion:

The geometrical experimentation involved was largely contingent on measurements derived from the existing structure and the Wells vaults show no interest in ideal forms (except, perhaps in the five-point arches). We have so far found no evidence of so-called “Platonic” geometry, nor use of proportional formulae such as the ad quadratum and ad triangulatum principles. Use of the “four known elements” rule evidenced masons’ “cunning”, but did not involve anything more than manipulation and measurement using dividers rather than a calibrated ruler and none of the processes used required even the simplest mathematics. The designs and plans are based on practical ingenuity rather than theoretical knowledge.


Hard OCR

Last year at the Northeastern University Library we hosted a meeting on “hard OCR”—that is, physical texts that are currently very difficult to convert into digital texts using optical character recognition (OCR), a process that involves rapidly improving techniques like computer vision and machine learning. Representatives from libraries and archives, technology companies that have emerging AI tech (such as Google), and scholars with deep subject and language expertise all gathered to talk about how we could make progress in this area. (This meeting and the overall project by Ryan Cordell and David Smith of Northeastern’s NULab for Texts, Maps, and Networks, “A Research Agenda for Historical and Multilingual Optical Character Recognition,” was generously funded by the Andrew W. Mellon Foundation.)

OCRing modern printed books has become, if not a solved problem, at least incredibly good—the best OCR software gets a character right in these textual conversions 99% of the time. But older printed books, ancient and medieval written works, writing outside of the Romance languages (e.g., in Arabic, Sanskrit, or Chinese), rare languages (such as Cherokee, with its unique 85-character alphabet, which I covered on the What’s New podcast), and handwritten documents of any kind remain extremely challenging, with success rates often below 80%, and in some cases as low as 40%. That means the computer gets one to three characters wrong in a typical five-character word. Not good at all.
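
Those percentages refer to character-level accuracy, whose inverse is the character error rate (CER): the number of insertions, deletions, and substitutions needed to turn the OCR output into the correct text, divided by the length of the correct text. A minimal version of that calculation:

```python
def character_error_rate(reference, hypothesis):
    """Levenshtein edit distance between the true text and the OCR output,
    normalized by the length of the true text."""
    m, n = len(reference), len(hypothesis)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion
                             dist[i - 1][j - 1] + cost) # substitution
    return dist[m][n] / m

# At 80% accuracy, roughly one character in a five-character word goes wrong:
print(character_error_rate("haven", "havon"))  # 0.2
```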

The meeting began to imagine a promising union of language expertise from scholars in the humanities and the most advanced technology for “reading” digital images. If the computer (which in the modern case, really means an immensely powerful cloud of thousands of computers) has some ground-truth texts to work from—say, a few thousand documents in their original form and a parallel machine-readable version of those same texts, painstakingly created by a subject/language expert—then a machine-learning algorithm can be created to interpret with much greater accuracy new texts in that language or from that era. In other words, if you have 10,000 medieval manuscript pages perfectly rendered in XML, you can train a computer to give you a reasonably effective OCR tool for the next 1,000,000 pages.

Transkribus is one of the tools that works in just this fashion, and it has been used to transcribe 1,000 years of highly variant written works, in many languages, into machine-readable text. Thanks to the monks of the Hilandar Monastery, who kindly shared their medieval manuscripts, Quinn Dombrowski, a digital humanities scholar with a specialty in medieval Slavic texts, trained Transkribus on handwritten Cyrillic manuscripts, and calls the latest results from the tool “truly nothing short of miraculous.”
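
The details vary from tool to tool, but the general shape of "train on expert ground truth, then apply to new pages" can be sketched with one standard recipe for handwritten-text recognition: a small convolutional-recurrent network trained with CTC loss. Everything below (the alphabet, the tiny network, the stand-in data) is an illustrative assumption, not Transkribus's actual implementation.

```python
import torch
import torch.nn as nn

# Illustrative character inventory; a real project derives this from the ground truth.
ALPHABET = ["<blank>"] + list("абвгдежзиклмнопрстуфхцчшщъыьэюя")

class TinyLineRecognizer(nn.Module):
    """A deliberately small CRNN: convolutional features -> BiLSTM -> per-timestep character scores."""
    def __init__(self, n_chars):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.LSTM(input_size=64 * 8, hidden_size=128,
                           bidirectional=True, batch_first=True)
        self.head = nn.Linear(256, n_chars)

    def forward(self, x):                        # x: (N, 1, 32, W) grayscale line images
        f = self.cnn(x)                          # (N, 64, 8, W // 4)
        f = f.permute(0, 3, 1, 2).flatten(2)     # (N, W // 4, 512)
        out, _ = self.rnn(f)                     # (N, W // 4, 256)
        return self.head(out).log_softmax(-1)    # per-timestep log-probabilities over characters

# Stand-ins for real (line image, expert transcription) pairs from the ground-truth set.
images = torch.randn(4, 1, 32, 256)
texts = ["книга", "слово", "текст", "права"]
targets = [torch.tensor([ALPHABET.index(c) for c in t]) for t in texts]

model = TinyLineRecognizer(len(ALPHABET))
ctc_loss = nn.CTCLoss(blank=0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(3):
    log_probs = model(images)                                   # (N, T, C)
    input_lengths = torch.full((len(texts),), log_probs.size(1), dtype=torch.long)
    target_lengths = torch.tensor([len(t) for t in targets])
    loss = ctc_loss(log_probs.permute(1, 0, 2),                 # CTC expects (T, N, C)
                    torch.cat(targets), input_lengths, target_lengths)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```

With enough expert-transcribed pages in place of the stand-in tensors, a loop of this general shape is what lets a model generalize to the next million pages in the same script or hand.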


X-Manuscripts

Lisa Fagin Davis is the Executive Director of the Medieval Academy of America, and her excellent blog, Manuscript Road Trip, is highly recommended. In a recent post, she explores the helpful use of X-ray fluorescence on an unusual Book of Hours.

It’s interesting to see the interplay between the intuition of scholars—this looks off in some way—and the data generated by the scientific instruments. Of course, things are not what they appear.


Really Hard OCR

Imagine trying to read an ancient text that was written in black ink on a scroll that was then roasted to a uniform, black crisp, and made so brittle it can never be unrolled. That’s what happened when Mount Vesuvius buried Herculaneum under some twenty meters of volcanic material and flash-charred its libraries. These scrolls contain huge amounts of text that scholars are eager to read and that would greatly expand our understanding of ancient Rome and the Mediterranean region. But again: these texts look like the worst burnt burrito you’ve ever seen.

Enter the Digital Restoration Initiative, which has been developing a way to scan and virtually unwrap these blackened scrolls, and then extract the text from them so we can read what people wrote two thousand years ago. They pioneered this technique on the En-Gedi scroll (shown to the right of the penny, below) to computationally produce a flattened, readable text (to the left of the penny).

DRI is now working on the Herculaneum scrolls, and you can watch their techniques and tools, and the sheer complexity of the process, in this recent video:

These ancient texts, encased in a protective layer of hardened volcanic material, look not unlike femurs, and their sectional scans really do resemble a CAT scan of a human bone. And that’s kind of beautiful, no?

Humane Ingenuity: My New Newsletter

With the start of this academic year, I’m launching a new newsletter to explore technology that helps rather than hurts human understanding, and human understanding that helps us create better technology. It’s called Humane Ingenuity, and you can subscribe here. (It’s free, just drop your email address into that link.)

Subscribers to this blog know that it has largely focused on digital humanities. I’ll keep posting about that, and the newsletter will have significant digital humanities content, but I’m also seeking to broaden the scope and tackle some bigger issues that I’ve been thinking about recently (such as in my post on “Robin Sloan’s Fusion of Technology and Humanity”). And I’m hoping that the format of the newsletter, including input from the newsletter’s readers, can help shape these important discussions.


Engagement Is the Enemy of Serendipity

Whenever I’m grumpy about an update to a technology I use, I try to perform a self-audit examining why I’m unhappy about this change. It’s a helpful exercise since we are all by nature resistant to even minor alterations to the technologies we use every day (which is why website redesign is now a synonym for bare-knuckle boxing), and this feeling only increases with age. Sometimes the grumpiness is justified, since one of your tools has become duller or less useful in a way you can clearly articulate; other times, well, welcome to middle age.

The New York Times recently changed their iPad app to emphasize three main tabs, Top Stories, For You, and Sections. The first is the app version of their chockablock website home page, which contains not only the main headlines and breaking news stories, but also an editor-picked mixture of stories and features from across the paper. For You is a new personalized zone that is algorithmically generated by looking at the stories and sections you have most frequently visited, or that you select to include by clicking on blue buttons that appear near specific columns and topics. The last tab is Sections, that holdover word from the print newspaper, with distinct parts that are folded and nested within each other, such as Metro, Business, Arts, and Sports.

Currently my For You tab looks as if it was designed for a hypochondriacal runner who wishes to live in outer space, but not too far away, since he still needs to acquire new books and follow the Red Sox. I shall not comment about the success of the New York Times algorithm here, other than to say that I almost never visit the For You tab, for reasons I will explain shortly. For now, suffice it to say that For You is not for me.

But the Sections tab I do visit, every day, and this is the real source of my grumpiness. At the same time that the New York Times launched those three premier tabs, they also removed the ability to swipe, simply and quickly, between sections of the newspaper. You used to be able to start your morning news consumption with the headlines and then browse through articles in different sections from left to right. Now you have to tap on Sections, which reveals a menu, from which you select another section, from which you select an article, over and over. It’s like going back to the table of contents every time you finish a chapter of a book, rather than just turning the page to the next chapter.

Sure, it seems relatively minor, and I suspect the change was made because confused people would accidentally swipe between sections, but paired with For You it subtly but firmly discourages the encounter with many of the newspaper’s sections. The assumption in this design is that if you’re a space runner, why would you want to slog through the International news section or the Arts section on the way to orbital bliss in the Science and Health sections?

* * *

When I was growing up in Boston, my first newspaper love was the sports section of the Boston Globe. I would get the paper in the morning and pull out that section and read it from cover to cover, all of the columns and game summaries and box scores. Somewhere along the way, I started briefly checking out adjacent sections, Metro and Business and Arts, and then the front section itself, with the latest news of the day and reports from around the country and world. The technology and design of the paper encouraged this sampling, as the unpacked paper was literally scattered in front of me on the table. Were many of these stories and columns boring to my young self? Undoubtedly. But for some reason—the same reason many of those reading this post will recognize—I slowly ended up paging through the whole thing from cover to cover, still focusing on the Sox, but diving into stories from various sections and broadly getting a sense of numerous fields and pursuits.

This kind of interface and user experience is now threatened, because who needs to scan through seemingly irrelevant items when you can have constant go-go engagement, that holy grail of digital media? The Times, likely recognizing their analog past (which is still the present for a dwindling number of print subscribers), tries to replicate some of the old newspaper serendipity with Top Stories, which is more like A Bunch of Interesting Things after the top headlines. But I fear they have contradicted themselves in this new promotion of For You and the commensurate demotion of Sections.

The engagement of For You—which joins the countless For Yous that now dominate our online media landscape—is the enemy of serendipity, which is the chance encounter that leads to a longer, richer interaction with a topic or idea. It’s the way that a metalhead bumps into opera in a record store, or how a young kid becomes interested in history because of the book reviews that follow the box scores. It’s the way that a course taken on a whim in college leads, unexpectedly, to a new lifelong pursuit. Engagement, delivered through algorithmically personalized feeds, isn’t a form of serendipity; it’s the repeated satisfaction of Present You, with your myopically current loves and interests, at the expense of Future You, who will want new curiosities, hobbies, and experiences.