Presidential Libraries and the Digitization of Our Lives

Buried in the recent debates (New York Times, Chicago Tribune, The Public Historian) about the nature, objectives, and location of the Obama Presidential Center is the inexorable move toward a world in which virtually all of the documentation about our lives is digital.

To make this decades-long shift, now almost complete, easier to see, I made the following infographic comparing three representative presidential libraries, each a generation apart: LBJ’s, Bill Clinton’s, and Barack Obama’s. Each square represents the relative overall size of one of these presidential archives (roughly 46 million pages for LBJ, 100 million for Clinton, and 360 million for Obama) and is divided into the basic categories of archival material: paper documents, photographs and audiovisual media, and, starting with Clinton, email.

LBJ Presidential Library: a small square that is mostly orange, representing the dominance of paper documents in LBJ's administration.
Clinton Presidential Library: a medium-size square that is three-quarters orange, representing paper documents in the Clinton White House, and roughly one-quarter blue for email.
Obama Presidential Library: a giant square that is almost entirely blue, representing the prevalence of email in the Obama administration.

The LBJ Presidential Library has 45 million pages of paper documents and a million photographs, recordings, and other media. The Clinton Presidential Library contains 78 million pages of documents, 20 million emails, 2 million photographs, and 12,500 videotapes. (Note that, contrary to the recent coverage of Obama as “the first digital president,” Clinton really should hold that title, given his administration’s rapid adoption of email in the 1990s, as I’ve discussed elsewhere.)

We are still in the process of assessing all that will go into the Obama Presidential Library (other libraries have added considerable new caches of documents over time), but the rough initial count from the U.S. National Archives and Records Administration is that there are about 300 million emails from Obama’s eight years in the White House, and about 30 million pages of paper documents. The chart above would be even more email-centric for Obama’s library if I used NARA’s calculation of a few paper pages per email, which would put the email alone at over a billion pages in printed form. In other words, on that more rigorous page-for-page comparison, at most about 3% of the Obama record is print rather than digital.
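The proportions behind the chart and the 3% figure can be checked with a little arithmetic. What follows is only a rough sketch using the counts quoted in this post. It treats a page, an email, and a media item each as one "item," it leaves Obama's photos and videos out because no count is given for them here, and it assumes the printed email comes to exactly one billion pages, the lower bound implied by "over a billion."

```python
# Item counts as quoted in the post. These are rough figures, and the units
# are mixed: a page, an email, and a media item each count as one item.
LIBRARIES = {
    "LBJ": {"paper": 45_000_000, "media": 1_000_000},
    # Clinton media = 2 million photographs + 12,500 videotapes.
    "Clinton": {"paper": 78_000_000, "email": 20_000_000, "media": 2_012_500},
    # Obama's photo and video counts are not given in the post, so they are omitted.
    "Obama": {"paper": 30_000_000, "email": 300_000_000},
}


def shares(counts):
    """Return each category's fraction of the total item count."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def print_share_by_pages(paper_pages, email_pages):
    """Fraction of all pages that are paper once email is converted to pages."""
    return paper_pages / (paper_pages + email_pages)


for name, counts in LIBRARIES.items():
    summary = ", ".join(f"{cat} {s:.0%}" for cat, s in shares(counts).items())
    print(f"{name}: {summary}")

# Assumption: the email prints out to exactly 1 billion pages. Any larger
# figure would make the paper share smaller still.
obama_print = print_share_by_pages(30_000_000, 1_000_000_000)
print(f"Obama paper share by page count: {obama_print:.1%}")
```

Counted item by item, paper is about 9% of the Obama record; converted to pages, it falls to just under 3%, which is where the "at most" in the paragraph above comes from.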

More vaguely estimated above are the millions of “pages” associated with the many other digital forms the Obama administration used, including websites, apps, and social media (you can already download the entirety of the latter as .zip files here). Most of the photos (many of which were uploaded to Flickr) and videos were of course also born digital. (Update, 3/11/19: The Obama Foundation came out with a new fact sheet that says that “an estimated 95 percent of the Obama Presidential Records were created digitally and have no paper equivalents.” It also says that there are roughly 1.5 billion pages in the collection, including everything I’ve detailed here.)

It’s unfortunate that it’s still relatively expensive and time-consuming to digitize analog materials. Nearly two decades on, the Clinton Presidential Library has digitized only about 1% of its paper holdings (about 700,000 pages). The Reagan Presidential Library charges $0.80 to digitize one page of his archives. In this context, the Obama Presidential Center’s commitment to funding the complete digitization of those 30 million paper pages, apparently on a faster timetable and with open access for the public, seems rather laudable.
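To give a sense of scale, here is a back-of-the-envelope calculation using the figures above. The cost line is purely hypothetical: it applies the Reagan Library's per-page charge to Obama's paper record, which is my own assumption for illustration and not a quoted or budgeted figure, since a per-request fee says little about what bulk digitization actually costs.

```python
# Figures quoted in the post.
clinton_paper_pages = 78_000_000
clinton_digitized_pages = 700_000
obama_paper_pages = 30_000_000
reagan_fee_cents_per_page = 80  # the $0.80 per-page charge, held in cents

# Share of Clinton's paper holdings digitized so far.
clinton_digitized_share = clinton_digitized_pages / clinton_paper_pages

# Hypothetical: what Obama's paper record would cost at the Reagan rate.
# Integer cents avoid floating-point rounding on the currency amount.
obama_cost_dollars = reagan_fee_cents_per_page * obama_paper_pages // 100

print(f"Clinton paper digitized: {clinton_digitized_share:.1%}")
print(f"Obama paper at the Reagan rate: ${obama_cost_dollars:,}")
```

At that rate the 30 million pages would come to $24 million, and Clinton's 700,000 digitized pages work out to a bit under 1% of the library's paper.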

Ultimately, I suppose it’s best to say that Obama was “the first almost fully digital president,” and that, with the digitization of the remaining paper record, he will become “the first fully machine-readable and -indexed president.” (Part of the debate in academic and library circles about this shift in the Obama Presidential Center/Library has to do with the role of archivists and historians in creating good metadata for, and enabling more thorough searches through, administration documents, but with a billion+ pages, I don’t see how this can be done without serious computational means.)

Meanwhile, all of us have more quietly followed the same path, with only a very small percentage of our overall record now existing in physical formats rather than bits. How we will preserve this heterogeneous and perhaps ephemeral digital record when we don’t have our own presidential libraries and the resources of NARA is a different and more worrisome story.