First Impressions of the Google Books Settlement

Just announced is the settlement of the class action lawsuit that the Authors Guild, the Association of American Publishers and individual authors and publishers filed against Google for its Book Search program, which has been digitizing millions of books from libraries. (Hard to believe, but the lawsuit was first covered on this blog all the way back in November 2005.) Undoubtedly this agreement is a critical one not only for Google and the authors and publishers, but for all of us in academia and others who care about the present and future of learning and scholarship.

It will obviously take some time to digest this agreement; indeed, the Google post on it is fairly sketchy and we still need to hear details, such as the cost structure for the full access the agreement now provides. But here are my first impressions of some key points:

The agreement really focuses on in-copyright but out-of-print books, that is, books that can’t normally be copied but also can’t be purchased anywhere. Highlighting these books (which are numerous; most academic books, for example, are out of print and have virtually no market) was smart for Google, since it seems to provide value without stepping on publishers’ toes.

A second (also smart, but probably more controversial) focus is on access to the Google Books collection via libraries:

We’ll also be offering libraries, universities and other organizations the ability to purchase institutional subscriptions, which will give users access to the complete text of millions of titles while compensating authors and publishers for the service. Students and researchers will have access to an electronic library that combines the collections from many of the top universities across the country. Public and university libraries in the U.S. will also be able to offer terminals where readers can access the full text of millions of out-of-print books for free.

Again, we need to hear more details about this part of the agreement. We also need to begin thinking about how this will impact libraries, e.g., in terms of their own book acquisition plans and their subscriptions to other online databases.

Finally, and perhaps most interesting and surprising to those of us in the digital humanities, is an all-too-brief mention of computational access to these millions of books:

In addition to the institutional subscriptions and the free public access terminals, the agreement also creates opportunities for researchers to study the millions of volumes in the Book Search index. Academics will be able to apply through an institution to run computational queries through the index without actually reading individual books.

For years in this space I have been arguing for the necessity of such access (first envisioned, to give due credit, by Cliff Lynch of CNI). Inside Google they have methods for querying and analyzing these books that we academics could greatly benefit from, and that could enable new kinds of digital scholarship.

Update: The Association of American Publishers now has a page answering frequently asked questions about the agreement (have we had time to ask?).


[…] Google just announced that it will be offering libraries subscription access to out-of-print (but still under copyright) books which it has […]

[…] Dan Cohen, Center for History and new media, George Mason University (via J.A. Furtado) […]

[…] Cohen at the Center for History and New Media blogs: First Impressions of the Google Books Settlement. Possibly related posts: (automatically generated)Google Print and Fair UseGoogle Settles Suit Over […]

[…] battle with copyright laws and privacy issues, and appears to be a good compromise. Dan Cohen has a more skeptical take. Your thoughts? Related Nerdlets […]

[…] to follow. In the meantime, see Dan Cohen’s “First Impressions of the Google Books Settlement,” which hits the high […]

[…] 4) Dan Cohen, First Impressions of the Google Books Settlement. […]

bowerbird says:

yeah, sure, google’s gonna give people the ability to
run analyses against its database of all those books,
sure it is… i’ve become convinced this is a sell-out,
designed to erect a huge barrier to entry by others…


Jim Carlile says:

It’s an interesting idea, but one they’ve long been planning. If you look at their UC agreement, it’s obvious that this was their business model all along:

‘4.3… Google agrees that to the extent that it or its successors use any Digitized Selected Content in connection with any Google Services, it shall provide a service at no cost to End Users (1) for both search and display of search results and (2) for access to the display of the full text of public domain works contained in the Digitized Selected Content. To the extent portions of the Google Digital Copy are either in the public domain or where Google has otherwise obtained authorization, Google shall have the right, in its sole discretion, among other things, to (a) index the full text or content, (b) serve and display full-sized digital images corresponding to those portions, (c) make available full text of content for printing and/or download, and (d) make copies of such portions of the Google Digital Copy and provide, license, or sell such copies (including, without limitation, to its syndication partners). For all other portions of the Google Digital Copy, Google may index the full text or content but may not serve or display the full-sized digital image or make available for printing, streaming and/or download the full content unless Google has permission or license from the copyright owner to do so; Google instead may serve and display (1) an excerpt that Google reasonably determines would constitute fair use under copyright law and (2) bibliographic (e.g., title, author, date, etc.) and other non-copyrighted information….’

But notice that their UC agreement also requires them to make all public domain books freely available, which they are not always doing. Many pre-1964 books are no longer in copyright; millions, in fact. Google is hoarding those books, in violation of at least this one library scanning agreement.

[…] the Digital Campus podcast triumphantly returns to the airwaves with a discussion of the recent Google Book Search settlement. Also up for analysis are Microsoft’s move to the cloud, the new Google phone, and, as […]

[…] Google’s case. At this juncture, it’s unclear how the recent Google Books settlement (see Dan Cohen’s analysis) or changes in PDF access through Google will impact the […]

[…] in history will never be complete until we include both articles and monographs. Now that the API for Google Books is being updated, we are on the verge of creating text mining algorithms that can scan the endnotes and […]
