Books, Digitization, Google, Text Mining

Google Book Search Blog

For those interested in the Google book digitization project (one of my three copyright-related stories to watch for 2006), Google launched an official blog yesterday. Right now “Inside Google Book Search” seems more like “Outside Google Book Search,” with a first post celebrating the joys of books and discovery, and with a set of links lauding the project, touting “success stories,” and soliciting participation from librarians, authors, and publishers. Hopefully we’ll get more useful insider information about the progress of the project, hints about new ways of searching millions of books, and other helpful tips for scholars in the near future. As I recently wrote in an article in D-Lib Magazine, Google’s project has some serious—perhaps fatal—flaws for those in the digital humanities (not so for the competing, but much smaller, Open Content Alliance). In particular, it would be nice to have more open access to the text (rather than mere page images) of pre-1923 books (i.e., those that are out of copyright). Of course, I’m a historian of the Victorian era who wants to scan thousands of nineteenth-century books using my own digital tools, not a giant company that may want to protect its very expensive investment in digitizing whole libraries.
