Zotero and the Internet Archive Join Forces

I’m pleased to announce a major alliance between the Zotero project at the Center for History and New Media and the Internet Archive. It’s really a match made in heaven: a project to provide free and open source software and services for scholars joining together with the leading open library. The vision and support of the Andrew W. Mellon Foundation have made this possible, as they have made possible the major expansion of the Zotero project over the last year.

You will hear much more about this alliance in the coming months on this blog, but I wanted to outline five key elements of the project.

1. Exposing and Sharing the “Hidden Archive”

The Zotero-IA alliance will create a “Zotero Commons” into which scholarly materials can be added simply via the Zotero client. Almost every scholar and researcher has documents that they have scanned (some of which are in the public domain), finding aids they have created, or bibliographies on topics of interest. Currently there is no easy way to share these; giving them a central home at the Internet Archive will archive them permanently (before they are lost on personal hard drives) and make them broadly available to others.

We understand that not everyone will be willing to share everything (some may not be willing to share anything, even though almost every university commencement reminds graduates that they are joining a “community of scholars”), but we believe that the Commons will provide a good place for shareable materials to reside. The architectural historian with hundreds of photographs of buildings, the researcher who has scanned in old newspapers, and scholars who wish to publish materials in an open access environment will find this a helpful addition to Zotero and the Internet Archive. Some researchers may of course deposit materials only after finishing, say, a book project; what I have called “secondary scholarly materials” (e.g., bibliographies) will perhaps be more readily shared.

But we hope the second part of the project will further entice scholars to contribute important research materials to the Commons.

2. Searching the Personal Library

Most scholars have not yet figured out how to take full advantage of the digitized riches suddenly available on their computers. Indeed, the abundance of digital documents has actually exacerbated the problems of some researchers, who now find themselves overwhelmed by the sheer quantity of available material. Moreover, the major advantage of digital research—the ability to scan large masses of text quickly—is often unavailable to scholars who have done their own scanning or copying of texts.

A critical second part of this alliance of IA and Zotero is to bring robust and seamless Optical Character Recognition (OCR) to the vast majority of scholars who lack the means or do not know how to convert their scans into searchable text. In addition, this process will let others search through such newly digitized texts. After a submission to the Commons, the Internet Archive will return an OCRed version of each donated document to enable searchability. This text will be incorporated into the donor’s local index (on the Zotero client) and thus made searchable in Zotero’s powerful quick search and advanced search panes. In short, this process will provide a tremendous incentive for scholars to donate to the Commons, since it will help them with their own research.

3. Enabling Networked References and Annotations

One of the pillars of scholarship is the ability for distributed scholars to be sure they are referencing the same text or evidence. As noted in #1, one of the great advantages of the Zotero Commons at IA will be the transport of scholarly materials currently residing on personal hard drives to a public space with stable, rather than local, addresses. These addresses will become critical as scholars begin to use, refer to, and cite items in the Commons.

Yet the IA/Zotero partnership has another benefit: as scholars begin to use not only traditional primary sources that have been digitized but also “born digital” materials on the web (blogs, online essays, documents transcribed into HTML), the possibility arises for Zotero users to leverage the resources of IA to ensure a more reliable form of scholarly communication. One of the Internet Archive’s great strengths is that it has not only archived the web but also given each page a permanent URI that includes a time and date stamp in addition to the URL.

Currently when a scholar using Zotero wishes to save a web page for their research they simply store a local copy. For some, perhaps many, purposes this is fine. But for web documents that a scholar believes will be important to share, cite, or collaboratively annotate (e.g., among a group of coauthors of an article or book) we will provide a second option in the Zotero web save function to grab a permanent copy and URI from IA’s web archive. A scholar who shares this item in their library can then be sure that all others who choose to use it will be referring to the exact same document.
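To make the idea of a time-stamped permanent URI concrete, here is a minimal sketch in Python. It assumes the address layout used by the Internet Archive's public Wayback Machine (a fixed prefix, a 14-digit UTC timestamp, then the original URL). The function names are hypothetical and purely illustrative; this is not Zotero code, and the details of how Zotero will talk to IA are still to be worked out.

```python
from datetime import datetime, timezone

# Assumption: the public Wayback Machine address prefix.
WAYBACK_PREFIX = "https://web.archive.org/web/"
STAMP_FORMAT = "%Y%m%d%H%M%S"  # 14-digit timestamp, e.g. 20071212120000


def wayback_uri(original_url, captured_at):
    """Build a time-stamped archive address for a page (hypothetical helper)."""
    if captured_at.tzinfo is None:
        # Treat naive datetimes as UTC so the result does not depend on locale.
        captured_at = captured_at.replace(tzinfo=timezone.utc)
    stamp = captured_at.astimezone(timezone.utc).strftime(STAMP_FORMAT)
    return WAYBACK_PREFIX + stamp + "/" + original_url


def split_wayback_uri(uri):
    """Recover the original URL and the capture time from an archive address."""
    if not uri.startswith(WAYBACK_PREFIX):
        raise ValueError("not a Wayback-style address: " + uri)
    stamp, _, original_url = uri[len(WAYBACK_PREFIX):].partition("/")
    captured_at = datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)
    return original_url, captured_at


if __name__ == "__main__":
    when = datetime(2007, 12, 12, 12, 0, 0, tzinfo=timezone.utc)
    uri = wayback_uri("http://www.dancohen.org/", when)
    print(uri)
    print(split_wayback_uri(uri))
```

Because the archive address embeds the original URL, a saved item can keep both: the address where the scholar found the page, and a stable, dated copy that every collaborator can point to.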

Moreover, unlike most research software, the sophisticated annotation tools built into Zotero (the ability to highlight passages, add virtual Post-It notes, and attach regular notes to the overall document) maintain these annotations separately from the underlying document. This presents the exciting possibility of collaborative scholarly annotation of web pages.
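As a rough illustration of why keeping annotations apart from the document matters, consider the sketch below. It is not how Zotero actually stores annotations; the `Annotation` record, its fields, and the helper functions are all assumptions made up for this example. The point is only that notes keyed to a shared, stable address and to character offsets can be pooled from several scholars without touching the page itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A hypothetical stand-off annotation: it points at a document, never edits it."""
    document_uri: str  # stable address of the annotated copy
    start: int         # character offset where the highlight begins
    end: int           # character offset where the highlight ends (exclusive)
    author: str
    note: str


def group_by_document(annotations):
    """Pool annotations from many scholars, keyed by the document they refer to."""
    grouped = defaultdict(list)
    for annotation in annotations:
        grouped[annotation.document_uri].append(annotation)
    for items in grouped.values():
        items.sort(key=lambda a: (a.start, a.end))  # reading order
    return dict(grouped)


def highlighted_text(document_text, annotation):
    """Return the passage an annotation refers to; the document is left unchanged."""
    return document_text[annotation.start:annotation.end]


if __name__ == "__main__":
    uri = "https://web.archive.org/web/20071212120000/http://example.org/post"
    text = "Zotero and the Internet Archive join forces."
    notes = [
        Annotation(uri, 15, 31, "alice", "Which collections are in scope?"),
        Annotation(uri, 0, 6, "bob", "Client software"),
    ]
    for a in group_by_document(notes)[uri]:
        print(a.author, "->", highlighted_text(text, a), "|", a.note)
```

The offsets only line up if everyone is annotating the exact same copy, which is why a permanent address for the page matters so much here.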

4. Simplifying Collaborative Sharing

Groups of scholars also have the need to create more private “commons,” e.g., for documents that they would like to share in a limited way. In addition to the fully open Zotero Commons we will establish a mechanism for such restricted sharing. Via the Zotero Server, a user will be able to create a special collection with a distinct icon that shows up in the client interface (left column) for every member of the group.

Files added to these collections will be stored on the Internet Archive but will have restricted access. We believe that having these files reside on the IA server will encourage the donation of documents at the end of a collaborative project. The administrator of a shared collection will be able to move its contents into the fully open Zotero Commons via a single click in the administrative interface on the Zotero Server.

5. Facilitating Scholarly Discovery

The multiple libraries of content created by Zotero users and the multi-petabyte digital collections of the Internet Archive are resources that can potentially be of great use to the scholarly community. We believe that neither has yet experienced the level of exploration and usage that is possible through further development and collaboration.

The combined digital collections present opportunities for scholars to find primary research materials, to discover one another’s work, to identify materials that are already available in digital form and therefore do not need to be located and scanned, to find other scholars with similar interests and to share their own insights broadly. We plan to leverage the combined strengths of the Zotero project and the Internet Archive to work on better discovery tools.

Comments

Brett Bobley says:

Dan — This sounds like a wonderful project. Congratulations on this new alliance and I look forward to learning more about the next generation of Zotero.

Zotero is already a wonderful tool, and these added capabilities will help researchers a great deal.

What is the estimated timetable for releasing these features?

[…] Edit: Dan Cohen has posted on his blog: Zotero and the Internet Archive Join Forces. […]

Bruce D'Arcus says:

Nice work Dan!

A question, though: exactly how will these URIs work in the context of Zotero? It seems the IA tracks the URI for the document proper, and then uses that in the context of its own URIs for the time-stamped copies.

So am I to understand that Zotero will keep a) the URI as the identifier, and b) a link to the proper time-stamped version archived at the IA?

[…] is developing rapidly and keeps getting better. Through the Zotero Commons initiative, it looks as though a social file-sharing service will be developed for […]

Dan Cohen says:

@Bruce: yes, I believe we will keep the URI (which is where the user is on the web when they ask Zotero to save a copy) but then (if the user specifies) link to a permanent copy at IA (of course, as you note, the IA URI includes the original address). The user will have a choice on this; if a group wants to, say, collaboratively annotate a document, they obviously will all need to be able to point to a stable cache and address. There are some details that need to be worked out on this, but the idea is to enable better citations and collaborative annotations, which are impossible if scholars are pointing to different versions of a web page.

Bruce D'Arcus says:

OK, good Dan. That approach gives the best of both worlds probably: integration with the distributed web, and the certainty and stability that can come from centralized archives.

[…] Kind of an interesting application of the Zotero add-on for Firefox: the Center for History and New Media at George Mason University (no relation) and the Internet Archive are working to create storage for scholarly annotation of online documents. […]

[…] recently announced Zotero / InternetArchive partnership is exciting on a bunch of levels. The one that immediately […]

[…] Internet Archive with new scholarly project. Wednesday, 19 December 2007. The Internet Archive is going to work with Zotero, maker of the Firefox plugin, to get scholarly data into its database more efficiently. Read more … […]

[…] Zotero and the Internet Archive Join Forces December 19, 2007 The Zotero-IA alliance will create a “Zotero Commons” into which scholarly materials can be added simply via the Firefox plugin Zotero. Read more … […]

Matthew Treskon says:

It sounds like it could be a great resource, though I am a bit skeptical. Based on the stated workflow and infrastructure, it appears as if there is the potential for widespread copyright infringement. What are the Commons’ plans for preventing copyright infringement?

Erik Ringmar says:

This is very good and very amazing news. I’m doing research on European imperialism in China in the 19th century and here in Taiwan not all the material I need is readily available. But thanks to the Internet Archive I now have access to hundreds of amazing original titles. Thanks to Zotero they are all neatly organized, retrievable and searchable. It’s more than time saving and efficient, it’s funky and fun. It’s like the real era of research just has begun. Everything else before was just warming up exercises.

Erik Ringmar says:

Yes, I should have said: of course I’ll upload all the material I produce (including the book itself once it’s finished). Little by little we’ll liberate the sources from their dusty shelves.

[…] information also on Daniel Cohen’s blog: Zotero and the Internet Archive Join Forces. We recommend: Janusz Tazbir, Polska przedmurzem Europy. For readers of the “Historia i Media” site […]

This sounds like a scholar’s heaven. I wish them all good luck in finding and doing what they want and need to do. Thank you, lcl123

[…] you haven’t already heard, some exciting news (and two clarifications) from Dan Cohen: I’m pleased to announce a major alliance between the […]

Joe Raben says:

An article in The New York Times for December 23 (www.nytimes.com/2007/12/23/business/media/23steal.html) describes the problems being created in the movie industry by the shift from film to digital recording. Whereas film can be stored (apparently with long time limits) in climate-controlled caves, storing digital movies can cost about 12 times as much, and adding the associated materials (like annotated scripts) can add about 400 times the costs of storing the same materials when they are associated with a conventional film. Added to the problem of costs are those of degradation when digital movies are transferred to film for storage, of superseded (and therefore unavailable) playback devices, and rapid deterioration. This situation in a global industry raises questions about the smaller, but we hope equally important, industry of converting humanities materials to digital formats.

It is some time since we seem to have concerned ourselves about long-term storage, and that discussion seems to have centered on the viability of CDs. Since the major drive to digitize the contents of whole libraries, there has not (to my awareness) been a similar expression of concern about whether the digital versions will outlive the books they are replacing.

Does anyone have information about this serious problem? Is it being considered by the major corporations that are engaged in the digital library initiatives? Where does one learn more about this basic concern of computer-using humanists?

Hello Dan !
It’s a very good idea. In France, CNRS’s centers for digital humanities are also working toward this same goal. I think that OAI-ORE will be important for promoting interoperability between repositories of materials across all the sciences.

Best,

Stéphane

[…] made by Dan Cohen of the partnership between Zotero and the Internet Archive to create the Zotero Commons […]

[…] dans veille Dan Cohen’s Digital Humanities Blog » Blog Archive » Zotero and the Internet Archive Join Forces  Annotated a major alliance between the Zotero project at the Center for History and New […]

Jeff says:

I have used the Internet Archive extensively for a few years, as well as Zotero (albeit only more recently), and I must say that this partnership has amazing potential. The combined power of these research tools is awesome, and I will definitely be making use of this whenever I can.
P.S. reCAPTCHA is a nice project too – glad to see you’re using it… will the OCR scans from Zotero Commons be used for reCAPTCHA too, or will correcting documents be the prerogative of the user?

Dan Cohen says:

@Jeff: it’s conceivable that texts donated to the Zotero Commons will in turn end up in Open Library and thus be queued up for use in reCAPTCHA. But initially users will get uncorrected OCR from IA’s OCR servers.

Randy Fisher says:

Hi Dan,

While I am not a techie per se (but a budding scholar-practitioner) and a community-builder with WikiEducator (www.wikieducator.org), I would like to speak with you about what might be possible in terms of our many WikiEducators using Zotero as a resource and collaboratively sharing resources and references (as they create educational resources and materials), and of course about the potential for strategic collaboration on areas of mutual interest.

– Randy

[…] the effort to improve it. At the end of December, for example, the head of CHNM, Dan Cohen, announced on his blog an agreement between Zotero and the Internet Archive that is very promising. For all that work they have received, […]

[…] Dan Cohen: Zotero and the Internet Archive Join Forces Among other things: “The Zotero-IA alliance will create a “Zotero Commons” into which scholarly materials can be added simply via the Zotero client.” (tags: internetarchive onebiglibrary zotero) […]

[…] Zotero and the Internet Archive Join Forces (Dan Cohen) "a project to provide free and open source software and services for scholars joining together with the leading open library…" […]

[…] have been several posts lately about the much anticipated sharing feature of Zotero. The Zotero project and the Internet […]

Jane says:

Dan, Zotero is terrific. Rita Tehan and I have been giving training in the Congressional Research Service, and my work would be so much less efficient without Zotero.

About the Zotero/Internet Archive project: This would be a godsend for genealogists, if it could be made workable for them. I’ve been doing some organizing, digitizing, etc. of my grandfather’s files (and files and files) and am stymied by what to do with all these primary resources that would be so valuable to other genealogists, who would not be served by putting them into a library somewhere. I’ve also found that most resources are available only by paying through commercial genealogical “services” (not much but software), whereas many, many researchers are doing this as a hobby. If you ever want to chat about the problems and how Zotero might help, contact me. (I haven’t thought this through….)

Daniel says:

Hi Dan
I think Zotero is really great and enjoy using it.
Is the “Zotero Commons” operational already? If not when do you expect it to be ready?
Thanks

[…] can be found about this at Dan Cohen’s blog: I’m pleased to announce a major alliance between the Zotero project at the Center for History […]

[…] why I got very excited when I saw the news at Dan Cohen’s blog that Zotero would be teaming up with the Internet […]

[…] upfront for others to quickly grab. Beyond these two projects however, our plan for the Zotero Commons will facilitate exactly this kind of radical transparency for primary source material in historical […]

[…] is developing rapidly and keeps getting better. Through the Zotero Commons initiative, it looks as though a social file-sharing service will be developed for […]

[…] things (in a sense, Zotero as a client-side mashup platform) — specifically in the context of Zotero-Internet Archive alliance.  My work for Zotero will be a big part of what I’ll be discussing on this […]

[…] that we believe will keep it far ahead of any commercial alternatives, and that will begin to enable Zotero’s communication with the Internet Archive. OK, enough of the mea culpas. Let’s get back to the exciting […]

Dan, Zotero is already an amazing tool; it’s already helped me out loads in my line of work. Thank you so much.

[…] Cohen announces a collaboration between Zotero (a reference manager plugin for Firefox) and the Internet Archive (an archive of the Internet […]

[…] (like the Word plug-in or plug-ins for multimedia authoring or mashup creation, sharing via Internet Archive collaboration), […]

[…] I wholeheartedly encourage you to do so. Zotero has some big plans they are working on including a partnership with the Internet Archive. That’s the project that really intrigues me. Not only will scholars be able to add material […]

[…] we’re going to achieve what the visionaries in the Center for History and New Media call the Zotero Commons, a collective, networked repository of shareable, annotatable material that will facilitate […]

[…] for scholarly materials, as well as personal, restricted-access storage for scholars.” [ read more at Dan Cohen […]

[…] http://www.dancohen.org/2007/12/12/zotero-and-the-internet-archive-join-forces This entry was posted in Announcements, Technical. Bookmark the permalink. ← The Art of the Cat Snow → […]

[…] Dan Cohen, “Zotero and the Internet Archives Join Forces,” Dan Cohen Blog, Dec 12, 2007, http://www.dancohen.org/2007/12/12/zotero-and-the-internet-archive-join-forces/ […]
