Categories
History Research Software Tagging Zotero

Social and Semantic Computing for Historical Scholarship

Since many readers of this blog probably don’t receive the American Historical Association’s magazine Perspectives, you might be interested in this article I wrote for the May 2007 issue. In the piece I discuss the Zotero project’s connection to several recent trends in computing, and think ahead to what the Zotero server might mean for academic fields like history.

Categories
History Mathematics Religion Victorian

Equations from God

“On September 23, 1846, the Berlin astronomer Johann Gottfried Galle scanned the night sky with a telescope and found what he was looking for—the faint light of the planet Neptune. Excitement about the discovery of an eighth planet quickly spread across Europe and America, generating a wave of effusive front-page headlines…Neptune was the first heavenly body found by mathematical prediction. Without peering into the sky at all, two mathematicians independently calculated the location of the planet through geometrical analysis and the laws of gravitation [after noticing] Uranus’s orbital irregularities [and told Galle where to look]…This remarkable aspect of the discovery of Neptune was not lost upon contemporaries. To many it signaled a new era of human knowledge [in which mathematicians were] potent sorcerers who conjured and commanded the supreme realm of Truth.” So begins Equations from God, my new book. I’ve been careful on this blog to stay on topic, i.e., only discuss digital matters, but as many of you know I also do work that is very much analog. And since one only comes out with a book once in a while, I’m taking the liberty of using the blog today as a platform to tell you why you might want to pick up a copy of Equations from God and read it.

Beginning with Plato and ending on the eve of the twentieth century, Equations from God tells the story of how and why so many Europeans and Americans came to see mathematics as a divine language, a way to ascend above the petty differences of mankind and commune with the mind of the Deity. Although it focuses on an ostensibly technical topic, it is written in a plainspoken way that makes the world of the mathematician accessible to a general audience, and it contextualizes that world within the religious, social, and political upheaval of the Victorian era. And it reveals surprising ideas from many unpublished works such as diaries, notebooks, sermons, and letters—ideas that remain remarkably relevant in today’s world. I think it also provides a good introduction to the intellectual and cultural debates and tensions of the nineteenth century.

Readers of this blog will likely find chapters and sections of interest, such as…

…the tale of George Boole, the brilliant, meek creator of the logic that runs our computers and our searches, who left England in his early thirties to teach mathematics in Ireland, only to find himself under siege during the Great Famine and the outbreak of Irish nationalism…

…the life of the greatest American mathematician of the nineteenth century, the pompous and cantankerous Harvard professor Benjamin Peirce, who refused to teach students who were insufficiently smart and who would end his math classes by exclaiming, “Gentlemen, there must be a God!”…

…the strange world of circle-squarers—amateur mathematicians who believed that pi was not what professional mathematicians said it was, and thought they had found its true value through mystical means…

…rare books such as The Lady’s Diary, a combination of astronomical knowledge, riddles, and math problems “designed for the use and diversion of the fair sex”…

…and much more. The book has been out for a few weeks now and so should be on bookshelves near you. While the list price is more for the “academic market,” as university presses like to call it (i.e., it’s listed at $50), through the power of the Internet you can find it for much less by looking at PriceGrabber or your favorite comparison site, by going straight to A1 Books ($30) or Barnes and Noble ($40 or $36 for members), or by getting it directly from The Johns Hopkins University Press ($40 with this special discount from the author).

Categories
History

Readings for a Field in Digital History

An incredibly helpful list from Bill Turkel of nearly a hundred books that either directly or indirectly address issues central to the study of digital history.

Categories
Google History Maps Mashups

Mapping Recent History

As the saying goes, imitation is the sincerest form of flattery. So at the Center for History and New Media, we’re currently feeling extremely flattered that our initiatives in collecting and presenting recent history—the Echo Project (covering the history of science, technology, and industry), the September 11 Digital Archive, and the Hurricane Digital Memory Bank—are being imitated by people using a wave of new websites that help them locate recollections, images, and other digital objects on a map. Here’s an example from the mapping site Platial:

And a similar map from our 9/11 project:

Of course, we’re delighted to have imitators (and indeed, in turn we have imitated others), since we are trying to disseminate as widely as possible methods for saving the digital record of the present for future generations. It’s great to see new sites like Platial, CommunityWalk, and Wayfaring providing easy-to-use, collaborative maps that scattered groups of people can use to store photos, memories, and other artifacts.

Categories
Academia History Information Theory Technology Text Mining

No Computer Left Behind

In this week’s issue of the Chronicle of Higher Education Roy Rosenzweig and I elaborate on the implications of my H-Bot software, and of similar data-mining services and the web in general. “No Computer Left Behind” (cover story in the Chronicle Review; alas, subscription required, though here’s a copy at CHNM) is somewhat more polemical than our recent article in First Monday (“Web of Lies? Historical Knowledge on the Internet”). In short, we argue that just as the calculator—an unavoidable modern technology—muscled its way into the mathematics exam room, devices to access and quickly scan the vast store of historical knowledge on the Internet (such as PDAs and smart phones) will inevitably disrupt the testing—and thus instruction—of humanities subjects. As the editors of the Chronicle put it in their headline: “The multiple-choice test is on its deathbed.” This development is to be praised; just as the teaching of mathematics should be about higher principles rather than the rote memorization of multiplication tables, the teaching of subjects like history should be freed by new technologies to focus once again (as it was before a century of multiple-choice exams) on more important principles such as the analysis and synthesis of primary sources. Here are some excerpts from the article.

“What if students will have in their pockets a device that can rapidly and accurately answer, say, multiple-choice questions about history? Would teachers start to face a revolt from (already restive) students, who would wonder why they were being tested on their ability to answer something that they could quickly find out about on that magical device?

“It turns out that most students already have such a device in their pockets, and to them it’s less magical than mundane. It’s called a cellphone. That pocket communicator is rapidly becoming a portal to other simultaneously remarkable and commonplace modern technologies that, at least in our field of history, will enable the devices to answer, with a surprisingly high degree of accuracy, the kinds of multiple-choice questions used in thousands of high-school and college history classes, as well as a good portion of the standardized tests that are used to assess whether the schools are properly “educating” our students. Those technological developments are likely to bring the multiple-choice test to the brink of obsolescence, mounting a substantial challenge to the presentation of history—and other disciplines—as a set of facts or one-sentence interpretations and to the rote learning that inevitably goes along with such an approach…

“At the same time that the Web’s openness allows anyone access, it also allows any machine connected to it to scan those billions of documents, which leads to the second development that puts multiple-choice tests in peril: the means to process and manipulate the Web to produce meaningful information or answer questions. Computer scientists have long dreamed of an adequately large corpus of text to subject to a variety of algorithms that could reveal underlying meaning and linkages. They now have that corpus, more than large enough to perform remarkable new feats through information theory.

“For instance, Google researchers have demonstrated (but not yet released to the general public) a powerful method for creating ‘good enough’ translations—not by understanding the grammar of each passage, but by rapidly scanning and comparing similar phrases on countless electronic documents in the original and second languages. Given large enough volumes of words in a variety of languages, machine processing can find parallel phrases and reduce any document into a series of word swaps. Where once it seemed necessary to have a human being aid in a computer’s translating skills, or to teach that machine the basics of language, swift algorithms functioning on unimaginably large amounts of text suffice. Are such new computer translations as good as a skilled, bilingual human being? Of course not. Are they good enough to get the gist of a text? Absolutely. So good the National Security Agency and the Central Intelligence Agency increasingly rely on that kind of technology to scan, sort, and mine gargantuan amounts of text and communications (whether or not the rest of us like it).

“As it turns out, ‘good enough’ is precisely what multiple-choice exams are all about. Easy, mechanical grading is made possible by restricting possible answers, akin to a translator’s receiving four possible translations for a sentence. Not only would those four possibilities make the work of the translator much easier, but a smart translator—even one with a novice understanding of the translated language—could home in on the correct answer by recognizing awkward (or proper) sounding pieces in each possible answer. By restricting the answers to certain possibilities, multiple-choice questions provide a circumscribed realm of information, where subtle clues in both the question and the few answers allow shrewd test takers to make helpful associations and rule out certain answers (for decades, test-preparation companies like Kaplan Inc. have made a good living teaching students that trick). The ‘gaming’ of a question can occur even when the test taker doesn’t know the correct answer and is not entirely familiar with the subject matter…

“By the time today’s elementary-school students enter college, it will probably seem as odd to them to be forbidden to use digital devices like cellphones, connected to an Internet service like H-Bot, to find out when Nelson Mandela was born as it would be to tell students now that they can’t use a calculator to do the routine arithmetic in an algebra equation. By providing much more than just an open-ended question, multiple-choice tests give students—and, perhaps more important in the future, their digital assistants—more than enough information to retrieve even a fairly sophisticated answer from the Web. The genie will be out of the bottle, and we will have to start thinking of more meaningful ways to assess historical knowledge or ‘ignorance.’”

Categories
CHNM Conferences and Workshops Copyright History Preservation Web Design

Doing Digital History June 2006 Workshop

If your work deals in some way with the history of science, technology, or industry, and you would like to learn how to create online history projects, the Echo Project at the Center for History and New Media is running another one of our free, week-long workshops. The workshop covers the theory and practice of digital history; the ways that digital technologies can facilitate the research, teaching, writing and presentation of history; genres of online history; website infrastructure and design; document digitization; the process of identifying and building online history audiences; and issues of copyright and preservation.

As one of the teachers for this workshop, I can say somewhat immodestly that it’s really a great way to get up to speed on the many (sometimes complicated) elements necessary for website development. Unfortunately space is limited, so be sure to apply online by March 10, 2006. The workshop will take place from June 12-16, 2006, at George Mason University’s Arlington campus, right outside of Washington, DC. It is co-sponsored by the American Historical Association and the National History Center, and funded by the Alfred P. Sloan Foundation. There is no registration fee, and a limited number of fellowships are available to defray the costs of travel and lodging for graduate students and young scholars. Hope to see you there!

Categories
Archives History Preservation

Digital History on Focus 580

From the shameless plug dept.: If you missed Roy Rosenzweig’s and my appearance on the Kojo Nnamdi Show, I’ll be on Focus 580 this Friday, February 3, 2006, at 11 AM ET/10 AM CT on the Illinois NPR station WILL. (If you don’t live in the listening area for WILL, their website also has a live stream of the audio.) I’ll be discussing Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web and answering questions from the audience. If you’re reading this message after February 3, you can download the MP3 file of the show.

Categories
Google History Search Text Mining

10 Most Popular History Syllabi

My Syllabus Finder search engine has been in use for three years now, and I thought it would be interesting to look back at the nearly half-million searches and 640,000 syllabi it has handled to see which syllabi have been the most popular. The following list was compiled by running a series of calculations on three measures: the number of times Syllabus Finder users glanced at a syllabus (had it turn up in a search), the number of times they read a syllabus (actually went from the Syllabus Finder website to the website of the syllabus to do further reading), and the “attractiveness” of a syllabus (defined as the ratio of full reads to mere glances). Here are the most popular history syllabi on the web.

#1 – U.S. History to 1870 (Eric Mayer, Victor Valley College, total of 6104 points)

#2 – America in the Progressive Era (Robert Bannister, Swarthmore College, 6000 points)

#3 – The American Colonies (Bruce Dorsey, Swarthmore College, 5589 points)

#4 – The American Civil War (Sheila Culbert, Dartmouth College, 5521 points)

#5 – Early Modern Europe (Andrew Plaa, Columbia University, 5485 points)

#6 – The United States since 1945 (Robert Griffith, American University, 5109 points)

#7 – American Political and Social History II (Robert Dykstra, University at Albany, State University of New York, 5048 points)

#8 – The World Since 1500 (Sarah Watts, Wake Forest University, 4760 points)

#9 – The Military and War in America (Nicholas Pappas, Sam Houston State University, 4740 points)

#10 – World Civilization I (Jim Jones, West Chester University of Pennsylvania, 4636 points)

This is, of course, a completely unscientific study. It obviously gives an advantage to older syllabi, since those courses have been online longer and thus could show up in search results for several years. On the other hand, the ten syllabi listed here range almost uniformly from 1998 to 2005.

Whatever its faults, the study does provide a good sense of the most visible and viewed syllabi on the web (high Google rankings help these syllabi get into a lot of Syllabus Finder search results), and I hope it provides a sense of the kinds of syllabi people frequently want to consult (or crib)—mostly introductory courses in American history. The variety of institutions represented is also notable (and holds true beyond the top ten; no domination by, e.g., Ivy League schools). I’ll probably do some more sophisticated analyses when I have the time; if there’s interest from this blog’s audience I’ll calculate the most popular history syllabi from 2005 courses, or the top ten for other topics. If you would like to read a far more elaborate (and scientific) data-mining study I did using the Syllabus Finder, please take a look at “By the Book: Assessing the Place of Textbooks in U.S. Survey Courses.”

[How the rankings were determined: 1 point was awarded for each time a syllabus showed up in a Syllabus Finder search result; 10 points were awarded for each time a Syllabus Finder user clicked through to view the entire syllabus; 100 points were awarded for each percent of “attractiveness,” where 100% attractive meant that every time a syllabus made an appearance in a search result it was clicked on for further information. For instance, the top syllabus appeared in 1211 searches and was clicked on 268 times (22.13% of the searches), for a point total of 1211 + (268 X 10) + (22.13 X 100) = 6104.]
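For anyone who wants to check the arithmetic, here is a minimal sketch in Python of the scoring described in the note above. The function names are made up for illustration (this is not the Syllabus Finder's actual code), and the sketch assumes the attractiveness percentage is left unrounded until the final total is rounded to a whole number of points.

```python
def attractiveness_percent(appearances: int, clicks: int) -> float:
    """Percent of search-result appearances that led to a click-through."""
    if appearances <= 0:
        raise ValueError("a syllabus must appear in at least one search")
    return 100.0 * clicks / appearances


def syllabus_points(appearances: int, clicks: int) -> int:
    """Total points under the three weights given in the note.

    1 point per appearance in a search result,
    10 points per click-through to the full syllabus,
    100 points per percent of attractiveness.
    Rounding the total to a whole number is an assumption of this sketch.
    """
    pct = attractiveness_percent(appearances, clicks)
    return round(appearances + 10 * clicks + 100 * pct)


if __name__ == "__main__":
    # The top syllabus from the list: 1211 appearances, 268 click-throughs.
    print(syllabus_points(1211, 268))
```

Because each percent of attractiveness is worth 100 points, a syllabus that is clicked on often relative to its appearances can outrank one that merely shows up in many searches.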

Categories
Archives History News Preservation Web

Kojo Nnamdi Show Questions

Roy Rosenzweig and I had a terrific time on The Kojo Nnamdi Show today. If you missed the radio broadcast you can listen to it online on the WAMU website. There were a number of interesting calls from the audience, and we promised several callers that we would answer a couple of questions off the air; here they are.

Barbara from Potomac, MD asks, “I’m wondering whether new products that claim to help compress and organize data (I think one is called ‘C-Gate’ [Kathy, an alert reader of this blog, has pointed out that Barbara probably means the giant disk drive company Seagate]) help out [to solve the problem of storing digital data for the long run]? The ads claim that you can store all sorts of data—from PowerPoint presentations and music to digital files—in a two-ounce standalone disk or other device.”

As we say in the book, we’re skeptical of using rare and/or proprietary formats to store digital materials for the long run. Despite the claims of many companies about new and novel storage devices, it’s unclear whether these specialized devices will be accessible in ten or a hundred years. We recommend sticking with common, popular formats and devices (at this point, probably standard hard drives and CD- or DVD-ROMs) if you want to have the best odds of preserving your materials for the long run. The National Institute of Standards and Technology (NIST) provides a good summary of how to store optical media such as CDs and DVDs for long periods of time.

Several callers asked where they could go if they have materials on old media, such as reel-to-reel or 8-track tapes, that they want to convert to a digital format.

You can easily find online some of the companies we mentioned that will (for a fee) transfer your own media files onto new devices. Google for the media you have (e.g., “8-track tape”) along with the words “conversion services” or “transfer services.” I probably overestimated the cost for these services; most conversions will cost less than $100 per tape. However, the older the media the more expensive it will be. I’ll continue to look into places in the Washington area that might provide these services for free, such as libraries and archives.

Categories
Archives History Preservation Web

Digital History on The Kojo Nnamdi Show

From the shameless plug dept.: Roy Rosenzweig and I will be discussing our book Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web this Tuesday, January 10, on The Kojo Nnamdi Show. The show is produced at Washington’s NPR station, WAMU. We’re on live from noon to 1 PM EST, and you’ll be able to ask us questions by phone (1-800-433-8850), via email (kojo@wamu.org), or through the web. The show will be replayed from 8-9 PM EST on Tuesday night, and syndicated via iTunes and other outlets as part of NPR’s terrific podcast series (look for The Kojo Nnamdi Show/Tech Tuesday). You’ll also be able to get the audio stream directly from the show’s website. I’ll probably answer some additional questions from the audience in this space.