Category: Stats

Measuring the Audience of a Digital Humanities Project

Karen Motylewski of the Institute of Museum and Library Services recently pressed an audience of new IMLS grantees to think about how they might measure the success of their digital projects. As she was well aware, academics often bristle at the quantitative measurement of the audience for their websites because it smacks of commercialism. Also, we professors and librarians and curators generally avoid taking classes in such base topics as marketing. But Karen has a point. Indeed, Roy Rosenzweig and I devote an entire chapter in Digital History to how to build an audience, not for commercial or narcissistic reasons, but because an academic digital project should be, as we say, “useful and used.”

I started this blog to explain in greater depth some of the projects and research I’m working on in the digital humanities, but I also did it (as readers of my five-part series on “Creating a Blog from Scratch” will know) to learn first-hand about the composition of blogs and the technologies behind them. Writing my own code for this blog forced me to examine in detail, and occasionally rethink, some blogging conventions (technical, design, and content). One of the benefits of doing so has been the realization that I have significantly underestimated the power of RSS. I now think it may be the best measurement of utility for an academic website, far better than server logs or other quantitative measurements. Let me explain why.

Think of your reading habits, specifically periodicals. You probably subscribe to a newspaper, a magazine or two (or three), and perhaps some academic or specialist journals. Every time you go to the dentist, you also probably voraciously read all of those salacious magazines and lifestyle handbooks you don’t subscribe to. If you’re in a particularly bad waiting room, you probably read anything that’s lying around, even if you would never buy those magazines at a newsstand. As anyone in the magazine or newspaper business will tell you, what they really want is subscribers, not casual, one-time readers. Subscribers have shown a level of interest in, and dedication to, a periodical that is several levels above all other readers.

Now look carefully at web server logs—the trail of a website’s readers. Most visitors to a typical website are like the third type of magazine reader—simply passing through on the way to get their cavities filled. They generally come from search engines, quickly scan a page, and leave, their IP address never to be seen again.

Moreover, up to three-quarters of traffic to most websites is from bots (e.g., Google’s indexing spider), a machine audience that you probably care little about, except as a way to drive traffic to your site from search requests. On this site in March 2006, the human audience looked at about 10,000 pages; machines requested over 26,000 pages. This doesn’t even take into account “server spam,” which consists of fake requests to your server to make it look like another website is sending a lot of traffic your way. In March, superhott.com was the number one “referrer” to this blog. Great.
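The March 2006 figures above work out to roughly 72% machine traffic, in line with the "up to three-quarters" estimate. Here is a minimal sketch of how one might separate the two audiences, assuming the User-Agent strings have already been pulled from a server log. The `BOT_MARKERS` list and both helper functions are illustrative assumptions, not the method used for this site, and bots do not always identify themselves this way.

```python
# Hypothetical marker substrings; a real list would be much longer.
BOT_MARKERS = ("bot", "spider", "crawler", "slurp")


def is_bot(user_agent: str) -> bool:
    """Guess whether a request came from a machine, by User-Agent substring."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def bot_share(human_pages: int, bot_pages: int) -> float:
    """Fraction of all page requests that came from machines."""
    total = human_pages + bot_pages
    return bot_pages / total if total else 0.0


if __name__ == "__main__":
    # Page counts quoted in the post for March 2006.
    print(f"{bot_share(10_000, 26_000):.0%}")  # 72%
```

A substring check like this will miss bots that pose as ordinary browsers, so the resulting human count is best read as an upper bound.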

So now we are down to about 10% of the top-line number of “visitors” to your website. You are likely getting depressed. But here’s where another point Roy and I make comes into play: “think about community, not numbers of visitors.” That remaining 10% includes a number of people who love your site and what it has to offer but only visit every once in a while.

Then there are the subscribers. RSS truly provides an online analog to periodical subscriptions; “subscriptions” is a very good word for it since subscribers receive each update automatically. RSS finally allows digital humanities projects to assess how many people are really committed to a site. Notably, this number may or may not follow overall site traffic patterns. For instance, here’s a comparison of server logs for this site with RSS subscriptions:

In the noise of all of the bot traffic and uninterested visitors (top chart; the orange bar represents unique visitors, the dark blue is page views), I’m grateful that subscriptions to this blog (bottom chart) have climbed steadily since its inception four months ago. Should this blog have the enormous traffic of a BoingBoing? No. That’s not why I started it. I’m trying to reach a fairly specific audience that is several orders of magnitude smaller than the big tech/geek audience for BoingBoing. Success means reaching and having a conversation with those people (the people who I believe are doing critical work for the future of education, libraries, and the humanities), not with a mass audience. I hope this site is slowly creeping toward that modest goal. By tracking RSS subscriptions, other digital humanities projects can also see if they’re reaching their envisioned audience.
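One rough way to estimate subscriptions from your own logs is to look only at requests for the feed itself. The sketch below assumes you already have (IP address, User-Agent) pairs for feed requests. It relies on the convention, followed by some web-based aggregators, of reporting a count such as "12 subscribers" in the User-Agent string, and it counts every other distinct IP address as one subscriber. The function name and the pattern are hypothetical, and the result is only an approximation.

```python
import re

# Some aggregators report how many readers they fetch the feed on behalf of,
# e.g. "... 12 subscribers ...". This pattern is an assumption about that format.
SUBSCRIBERS_RE = re.compile(r"(\d+)\s+subscribers?", re.IGNORECASE)


def estimate_subscribers(feed_requests):
    """Estimate feed subscribers from (ip, user_agent) pairs.

    Aggregators that declare a count contribute their largest declared count,
    keyed by the User-Agent text with the count stripped out; every other
    distinct IP address is counted as a single subscriber.
    """
    declared = {}           # aggregator name -> largest declared count
    individual_ips = set()  # readers that fetch the feed directly
    for ip, user_agent in feed_requests:
        match = SUBSCRIBERS_RE.search(user_agent)
        if match:
            name = SUBSCRIBERS_RE.sub("", user_agent)
            declared[name] = max(declared.get(name, 0), int(match.group(1)))
        else:
            individual_ips.add(ip)
    return sum(declared.values()) + len(individual_ips)
```

Counting by IP address undercounts readers behind a shared address and overcounts readers who move between networks, so the trend over time is more meaningful than any single month's number.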

But how do you use RSS if your site isn’t a blog? If your site is a digital collection or archive, you can add a “news about this site” or “new features/new additions” RSS feed, as we have done for the Hurricane Digital Memory Bank. If your project involves software development, you can put code update announcements into an RSS feed. Even if your site is relatively static, new services such as watchthiswebsite.com will send out notifications of site changes to interested parties. Once you have an RSS feed (you should link to it from your home page so that RSS-aware browsers can find it quickly), you can then use services such as Feedburner to track RSS subscriptions more carefully.
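For a site that is not a blog, a "new additions" feed can be quite small. The sketch below builds a minimal RSS 2.0 document with Python's standard library, along with the `<link rel="alternate">` tag that lets RSS-aware browsers discover the feed from the home page. The function names, site name, and URLs are placeholders chosen for illustration; this is not the actual Hurricane Digital Memory Bank feed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime
from html import escape


def build_feed(title, site_url, description, items):
    """Return an RSS 2.0 document as a string.

    `items` is a list of (title, link, published_datetime) tuples.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = description
    for item_title, item_link, published in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
        # RSS 2.0 expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(published)
    return ET.tostring(rss, encoding="unicode")


def autodiscovery_tag(feed_url, title):
    """HTML <link> element to place in the home page's <head>."""
    return (
        '<link rel="alternate" type="application/rss+xml" '
        f'title="{escape(title, quote=True)}" '
        f'href="{escape(feed_url, quote=True)}">'
    )


if __name__ == "__main__":
    feed = build_feed(
        "Example Archive: New Additions",   # placeholder site name
        "http://example.org/",              # placeholder URL
        "Recent additions to the collection",
        [("Fifty new photographs", "http://example.org/items/1",
          datetime(2006, 4, 1, 12, 0, tzinfo=timezone.utc))],
    )
    print(feed)
    print(autodiscovery_tag("http://example.org/feed.xml", "New Additions"))
```

Once a feed like this is published at a stable URL, that URL is what you would hand to a tracking service such as Feedburner.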

Given all of the server log’s faults and problems, I suspect we will soon be saying, “The server log is dead.” Long live RSS.

The Final Four’s Impact on Websites

I work at George Mason University. Unless you live off the grid (and if so, how are you reading this?), you’ve probably heard that our basketball team is in the Final Four this weekend. There has been a great deal of talk around campus about the impact this astonishing feat will have on the university’s stature and undergraduate admissions. But what about its effect on Mason’s websites? A bit of unscientific evidence from Alexaholic, which creates website traffic graphs using data from Amazon.com’s Alexa web service:

Our domain has gone from being about the 5,300th most popular on the web to about the 2,100th since Mason was selected (controversially) for the tournament on March 12. OK, we’re not exactly in Yahoo territory, but we’ve passed dozens of other universities in our steep two-week climb.