Webbys, Google, and The Ultimate Computer

First of all, a major round of congratulations to FactCheck.org which has been nominated for a 2007 Webby Award (in two categories)! [APPLAUSE] They’ve made it to the finals, in fact, and the RS wishes them luck in nabbing that sucker. For what my opinion is worth (exactly what you have paid for it), I’ve been in love with this organization since they first appeared. They truly are impartial in the dirt they uncover both from left and right wings of the political spectrum, and their analysis is consistently thorough and well-researched. For that alone they deserve to win. If you feel compelled to help them out a bit towards said winning, go vote for them here. If not, well, I have another question for you.

Clearly, Google now believes that it can catalog books for the Library of Congress. Well, maybe it can. That’s unfair–of course they can. Should they, though? That’s a different question.

The ALA’s announcement fails to truly inspire faith in the future of cataloging, but here is the bit that I found the least inspiring:

Under the new arrangement, MARC records for titles from Google Book
Search publishing partners will be created by the Google indexing
system. The original cataloging of these works will be accomplished
automatically by a software program.

“Think of it as an electronic brain,” said Richard Sumner, a Google
representative speaking about the computing equipment involved in the
new alliance. “Our artificial intelligence systems can fully handle
descriptive and subject cataloging.”

A senior administration official at the Library of Congress, speaking
on condition of anonymity, said that the pilot program has the
potential to expand to the point of eliminating the need for any
professional catalogers. The source also mentioned plans to migrate
the OPAC to LibraryThing and turn the American Memory site into a Wiki.

I officially started gasping for air at the phrase "The original cataloging of these works will be accomplished automatically by a software program." I’m really not sure which thought is worse: the idea that cataloging in the United States’ primary source library may eventually be reduced to assembly line work in California (or abroad), or that "a software program" will be doing it. Which software program? Who developed it? (I mean besides "a famous software engineer.") Does it work in real life as opposed to the lab? If so, what’s it called and what does it do? Did the ALA even ask these questions? If not, then why not? These things matter. If this application is being used elsewhere, can the ALA tell us where and what kind of track record it has? If it’s proprietary Google knowledge, why can’t they say that instead? "A software program" can mean anything up to and including M-5 from Star Trek. ("They attacked this unit. This unit must survive.")

I’m probably sounding paranoid. I am paranoid. I don’t like the fact that a humongous black box with the word GOOGLE stenciled on the side is deciding on its own how to route search engine metadata from both ends of the MARC food chain. I wasn’t too thrilled with Sumner’s suggestion that we "think of it as an electronic brain" because Google’s “artificial intelligence systems can fully handle descriptive and subject cataloging.” No doubt they can, but how are they going to do it and how will it affect search strategies within the LOC collection? You know me. I love computers. I work with computers every day and even I don’t like to be told that I should just sit back and let the M-5 do all the work.

The bit about potentially "eliminating the need for any professional catalogers" doesn’t sit well for a couple of reasons. Part of me is glad that the "senior administration official" (a.k.a., "A Famous Librarian") didn’t say that it could potentially eliminate the need for human catalogers. Just the professional catalogers, suggesting that any spot checking to be done would be by paraprofessionals, or maybe they’d do away with the spot checks altogether seeing as how the M-5 will not make a mistake. In other words, "You can trust us, we’re Google!" Yeah. Of course. Something else that springs to mind is the possibility that since Google is feeding books into this thing and taking notes as it squirts MARC records out its ass, Google would likely be limiting the LOC controlled vocabulary to something its black box could deal with. For that matter they might just as easily substitute their own subject headings ("GSH"). There’s no rule that says they can’t, unless the LOC actually stipulates in its contract with Google that it make use of the full LCSH, AACR2, and so on. Even if the stipulation is in there, who’s going to enforce it? The Halliburton administration? How about the "High Tech at Any Price" Democrats? Call me a cynic but somehow I doubt it.

I’m not going to dwell on the idea of American Memory being Wikified into oblivion–I’m having a hard enough time breathing as it is. Migrating the LOC’s OPAC to LibraryThing could possibly be worthwhile as long as Google’s cataloging standards overlap with LCSH and such so that the OPAC has something to point to. But see the previous paragraph for the problems with that little matter of controlled vocabulary quality.

The worst part of this was the realization that none of this really surprised me. This is really just killing two birds with one stone: Bird One is where you privatize a government function by handing its chief preservation arm to a gigantic for-profit mega-corporation, run by brilliant (if not entirely ethical) PhDs who fervently believe that computers are the perfect repository of all knowledge. Bird Two is where you allow (or maybe encourage) that mega-corporation to use a secret patented process to electronically describe every single book in the collection according to its own private standards which may overlap with the standards of Libraryland, or not. With each day that passes the flow of free information gets a little narrower and nobody is the wiser.

Until, of course, that day in the far future when all the librarians get together and realize that nobody has managed to locate a single book on any topic within the walls of the Library of Congress in years. But long before that day arrives, it will be too late for anyone to do much about it.
