Maelstrom Over Metadata

November 14, 2008

A debate is simmering in the undercurrents of the academic Web, pitting those who defend libraries' core mission of open access against the membership organization that builds and operates a massive online catalog on which many of them rely.

Early this month, the OCLC (for Online Computer Library Center) announced the first significant change in its policies governing how libraries use and share bibliographic records since 1987 -- years before the World Wide Web existed. Some of those rules were considered overly vague or out of touch, representing an era before Google searches and online catalogs transformed the way students and researchers use library databases.

A major part of libraries' evolution since then has been a demand for more openness and the ability to search for materials that might exist at any number of institutions worldwide, driven by the ubiquity of search engines and an increasing commitment to digitizing texts. But those trends place them on a collision course with OCLC, which was originally founded by libraries to collect and store records of their holdings so that they wouldn't have to be created anew with each acquisition.

That partnership has since grown into a large, member-supported organization that owns WorldCat, a database of tens of millions of online records that members can use. It relieves individual libraries of laboriously typing up so-called metadata -- information about individual holdings such as the title, author and publisher, and plenty more -- and, in the process, standardizes the catalogs they use for their books and reference works.

In an attempt to protect WorldCat and the resources needed to keep it running, while making it sufficiently accessible to its members, OCLC announced a policy change that would have placed a notice in each record to the effect that it is governed by the WorldCat terms contained in an accompanying Web address -- terms that could presumably change over time. Libraries would also be encouraged to add the text to a specific field within each of their own records that originated from WorldCat.

Some bloggers interpreted the change as a power grab, an attempt to block libraries from using records for purposes that could conflict with OCLC's goals. For example, some libraries are considering using their records to generate revenue to support their own growing operations, and that could fall into OCLC's "commercial use" prohibition. Print-on-demand services, which make use of WorldCat records, could be affected; so could planned "discovery" interfaces that span dozens of libraries.

"From [OCLC's] perspective, it makes a lot of sense for them to want to assert and be overt about their rights to this material because WorldCat really represents ... a major portion of their revenue, and it also ... supports essential services for libraries such as resources sharing," said Anne R. Kenney, the Carl A. Kroch University Librarian at Cornell University.

Debates over who owns the rights to the records -- and whether it's possible to copyright them at all -- aren't new and have led to open-source alternatives, such as OpenLibrary, whose database can be updated by contributors and is free and available for any purpose.

In any case, the initial reaction to the policy change was swift, complete with an online petition.

"Not satisfied with controlling the world's largest source of book information, it wants to take over all the smaller ones as well," wrote Aaron Swartz, one of the founders of OpenLibrary and a widely read Internet thinker, on his blog Thursday. "It's now demanding that every library that uses WorldCat give the copyright to all its catalog records to OCLC. It literally is asking libraries to put an OCLC copyright notice on every book record in their catalog. It wants to own every library."

By the time news of the policy, which is to take effect in February, had spread across the blogosphere, OCLC had posted a new draft softening some of its requirements -- for example, by making it optional to use or keep the text referring to WorldCat's policies and clarifying that non-commercial use of the records was generally protected, except in cases where it could interfere with OCLC's mission. And while the shift signals some openness to members' concerns, some still aren't satisfied, especially with the way the initial decision was made.

"They saved individual libraries a lot of money over time by collective resource sharing and cataloging. But having said that, I think part of the problem is that in their need to assert their rights, they did not broadly consult ahead of time of the release of this policy," Kenney said.

Terry Reese, the Gray Chair for Innovative Library Services at Oregon State University Libraries, said in an e-mail that it is partially a philosophical issue: "At its core, libraries have always been about providing access to our information and our metadata. We don't make value judgments as to why people may want/need to use our materials -- but that's essentially what OCLC is doing now (whether intentional or not)."

He continued, "As OCLC is oft to bring up, WorldCat is a member created resource -- yet, OCLC seems to be the only organization that is allowed to have unfettered access to that data. There are many ways to protect the membership's investment in the data that has been created."

But for OCLC, the issue is one of adapting to a Google-oriented world without sacrificing the value of WorldCat. In a blog post acknowledging the criticism, Karen Calhoun, OCLC's vice president for WorldCat and metadata services, wrote: "To play the role it is now playing on behalf of libraries, OCLC needs to be a player on the Web, and not just any player, but an influential one. It therefore needs to be a Web company, with data sharing policies and practices appropriate to the Web."

Comments on Maelstrom Over Metadata

  • Open Library & a petition
  • Posted by Michael on November 14, 2008 at 7:10am EST
  • Readers may want to continue here: Aaron Swartz gives a short overview of the genesis of OCLC and introduces Open Library, an alternative, open-source initiative. At the bottom of the page is a link to a petition and a link to the Open Library project.
    http://www.aaronsw.com/weblog/oclcscam

  • More info please
  • Posted by Robert Matz on November 14, 2008 at 7:55am EST
  • I went to the site that Michael suggested and read this: "It's not just Open Library that's at risk here -- LibraryThing, Zotero, even some new Wikipedia features being developed are threatened." I find this chilling, but would love to hear more discussion of it.

    I would say that if it is true that every record in a library will get an OCLC terms of use statement, this will be sad, especially for our students. Another non-proprietary space bites the dust.

  • Posted by Betty Drees Johnson on November 14, 2008 at 11:25am EST
  • It should be emphasized that OCLC did not create these records. The original data came from Library of Congress records, created at public expense, and the rest from member libraries and now other national libraries. OCLC's contribution (paid for with our membership fees and per/use charges) was to organize these records, index them, and make them accessible (again for an additional fee per record) to libraries. OCLC is now claiming copyright for the records which libraries from around the world created and housed at OCLC.

  • Copyrightable?
  • Posted by Jack on November 14, 2008 at 12:00pm EST
  • Are catalogue records really copyrightable?

  • OCLC
  • Posted by Pam, Acquisitions and Serials Librarian on November 14, 2008 at 1:50pm EST
  • I remember it well--when there was no OCLC. I worked for Florida Com-Cat in Orlando, FL at the very beginning of OCLC, putting in records from major Florida library systems. I assume that they had other such operations in each state. We had microfilmed shelf list catalog cards that we looked at and searched for, and if there was no record in the database in Dublin, Ohio, that matched, we created one from the shelf list card. So people tell me--Wow that explains a lot! After our libraries contributed to their database, it does seem pretty nervy to think of the database as belonging to them. It was a collaborative effort from all of the states. They can only copyright what they've done with the raw data. The raw data belongs to the libraries that contributed.

  • Odd
  • Posted by Val on November 14, 2008 at 2:15pm EST
  • As Betty points out, the records are produced by the members for sharing among the members. Records represent a thing. Your birth certificate records your birth; it is a record but it did not give you birth. Therefore, claiming ownership of a record is like saying that you have some right to the thing. OCLC has a right to be recognized for the database and software used to search that database (which they are paid for) but the records just describe the things. A patron could copy down everything from a bibliographic record but it will not get him the thing (book, CD, DVD, mp3, whatever). Only the library, bookstore, online market place can actually deliver the thing. The thing that OCLC owns (software, search taxonomies, structure of the records, etc.) can all be protected without a copyright notice. IMHO

  • OCLC
  • Posted by Joyce Latham, Asst. Professor at UWM on November 14, 2008 at 6:05pm EST
  • Perhaps OCLC is trying to maintain a position it has, in reality, already lost.

  • The real conflict
  • Posted by Glenn Bunton, Head of Systems Development at Old Dominion University Libraries on November 17, 2008 at 11:20am EST
  • In the end, the heart of the issue lies in the conflict between a commercial, economically driven organization (OCLC) and non-commercial, service driven organizations (libraries), with the commercial organization beginning to see its business model crumble in the world of new information technologies. OCLC's business and service models were highly valued 30 years ago when the technology did not exist for individual organizations to manage and, more importantly, share their cataloging (metadata) information. With the rise of the internet, open-source, Dublin Core, OAI-PMH, XML and other technology developments, individual organizations have far more options for how to create, manage, and share such information. The steps being taken by OCLC are simply the expected moves of a commercial venture seeking to retain some hold on its market. I suspect in the long run they will not succeed but it is no surprise they are trying.

  • Peer to Peer metadata creation and sharing
  • Posted by Jerome Yavarkovsky, University Librarian Emeritus at Boston College on November 17, 2008 at 7:05pm EST
  • Libraries are sustaining a proprietary model for catalog record sharing based on centralized mainframe processing when nowadays "the network is the computer" and metadata should be harvested from peer sources through a secure, scalable, structured, open source peer to peer system. Libraries, vendors, publishers and booksellers could participate.
    The goal is to create an open source environment for applications development—record sharing, authority and quality control, cataloging, acquisitions, resource sharing, interlibrary loan, links to digital resources, bookselling, media downloads, etc. Open source access would also invite participation by smaller libraries, and would provide a foundation for open access to resources in the future.

  • OCLC, shared cataloguing and Open Access
  • Posted by Anthony Ferguson, University Librarian at University of Hong Kong on November 22, 2008 at 7:05pm EST
  • For the past several years my University's library has been the beneficiary of OCLC's shared cataloguing model: We annually catalogue thousands of printed and electronic Chinese books for which we are paid. The fees we receive cover our retrospective e-book cataloguing project costs. OCLC then resells our records to other libraries. We also buy records for virtually all of the western language books we acquire. I sense that while the vocabulary being used to describe OCLC's revised policy, which allows for non-commercial uses of its/our records, clothes the debate as anti-open access with OCLC as the culprit, what is really being argued is the right for groups other than OCLC to resell these records. This is a free lunch argument: I have a right not only to a free lunch, but also a right to sell it to others. Unless OCLC has the sole right to resell my library's records to others, it won't be able to pay us. So, while my library is a staunch supporter of the open access movement, I think what is needed here is a willingness to permit OCLC the right permitted to those employing Creative Commons licenses: You can use my material for scholarly purposes but you cannot use it for commercial purposes. We recently opened a Creative Commons site at my university: http://creativecommons.org/international/hk/ .

    Tony