My last session on Friday at OLA 2011 thoroughly explored some of the most fundamental issues in bibliographic description. Over the previous months, metadata practitioners and digital project managers from the University of Alberta and Concordia University had discussed, sometimes over a beer, some deep misconceptions about bibliographic description, specifically in the field of metadata for institutional repositories of digital files.
The metadata practitioners have grown to appreciate that good metadata—and more of it—is the solution to problems they are seeing. There is a competing mindset out there—the “good enough” mindset. This competing mindset grew out of the initial effects of digital technology. Digital tools have affected access to metadata and how metadata is created, but these tools have been accompanied by a belief that less description is required, in part because it is time-consuming and in part because technology is just assumed to solve all problems.
But less metadata affects quality. Quality is being sacrificed for the convenience of quick access with incomplete or improper metadata, driven by these beliefs:
– “If it’s not online, it doesn’t exist”
– users want to be information creators as well as consumers
– users can be given the tools to replace metadata creation by professionals
The speakers at this session argued quite strenuously (and with some prolixity) that there is a misunderstanding among those who adhere to the “good enough” approach. People are seriously underestimating the risks with the “good enough” approach.
The speakers made reference to Charles Cutter and his beliefs about how users should be best served by catalogues.
Rules for a Dictionary Catalog, by Charles Cutter, 4th ed. 1904. Hosted by the University of Texas Digital Library.
The metadata for Rules for a Dictionary Catalog, in Dublin Core XML format:
<?xml version="1.0" encoding="UTF-8"?>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:publisher>Government Printing Office</dc:publisher>
  <dc:description>Volume of cataloging rules created prior to the widespread availability of Library of Congress cataloging. Includes definitions and instructions on entry and style. Additional articles describe cataloging of special publications such as manuscripts, music, and maps and atlases.</dc:description>
  <dc:format>173 p. 23 cm.</dc:format>
  <dc:creator>Cutter, Charles A. (Charles Ammi), 1837-1903</dc:creator>
  <dc:creator>U.S. Bureau of Education</dc:creator>
  <dc:contributor>Cutter, W. P. (William Parker), 1867-1935.</dc:contributor>
  <dc:contributor>Ford, Worthington Chauncey, 1858-1941</dc:contributor>
  <dc:contributor>Phillips, Philip Lee, 1857-1924</dc:contributor>
  <dc:contributor>Sonneck, Oscar George Theodore, 1873-1928</dc:contributor>
  <dc:title>Rules for a dictionary catalog, by Charles A. Cutter, fourth edition, rewritten</dc:title>
  <dc:title>U.S. Bureau of Education. Special report on public libraries–pt. II</dc:title>
  <dc:identifier>govno: I 16.2:L 61/5</dc:identifier>
</oai_dc:dc>
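A record like the one above can be checked for completeness mechanically. The sketch below, using only Python's standard library, parses a trimmed-down version of the record and reports which commonly expected Dublin Core elements are absent; the trimmed record and the list of "expected" elements are illustrative assumptions, not a standard.

```python
import xml.etree.ElementTree as ET

# A deliberately trimmed version of the record above (illustrative only).
record = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Rules for a dictionary catalog</dc:title>
  <dc:creator>Cutter, Charles A. (Charles Ammi), 1837-1903</dc:creator>
  <dc:publisher>Government Printing Office</dc:publisher>
</oai_dc:dc>"""

def missing_elements(xml_text, expected=("title", "creator", "date", "subject")):
    """Return the expected Dublin Core elements absent from a record."""
    root = ET.fromstring(xml_text)
    # Element tags look like '{http://purl.org/dc/elements/1.1/}title';
    # strip the namespace to get the bare element name.
    present = {el.tag.split("}")[-1] for el in root}
    return [name for name in expected if name not in present]

print(missing_elements(record))  # ['date', 'subject']
```

Even this naive check makes the "good enough" gap visible: the record parses, displays, and indexes fine, yet the elements users rely on for discovery can be silently missing.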
User convenience has always been paramount in cataloguing and metadata creation. Charles Cutter, one of the key figures in the history of cataloguing, placed the convenience of the user above all other concerns for cataloguers. Cataloguers have a role to play in saving the user time, and in disambiguating—providing clarity about the objects in the catalogue.
Basic metadata elements such as title, author, and date are not time-consuming to create (to correct a misperception among the “good enough” people), and accuracy in these elements benefits end-users, so attention to this metadata should not be abandoned. Users do see subject headings, abstracts and classification as important, and those metadata elements are resource-intensive to create.
But new digital tools such as full-text searching and relevancy ranking have led to the thinking that metadata is now a simple process. These new tools are suddenly treated as a solution that can replace metadata. The notion held by some is that traditional metadata creation is an artisanal activity, and that mass production can adequately replace it. But this replacement is being driven by pressure to separate user needs from user wants, along with a misunderstanding of what those needs and wants actually are.
Descriptive metadata has historically involved a common sense approach. But the misconception is that this is rote work, rather than a genuine intellectual exercise. The misunderstanding has spread to the idea that intellectual engagement in metadata creation is not needed at all. But does this abandonment of intellectual engagement actually serve the interests of the users?
There are aspects of metadata creation that can be simplified by technology, and this needs to be emphasized. One can have both utility and “beautiful” metadata. But a mindset seems to have appeared and become stuck on what is probably the most naive assumption: that technological progress is inevitable and we should not stand in its way. This naive assumption is usually accompanied by the notion that the user knows best—that there is an infallibility to the user-centered approach. But these assumptions are often just that—assumptions. They are not based on actual evidence. Should they therefore be allowed to trump all other considerations without question?
The speakers referenced this Library Journal article, “Googlizers vs Resistors” (http://www.libraryjournal.com/article/CA485756.html). In the article, Steven Bell makes the point: “We should just maintain that good enough isn’t acceptable.”
In science, teleological arguments are considered invalid. It’s wrong to think that everything is moving to some final inevitability, as if the future event has already caused the path everyone is on. Yet this sense of inevitability is too pervasive in the thinking about search engines—that search engines will develop to solve all problems. There has been a rush to digitize collections and put them online, with the idea that search engines will inevitably take care of the side issues of metadata creation.
The speakers at this session advocated a pragmatic view of the limits of technology, and this view grew out of their experiences creating metadata in this digital age.
What is being lost in this mad rush for digitization is a sense of who the user is. Whose interests are being served by digitization? The speakers concluded that at times it seems as if the answer is that the user is the computer, not a human being. They likened this to the story of Dr. Frankenstein, who hadn’t thought through what he wanted to do with his creation.
We need to spend more time thinking carefully about what the user wants. The user wants to solve problems. To discover new things. To create works of art.
We can’t treat access to resources as a random process. We need to grapple with this question all the time: “What is meaningful access?”
Once we grapple with this question, the answer becomes clearer: accurate, complete metadata is more important than ever.
The speakers referenced the book “Radical Cataloging,” and a point made in an article in the book. We need to reject the idea that users won’t care if junk is found when searching.
The speakers went on to discuss their specific experiences at their institutions. At the University of Alberta, the Education and Research Archive (http://guides.library.ualberta.ca/era) collects, disseminates and preserves the intellectual output of the University of Alberta.
“Dare to deliver” is the motto of the university, but bad metadata can put this reputation at risk. Digital repositories make promises about their mandate, but the mindset against good metadata makes it seem as if the goal is to pull back from that mandate.
Those who push for self-directed users to replace mediated deposit as a cost-saving measure confuse self-discovery with the processes of resource deposit. Farming out subject analysis distorts the basis for the entire profession of librarianship. If the answer to the question “Can you train the administrative assistant to do the metadata entry?” is NO, then we need to get back on track with making some basic distinctions.
The institutional repository is not just a file storage process. Curating resources requires a specific skill set that is distinct from the purely technological issues. If there are no subject keywords for thousands of articles, then poor metadata becomes a real bottleneck for productivity. Just entering titles is often not enough, and the speakers provided several examples of how information is lost if we rely too much on the scanty metadata present in titles. Amusingly misleading titles and report numbers entered as the author were but two examples. This kind of metadata ends up creating problems for the institutional repository, when the effort should be on trying to solve problems.
Numerous paradoxes are arising. On the one hand some faculty members create and master their own schemes to organize their output, and so it makes little sense to dismiss metadata creation in general because that would also mean dismissing these efforts. The reason why these sometimes misguided efforts for self-created metadata appear is that they fill a need for faculty members.
On the other hand, many faculty members do not want to be burdened with the process of creating metadata, and when they make the attempt they often do a very poor job. Faculty members often have a clear separation in their minds of their jobs and the job of the librarian—and they don’t want to be librarians.
Even when new technology opens up new levels of functionality, bad metadata can become more embarrassing. For example, faceted browsing can produce dramatic results in revealing errors. Faceted browsing can be helpful in pointing out the problem, but the technology itself doesn’t solve the problem with bad metadata.
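The way faceted browsing exposes, but does not repair, bad metadata can be sketched in a few lines. The records and name variants below are invented for illustration: a facet is essentially a count of distinct field values, so inconsistent entry splits one author across several facet buckets.

```python
from collections import Counter

# Hypothetical repository records; the creator variants are invented
# to mimic the kind of inconsistency faceting makes visible.
records = [
    {"creator": "Cutter, Charles A."},
    {"creator": "Cutter, Charles Ammi"},
    {"creator": "cutter, charles a."},
    {"creator": "Cutter, Charles A."},
]

# Building a facet is just counting distinct values.
facet = Counter(r["creator"] for r in records)
print(len(facet))  # 3 buckets for what should be one author

# Naive normalization (lowercase, strip trailing period) merges some
# buckets in the display, but the stored metadata is still dirty.
normalized = Counter(r["creator"].lower().rstrip(".") for r in records)
print(len(normalized))  # 2 buckets remain
```

The display-side fix only papers over the problem: the variant forms survive in the records, waiting to resurface in the next index, export, or crosswalk.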
MARC records are still considered the gold standard in the metadata world. But so many alternate metadata schemes have appeared that crosswalks are revealing all kinds of quality control issues. The skills to handle these problems are critical to meeting the mandate of the institutional repository.
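A toy crosswalk shows how these quality issues surface. The mapping table and record below are simplified assumptions (real MARC has indicators, subfields, and far richer mappings); the point is that anything the table does not cover is silently dropped unless someone with the right skills catches it.

```python
# Simplified MARC-tag-to-Dublin-Core mapping (an illustrative subset,
# not an authoritative crosswalk).
MARC_TO_DC = {
    "100": "creator",    # main entry, personal name
    "245": "title",      # title statement
    "260": "publisher",  # publication information
    "650": "subject",    # topical subject heading
}

# A hypothetical, flattened MARC record.
marc_record = {
    "100": ["Cutter, Charles A."],
    "245": ["Rules for a dictionary catalog"],
    "250": ["4th ed., rewritten"],   # edition statement: no entry in the table
    "300": ["173 p. ; 23 cm."],      # could map to dc:format, omitted here
}

def crosswalk(record):
    """Map MARC fields to DC elements; collect whatever falls through."""
    dc, lost = {}, {}
    for tag, values in record.items():
        if tag in MARC_TO_DC:
            dc.setdefault(MARC_TO_DC[tag], []).extend(values)
        else:
            lost[tag] = values
    return dc, lost

dc, lost = crosswalk(marc_record)
# Fields 250 and 300 fall through: the crosswalk itself exposes where
# detail from the richer scheme is quietly discarded.
```

Running crosswalks like this at scale is exactly where the quality-control issues the speakers described come to light, and reviewing the `lost` bucket is a job for metadata skills, not just scripting.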
Federated searching seems like a great idea, but a huge problem is quickly discovered because quality standards vary so greatly. The speakers were quite fed up with the promise of federated searching, and the myth backing it—that technology is supposed to solve all our problems.
People who take up the job of digital project manager need to quickly get in touch with the actual metadata practitioners. This was a good piece of advice from the session for all those who undervalue cataloguers.
Annie Murray, from the Spectrum Research Repository of Concordia University, described the open access policy for the articles produced by the faculty of Concordia University. The Spectrum Research Repository is a self-archiving service, with user-generated metadata.
Even though this open access approach is indexed by Google and Google Scholar, there is not a lot of uptake, even by faculty members who are sympathetic to the goals of open access.
Self-archiving contributions are low because of:
– lack of time
– no incentive
– uncertainty of publisher policies and a desire to avoid the wrath of scholarly publishers who don’t like the competition of open access
– metadata input is considered onerous (even though the form only takes about ten minutes to fill in)
– uncertainty about creating trustworthy metadata
Efforts are being made to reduce the inconveniences for faculty members, but there is resistance since professors want the library to do it all.
The irony (reflected throughout this session) is that metadata creation is both misunderstood and underestimated. Professors don’t want to do it, yet there is pressure, driven in part by the myths of technology, to take metadata creation out of the hands of the librarians as well. The speakers from these two institutions vigorously wrestled with these misunderstandings.
The speakers revealed that we all need to take a more pragmatic approach to what technology can and cannot do, and to develop a more expansive sense of what users want to do with our metadata. After all, these user needs are really at the core of the mandate of these institutions.