Will the response of the library profession to the internet be self-immolation?

by Martha M. Yee, with a great deal of help from Michael Gorman

Posted to AUTOCAT on July 24, 2007, by Marc Truitt (University of Alberta Libraries), on behalf of Martha Yee. Reproduced here without authorization.

There are two components of our profession that constitute the sole basis for our standing as a profession. The first is our expertise in imparting literacy to new generations, something we share with the teaching profession. The other is specific to our profession – human intervention for the organization of information, commonly known as cataloging. The greater goals of these kinds of expertise are an educated citizenry, maintenance of the cultural record for future generations, and support of research and scholarship for the greater good of society. If we cease to practice either of these kinds of expertise, we will lose the right to call ourselves a profession.

At the dawn of the modern age of our profession in the 19th century, heads of libraries were involved in cataloging (Antonio Panizzi and Charles Ammi Cutter among them). After the Library of Congress began to distribute catalog cards to libraries in 1901, fewer and fewer librarians learned to catalog. Now most LIS schools teach, at best, an introduction to information organization course in which students talk about such matters as how to organize supermarkets. That is the extent of the exposure of most new librarians to the principles of cataloging. Because so few librarians learn about or practice information organization any more, few librarians are aware of the danger that currently looms over the profession as a whole because influential people at the Library of Congress and our great research libraries want to do away with providing standard catalog records for trade publications to the nation's libraries. All librarians, not just catalogers, should take a look at the Calhoun report (Calhoun, Karen. The Changing Nature of the Catalog and its Integration with Other Discovery Tools) and follow the progress of the Working Group on the Future of Bibliographic Control. There you will find the argument that we should cede our information organization responsibilities to the publishing industry and other content providers. All this because some research studies show that undergraduates prefer to use Amazon.com and Google rather than libraries and their catalogs.

These library leaders have forgotten, or never knew, the fact that expertise in organization of information is at the core of the profession of librarianship. Because of their blindness to the nature of our profession, we are now in danger of losing not just standardized cataloging records and the Library of Congress Subject Headings, but the profession itself.

The excuse used, the preference on the part of undergraduates for quick answers, is nothing new. Undergraduates have always tended to over-use ready reference sources until they are taught by both librarians and professors how to do effective research and critical thinking. What has changed, apparently, is the willingness of these library administrators to shoulder the responsibility of teaching information literacy, research skills and critical thinking skills. I haven't heard anyone in the teaching profession argue yet that we should let recalcitrant elementary school students decide for themselves not to learn to read or do math, but perhaps that is next.

The implications in the Calhoun report that Google and Amazon.com are comparable to a library catalog, and that libraries are in competition with Google and Amazon.com, are dangerous falsehoods. Google and Amazon.com are commercial entities. Their goal is not an educated citizenry, or maintenance of the cultural record for future generations, or support of research and scholarship for the greater good of society. Their goal is instead to get as much money as possible out of our pockets and into theirs, and to spend as little as possible on labor while making as much as possible in profit. Someday it is conceivable that their goal could evolve into that of quelling social unrest by limiting access to certain kinds of information.

Google and Amazon.com limit human intervention for information organization as much as possible in order to maximize profits. Computers are dumb machines. They cannot reason or make connections that a 2-year-old could make. The only logic available to a computer is based on either word counting or counting the number of times users gain access to a particular URL, the bases for their allegedly sophisticated search and display algorithms. A computer cannot discover broader and narrower term relationships, part-whole relationships, work-edition relationships, variant term or name relationships (the synonym or variant name or title problem), or the homonym problem in which the same string of letters means different concepts or refers to different authors or different works. In other words, a computer, by itself, cannot carry out the functions of a catalog.

I used Amazon.com to find a novel by Fannie Hurst called Lummox. They listed it as being in print and for sale for about $5.00. I ordered it, but when it arrived several weeks later it turned out to be a play adapted from Hurst’s novel by someone else; none of this appeared in the description.

Thomas Mann, the great reference librarian, has written a wonderful book published by Oxford University Press that introduces scholars and researchers to LCSH and the LC classification so that they can do more effective and efficient research in libraries. He tells the story of searching for his book in Amazon.com and being told “Readers interested in this book were also interested in Thus spake Zarathustra and Death in Venice.”

When you search Google using Twain and Sawyer you get completely different results from what you get from a search using Clemens and adventures of Tom. The displays do not differentiate among the work, Adventures of Tom Sawyer, and works about it and works related to it.

When you search Google for power, Google does not ask you if you are interested in electrical power or in political power. When you search Google for cancer, you get 224 million hits. Even Google seems to realize that that is less than helpful; at the top of the screen it suggests that you refine your results by choosing Treatment, Symptoms, Tests, diagnosis, etc. When I investigated to see where this refinement of results came from, it turned out that Google had asked for unpaid volunteers to break down large result sets such as this one.

It has become fashionable to criticize catalogs for not providing users with the evaluative information they desire, a la Amazon.com. Those who criticize seem unaware that catalogs currently do provide evaluative information, in that the presence of a work in the collection of a major research library implies (with some caveats) that that work was deemed of scholarly value. Catalogs can also help users identify the major authors in a field; if a user does a subject or classification search, and notices that half the books listed under a particular subject or in a particular discipline are by the same author, that is a good clue that that author may be a major author in that field. All of this happens only when humans intervene in order to organize information; it doesn't happen in Amazon.com or Google.

I once went to a talk by a colleague who was working in the business world on an information portal. He indicated that the project had begun as an automatic indexing project with relevance ranking, but that the people paying for the work were so dissatisfied with the results that the project had morphed into a thesaurus development project employing human indexers. Is this a vision of the future? Information organization only for those who pay for it and Google for the rest, instead of information organization for all as a social good paid for with tax dollars?

It is a fact universally acknowledged that librarianship is a woman-dominated profession. As such, ours is a deferential culture that avoids conflict and encourages humility, otherwise known as low self-worth. After all, what we do is perceived by society at large as women's work, that is, work that anyone can do and that does not require any particular expertise (see Roma Harris. Librarianship: the Erosion of a Woman's Profession. 1992). The fact that Google and Amazon.com expect unpaid volunteers to do the work we do is evidence of this. Jeffrey Toobin's article on Google in the New Yorker (Feb. 5, 2007) casually and uncritically cedes to Google its claim to be the world's expert in information organization and is striking evidence of the ignorance of non-librarians about our work. Is it too much to ask for our colleagues in the profession, at least, to understand and acknowledge the value of human intervention for information organization, expensive though it is? Surely the richest country in the world can afford to pay for the human labor required to keep its cultural record in good order for future generations. The cost is peanuts compared to that of a missile defense system, and it would provide a much more effective defense for our way of life.

Many members of our profession, including catalogers, believe that information seekers prefer keyword access and that, for that reason, Amazon.com and Google are better designed than library catalogs. The reason catalog users seem to prefer keyword access is that system designers make keyword access the default search on the initial screen of nearly every OPAC in existence. It should be no surprise that transaction log studies then show that users do more keyword searches. The entities users seek when doing a catalog search (works, authors, and subjects) are actually much better represented by headings than by keywords. Keywords do not link synonyms (hypnosis vs. hypnotism) or variant names (Mark Twain vs. Samuel Clemens); keywords do not differentiate homonyms (electrical power vs. political power) or two different people of the same name (Bush, George, 1924- vs. Bush, George W. (George Walker), 1946-); keywords do not precoordinate complex concepts to indicate their relationships (e.g., Women in television broadcasting), and keywords do not suggest broader, narrower or related terms. However, “browse” searches with heading displays, which do all these things, are buried by system designers on advanced search screens, and put into indexes in which users are required to know the order of terms in a particular heading in order to find what they seek. The point I'm making here is that another major threat to our profession is posed by system designers who don't understand catalog records or catalog users. For the first time this year, our Voyager software has finally allowed us to provide users with a keyword in heading search of subject headings and cross references which responds with a display of matching headings and cross references, not an immediate display of bibliographic records. You can try it out in our catalog: try a topic/genre/form search on women or a topic/genre/form search on Poland, to see how useful it can be to let users see headings and cross references in response to a keyword search. When the same keyword in heading searching is applied to headings that identify works, users can search on both the author's name and the title and retrieve a sought work even when using a variant of the title (an ability denied to them in most current systems). Try a preexisting works search on Shakespeare in our file to see what I am talking about.
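For readers who think more easily in code, the idea of a keyword in heading search can be sketched roughly as follows. This is only a toy illustration in Python, not how Voyager or any real catalog is implemented: the tiny `AUTHORITY_FILE` table and the `keyword_in_heading` function are hypothetical names invented for this sketch, and the headings and cross references are illustrative entries rather than data copied from an actual authority file. The point it tries to show is that the search answers with headings and their cross references, so that a variant form leads the user to the authorized heading, instead of answering with bibliographic records.

```python
import re

# Hypothetical toy authority file: each authorized heading maps to its
# cross references (variant forms). Entries are illustrative only.
AUTHORITY_FILE = {
    "Twain, Mark, 1835-1910": ["Clemens, Samuel Langhorne, 1835-1910"],
    "Hypnotism": ["Hypnosis"],
    "Electric power": ["Power, Electric"],
    "Power (Social sciences)": ["Political power"],
}


def tokens(text):
    """Lowercased set of the words in a heading or a query."""
    return set(re.findall(r"\w+", text.lower()))


def keyword_in_heading(query):
    """Return sorted (matched form, authorized heading) pairs.

    A form matches when it contains every word of the query. The result
    is a list of headings and cross references, not bibliographic records.
    """
    wanted = tokens(query)
    hits = []
    for heading, variants in AUTHORITY_FILE.items():
        for form in [heading] + variants:
            if wanted <= tokens(form):
                hits.append((form, heading))
    return sorted(hits)


if __name__ == "__main__":
    for form, heading in keyword_in_heading("power"):
        if form == heading:
            print(form)
        else:
            print(f"{form}  -- see: {heading}")
```

In this sketch a search on clemens leads to the Twain heading, and a search on power shows both the electrical and the political senses side by side, so the user chooses between the homonyms before seeing any records. That human-built table of headings and cross references is exactly the part that keyword matching alone does not supply.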

Close reading of catalog use research shows that users' searches almost always match LCSH headings, as long as the system provides access to the LCSH cross reference structure, as long as the system doesn't require users to know entry terms, and as long as the user knows how to type and spell (see: Yee, Martha M. and Sara Shatford Layne. Improving Online Public Access Catalogs. Chicago: American Library Association, 1998. p. 133-134). The many people who say otherwise in the literature participate in the wide-spread anti-intellectualism characteristic of our society, since they don't read critically the research in their own field. Problems with typing and spelling are by far the most common cause of search failure; sadly, at a time when spelling is more important than ever before for success in keyword searching over the Internet, it seems to be becoming a lost art. Typing is a big problem for older library users who grew up when typing was taught only to those intending to be secretaries.

To sum up, the threats to our profession are not from the Internet per se, which is just another tool we can use to do our jobs better, if we use it sensibly. The real threats are posed by the large number of our fellow librarians, including prominent leaders in the profession, who do not grasp the nature of our profession and the fact that human intervention for information organization is at its core; the low self-image those librarians have; and the failure of online catalog designers to learn about the nature of catalog records and the nature of catalog users so as to design systems that allow users to search for the entities they seek (works, authors, and subjects), which are represented in catalogs by headings, not by keywords.

Even if you disagree with, do not understand, or are not convinced by these arguments about the value of human intervention for information organization as currently practiced by the last of the catalogers in our profession, think about the larger implications of leaving information organization in the hands of the commercial interests that control content in our society. Up until now, libraries have played the role of intermediary between commercial interests and society in the provision of information as a social good and as part of the intellectual commons; we have worked hard to ensure that people have access to the information they need regardless of their socio-economic level, because we recognize that democracy does not work when the electorate is unable to determine the facts or to hear the arguments on both sides of an issue, and because we recognize that research and scholarship that advance our society are not carried out only by the wealthy who can afford to purchase all of the materials they need to do research. Leaving information organization in the hands of commercial interests such as Google and Amazon.com would be the first step in the process of removing the library and the library profession from the information provision chain altogether. Publishers already have the ability to sell information directly to the consumer on a pay-per-view basis. If we move toward a society in which that is the only way users can get information, we will have a society that replicates in the information sphere our current huge economic gap between haves and have-nots, and that places all the power to control the availability of information in the hands of entities that are completely profit-driven and have no incentive to serve the greater good of society as a whole. Do we really want to follow our leaders down this path?

