Step One: Staged Transition to Linked Data Native Cataloging

Step One of Phase Two of the transition roadmap focuses on migrating cataloging workflows to a Linked Data native cataloging workbench. By “Linked Data native” we mean an interface designed specifically to interoperate with external Linked Data information resources as an integral part of cataloging workflows, and one that capitalizes on the extensibility offered by working with graph-based data models. As part of this study, we experimented with several such interfaces.

Figure 12: BIBFRAME Scribe cataloging interface

Figure 12 shows a screenshot of a native Linked Data cataloging interface developed by Zepheira, Inc. and modified for testing by the UC Davis BIBFLOW team. Unlike MARC-based cataloging interfaces, the Scribe interface presents the cataloger with a Linked Data oriented view of the universe. Rather than defining attributes such as format by filling out form fields, Scribe asks the cataloger to first identify the kind of object being described. Once this has been done, it presents the user with a View based on one of many Linked Data Profiles: Linked Data models appropriate to the type of object being described. Each Profile contains a map of relevant Linked Data lookup services, and the View reflects this by providing type-ahead functionality on the appropriate fields. Importantly, Profiles are highly configurable, allowing libraries to record extensible descriptions of objects. For example, one might combine traditional MARC-based content fields with Encoded Archival Description (EAD) descriptors in the same graph, something non-graph-based systems cannot accommodate without extensive modification of the application's data model.
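The Profile-driven type-ahead described above can be sketched in a few lines. The sketch below assumes a lookup service in the OpenSearch Suggestions format, such as the suggest service published at id.loc.gov; the exact URL pattern and the sample response are illustrative assumptions, not Scribe's actual configuration.

```python
import json
from urllib.parse import quote

# Assumed URL pattern for an id.loc.gov-style suggest service
# (OpenSearch Suggestions format); illustrative, not Scribe's config.
LOC_SUGGEST = "https://id.loc.gov/authorities/{scheme}/suggest?q={term}"

def suggest_url(scheme: str, term: str) -> str:
    """Build a type-ahead query URL for a given authority scheme."""
    return LOC_SUGGEST.format(scheme=scheme, term=quote(term))

def parse_suggestions(payload: str):
    """Parse an OpenSearch Suggestions response into (label, URI) pairs.

    The format is a four-element JSON array:
    [query, [labels...], [descriptions...], [uris...]]
    """
    _query, labels, _descriptions, uris = json.loads(payload)
    return list(zip(labels, uris))

# Example response a lookup service might return (illustrative data):
sample = json.dumps([
    "twain",
    ["Twain, Mark, 1835-1910"],
    [""],
    ["http://id.loc.gov/authorities/names/n79021164"],
])
print(parse_suggestions(sample))
```

A Profile would map each field to a scheme (names, subjects, and so on), and the View would call `suggest_url` on each keystroke and render the returned labels, storing the selected URI in the graph rather than a text string.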

Figure 13: Library of Congress BIBFRAME Editor

The Library of Congress BIBFRAME Editor offers a different approach to a Linked Data native cataloging interface. It focuses on creating Linked Data graphs while maintaining labels that reflect current cataloging rules (i.e., RDA). It also builds on the BIBFRAME Work/Instance model.

Both of the Linked Data cataloging workbenches described above are standalone products that output Linked Data graphs. Another approach to this transition could be the addition of native Linked Data workbenches to existing ILS. The addition of URI maintenance to the current library workflows of several ILS (discussed in Section Three) marks a step in this direction. But adding an extensible interface capable of handling multiple Profiles, communicating with a growing collection of Linked Data endpoints, and reflecting the BIBFRAME Work/Instance model will require significant effort on the part of the vendors who supply these ILS.

Linked Data adoption also opens the door to new, more automated modes of cataloging. As part of the BIBFLOW project, we experimented at UC Davis with systems that utilized available Linked Data endpoints to construct catalog graphs on the fly.

Figure 14: Barcode cataloging

Our barcode cataloging system allowed us to extract ISBN information by scanning a book's barcode. The ISBN was used to make a series of queries to OCLC and Library of Congress Linked Data endpoints. When needed, a popup screen asked catalogers to disambiguate information. At the completion of the process, the appropriate graph was added to the triplestore. The system increased both the efficiency and the accuracy of bibliographic and holdings data.
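The flow above can be sketched as follows. The endpoint responses are mocked, and the field names, candidate data, and ISBN are hypothetical; this illustrates the merge-and-disambiguate pattern, not the actual BIBFLOW implementation.

```python
def isbn_from_barcode(barcode: str) -> str:
    """Bookland EAN-13 barcodes carry the ISBN-13 directly."""
    digits = "".join(ch for ch in barcode if ch.isdigit())
    if len(digits) == 13 and digits.startswith(("978", "979")):
        return digits
    raise ValueError(f"not a Bookland EAN-13 barcode: {barcode!r}")

def merge_candidates(isbn, candidates, disambiguate):
    """Merge per-endpoint descriptions into one set of triples.

    When two endpoints disagree on a value, the `disambiguate`
    callback stands in for the popup that asks the cataloger.
    """
    subject = f"urn:isbn:{isbn}"
    merged = {}
    for _source, fields in candidates.items():
        for predicate, value in fields.items():
            if predicate in merged and merged[predicate] != value:
                merged[predicate] = disambiguate(predicate, merged[predicate], value)
            elif predicate not in merged:
                merged[predicate] = value
    return [(subject, p, v) for p, v in sorted(merged.items())]

isbn = isbn_from_barcode("9781234567897")  # illustrative ISBN
candidates = {  # mocked endpoint responses
    "oclc": {"title": "Adventures of Huckleberry Finn", "creator": "Twain, Mark"},
    "loc":  {"title": "The Adventures of Huckleberry Finn", "creator": "Twain, Mark"},
}
# Keep the first value seen; the real system prompted the cataloger.
triples = merge_candidates(isbn, candidates, disambiguate=lambda p, a, b: a)
```

In production, the `candidates` dictionary would be populated by HTTP queries to the OCLC and Library of Congress endpoints, and the resulting triples would be written to the triplestore.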

Regardless of the approach to native Linked Data cataloging pursued, there will be some constants. First, the transition will come at some cost. Libraries that host and maintain local ILS will be required to migrate the ILS to a new, Linked Data native system. Libraries that use cloud-based ILS can similarly expect to pay for migration to new, cloud-based systems, as migrations of this magnitude legitimately constitute the release of a new system.

Regardless of the path to native Linked Data cataloging taken or the form of the data model employed, new Linked Data workbenches must function in concert with existing MARC-based systems. As noted in Section III, ILS and other library systems operate as part of a complex information ecosystem in which data is exchanged regularly between systems. It is neither desirable nor likely that all of these systems will convert to a Linked Data exchange model at the same time. As such, libraries should expect to operate in a hybrid ecosystem for some time, where Linked Data graphs and MARC records exist in parallel. Providing this parallelism requires coding efforts that are not incidental. As part of BIBFLOW's experimental effort, we were able to build bi-directional connectors between BIBFRAME Scribe's graph database and Kuali OLE's relational database. These connectors functioned such that any time a new Linked Data graph was created (whether by human cataloging or batch conversion), a “stub” MARC record was created in OLE containing all necessary information to perform regular functions such as search, discovery, and circulation. Similarly, whenever a record was created or loaded into OLE, a parallel graph was saved to the graph database. Similar bidirectional functionality was added for edits as well.
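The graph-to-MARC direction of such a connector amounts to a predicate-to-field map plus a flattening step. The sketch below is a minimal illustration under assumed predicate names and a toy record representation; the actual Scribe/OLE connector's mapping was necessarily much richer.

```python
# Hypothetical mapping from graph predicates to MARC tag/subfield
# pairs; illustrative only, not the actual connector's table.
PREDICATE_TO_MARC = {
    "title":   ("245", "a"),
    "creator": ("100", "a"),
    "isbn":    ("020", "a"),
}

def stub_marc_record(triples):
    """Flatten a description graph into a minimal "stub" MARC record,
    represented here as {tag: {subfield: value}}."""
    record = {}
    for _subject, predicate, value in triples:
        if predicate in PREDICATE_TO_MARC:
            tag, subfield = PREDICATE_TO_MARC[predicate]
            record.setdefault(tag, {})[subfield] = value
    return record

# Illustrative graph for one resource (hypothetical data):
triples = [
    ("urn:isbn:9781234567897", "title", "Adventures of Huckleberry Finn"),
    ("urn:isbn:9781234567897", "creator", "Twain, Mark"),
    ("urn:isbn:9781234567897", "isbn", "9781234567897"),
]
print(stub_marc_record(triples))
```

A real connector would serialize the result as MARC 21 (for example with a MARC library), trigger on database events rather than being called directly, and run the inverse mapping when records are created or edited on the OLE side.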

While the parallel operation described above appears to create a great deal of duplication and redundancy, its benefits outweigh this cost. Implementing a parallel system allows iterative conversion of both workflows and systems. Rather than having to convert all systems and workflows involved in the exchange of MARC records or MARC-based cataloging at one time, individual workflows and systems can be migrated to native Linked Data operation one at a time. At the systems level, this means that the transition can be made over time with a smaller long-term or permanent staffing impact. This reduces the overall cost of the transition.

Running parallel, synchronized MARC/graph data stores also increases efficiency and decreases the cost of migrating cataloging workflows from MARC to native Linked Data workflows. With this model, migration can be accomplished by retraining and migrating small groups of staff at a time, as opposed to attempting to train all cataloging staff and migrate the entire cataloging effort at once. This reduces the impact on ongoing work, all of which would be simultaneously affected during a mass transition, effectively shutting down operations for its duration. Additionally, managers and trainers will learn from each iteration, improving the efficiency of training and transition over time. Further details of this iterative approach are provided in Section VII: Transitioning Workflows.
