VI. Phase Two: Transition to a Native Linked Data Ecosystem

For the purposes of this report, a “Native Linked Data Ecosystem” is defined as one that exchanges data with other institutions as serialized RDF triples (N-Triples being one of the most familiar serializations) and offers a cataloging workbench that is Linked Data connected, Linked Data oriented, and extensible. Note that under this definition a system’s underlying data store need not be triples based, although a triples-based store is recommended. Contrary to popular belief, few ILS currently implement a truly MARC-based data store. User interfaces to the data are MARC oriented, but the data structures themselves are not. Well-designed software systems are composed of three distinct components, or layers: 1) the Data Layer, known as the Model; 2) the User Interface Layers (human and machine), known as the View; and 3) the Transaction Processing Layer, known as the Controller:

Figure 9: Model View Controller (MVC) architecture

The View is the interface (GUI, command line, or API) through which human and machine users interact with the rest of the application. This includes display screens, forms, APIs, etc. If you are reading this document electronically, the window in which you currently see this text is a component of the View. The Controller includes any components of the code that perform operations on data available to the application. In a PDF viewer, for example, this includes reading the raw data in the file and transforming it into a form that can be rendered by the View. Another example would be a program that calculates the mean of a series of numbers or converts a string to lower case. The computing processes that perform these actions are part of the Controller. Last but not least, the Model is the data structure that an application uses to store data. A Model could be a collection of CSV or XML files, a relational database, a graph database, or any other data storage schema.
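
The three layers can be illustrated with a minimal sketch in Python, built on the "mean of a series of numbers" example above. All class and function names here (NumberModel, StatsController, text_view, json_view) are hypothetical and chosen only for illustration; they do not come from any ILS.

```python
import json
from statistics import mean


class NumberModel:
    """Model: stores the data. A plain list here; it could equally be
    CSV files, a relational database, or a graph database."""

    def __init__(self, values):
        self._values = list(values)

    def values(self):
        return list(self._values)


class StatsController:
    """Controller: performs operations on data held by the Model."""

    def __init__(self, model):
        self.model = model

    def mean(self):
        return mean(self.model.values())


def text_view(label, value):
    """View for a human user: a formatted line of text."""
    return f"{label.lower()}: {value:.2f}"


def json_view(label, value):
    """View for a machine user: the same result as JSON."""
    return json.dumps({label.lower(): value})


controller = StatsController(NumberModel([2, 4, 9]))
print(text_view("MEAN", controller.mean()))
print(json_view("MEAN", controller.mean()))
```

Note that the two views present the same Controller result in different forms, and neither needs to know how the Model stores its values.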

Because the Views employed by current ILS are MARC oriented, the library community tends to think that the ILS data Model is also MARC based. This is rarely the case. No widely implemented ILS (or subcomponent, in modular environments) is MARC based at the Data Layer. Most current systems store data in relational databases or other indexed document stores that bear only a passing resemblance to MARC itself. For example, the Kuali-OLE data store comprises 10,644 fields in 1,499 related tables, far more than the number of fields and subfields in the MARC specification. Similarly, MARC manipulation tools like MarcEdit rely on a SQL Data Layer to perform much of their work.

Figure 10: Portion of OLE SQL database structure

Simply put, there is little direct relationship between the data Model and the application View of most ILS. As such, it would be possible to implement a Linked Data graph Model without changing MARC-oriented Views at all. Similarly, it is possible to change Views to reflect a Linked Data, graph-based orientation to data creation and management while still using a relational database as the application’s data Model. There are very good reasons why converting the application to a graph Model is preferable for operating in a Linked Data environment, but these reasons are largely technical in nature and beyond the scope of this report. What is important for the current purpose is recognizing that transitioning to a graph-based data Model is not a prerequisite for operating in a fully Linked Data ecosystem.
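
The independence of Model and View can be sketched as follows: a single relational Model (an in-memory SQLite table) feeds both a MARC-oriented View and a Linked Data View that emits N-Triples. This is an illustration under stated assumptions, not a description of any real ILS. The one-table schema, the sample record, the base URI http://example.org/work/, and the choice of Dublin Core predicates are all hypothetical, and the MARC output is a simplified display form rather than a valid MARC record.

```python
import sqlite3

BASE = "http://example.org/work/"        # hypothetical base URI
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace


def build_model():
    """Model: a relational store (hypothetical one-table schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work (id INTEGER PRIMARY KEY, title TEXT, creator TEXT)"
    )
    conn.execute(
        "INSERT INTO work (id, title, creator) VALUES (?, ?, ?)",
        (1, "Moby Dick", "Melville, Herman"),
    )
    conn.commit()
    return conn


def fetch_work(conn, work_id):
    """Controller: read one row and hand it to a View as a dict."""
    row = conn.execute(
        "SELECT id, title, creator FROM work WHERE id = ?", (work_id,)
    ).fetchone()
    if row is None:
        raise KeyError(work_id)
    return {"id": row[0], "title": row[1], "creator": row[2]}


def marc_view(work):
    """MARC-oriented View: simplified tag/indicator/subfield display."""
    return [
        f"100 1_ $a {work['creator']}",
        f"245 10 $a {work['title']}",
    ]


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def ntriples_view(work):
    """Linked Data View: the same row serialized as N-Triples lines."""
    subject = f"<{BASE}{work['id']}>"
    title = escape_literal(work["title"])
    creator = escape_literal(work["creator"])
    return [
        f'{subject} <{DC}title> "{title}" .',
        f'{subject} <{DC}creator> "{creator}" .',
    ]


work = fetch_work(build_model(), 1)
print("\n".join(marc_view(work)))
print("\n".join(ntriples_view(work)))
```

Swapping the SQLite table for a graph store would change only build_model and fetch_work; both Views would remain as written, which is the sense in which a graph-based Model is not a prerequisite.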

Libraries must complete Phase One of the transition roadmap before commencing Phase Two, which consists of the following steps:

Figure 11: Transition Phase Two
