X. Discovery

Discovery is the least studied, tested, and understood aspect of Linked Data implementation, and also the aspect most likely to have the biggest impact on library operations. It offers the potential for radical changes in the way users search and browse for library information. One of the most obvious and talked-about aspects of Linked Data adoption is the extent to which it positions library information to be accessible through search engines and connected with a graph of information beyond the library.

In his February 2015 presentation “Making Library Collections Discoverable on the Web,” part of the OCLC Collective Insight Series “Linked Data [R]evolution: Applying Linked Data Concepts,” Ted Fons outlines the way Linked Data-enabled library records can and should circulate in the wider information universe of the World Wide Web. According to Fons, as libraries increasingly shift focus from physical collections management to information access management, the need to make information discoverable through user-preferred mechanisms increases. This requires structuring information so that it can be located through non-library interfaces such as major search engines.
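As a rough illustration of what "structuring information" for search engines can look like, the sketch below describes a single catalog record as JSON-LD using the schema.org vocabulary. The choice of schema.org, the helper names, and the catalog URL are assumptions made for this example; they are not drawn from the presentation itself.

```python
import json


def book_to_jsonld(title, author, url):
    """Describe a book as a JSON-LD dictionary using schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "@id": url,  # hypothetical catalog URL that identifies the record
        "name": title,
        "author": {"@type": "Person", "name": author},
    }


def to_script_tag(doc):
    """Wrap JSON-LD in the script element used to embed it in an HTML page."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


record = book_to_jsonld(
    title="The Lord of the Rings",
    author="J. R. R. Tolkien",
    url="https://library.example.org/catalog/123",  # placeholder address
)
print(to_script_tag(record))
```

Embedding a block like this in a record's public page is one common way to expose structured descriptions to crawlers without changing the human-readable display.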

Search engine optimization is a valuable benefit of Linked Data adoption, but it is not the only, or even the most important, discovery advance that Linked Data serves. Linked Data makes possible new visual discovery interfaces that speak to one of the most voiced laments about the transition from stacks to screens: the loss of the serendipity of browsing. Consider the rudimentary network visualization of the Lord of the Rings presented in Figure 1 at the beginning of this report. Here we see the focus on a text of interest spread to include an expansive multitude of context, much of which will inevitably be unknown to the user. This type of interface allows users to follow threads of relationship in a manner that harkens back to the days of browsing the stacks, moving from one node to the next, with the option of focusing in on a new node and following the subsequent traces growing from it.
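The node-to-node browsing described here can be sketched in a few lines. The example below is a minimal illustration, assuming a handful of hand-written triples that are not taken from Figure 1; the predicate names and the `build_index` and `browse` helpers are hypothetical.

```python
from collections import defaultdict

# Illustrative (subject, predicate, object) triples; not the data behind Figure 1.
TRIPLES = [
    ("The Lord of the Rings", "author", "J. R. R. Tolkien"),
    ("The Hobbit", "author", "J. R. R. Tolkien"),
    ("J. R. R. Tolkien", "memberOf", "The Inklings"),
    ("C. S. Lewis", "memberOf", "The Inklings"),
]


def build_index(triples):
    """Index every triple in both directions so any node can become the focus."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
        index[obj].append(("inverse:" + predicate, subject))
    return index


def browse(index, focus):
    """Return the links one step away from the focused node, sorted for display."""
    return sorted(index.get(focus, []))


index = build_index(TRIPLES)
# Follow a thread: start at the text, refocus on its author, then on a group.
for focus in ["The Lord of the Rings", "J. R. R. Tolkien", "The Inklings"]:
    print(focus, "->", browse(index, focus))
```

Each refocus surfaces neighbors the user did not search for, which is the on-screen analogue of noticing the next book on the shelf.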

The network browsing interface described above is just one example of the new discovery mechanisms made possible through Linked Data adoption. Current Linked Data discovery efforts either present experimental pilot demonstrations of this potential (such as the thin network graph presented in Figure 1) or provide traditional search and discovery interfaces. One of the most widely adopted platforms for exposing Linked Data graphs for search and discovery is Blacklight.

Figure 32: Blacklight demonstration screenshot

Figure 32 above is a screenshot of the online demonstration of Blacklight. Blacklight describes itself as “A multi-institutional open-source collaboration building a better discovery platform framework.” It provides a traditional but sophisticated search and discovery interface to both MARC and Linked Data data-stores. Built on an Apache SOLR/Lucene index, it provides fuzzy search, full-text search, and faceted browsing.
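To make fuzzy search and faceted browsing concrete, the sketch below assembles a request to a SOLR select handler using standard query parameters (`q`, `fq`, `facet`, `facet.field`, `rows`, `wt`). It is not Blacklight's own query builder; the core URL and the field names `format_facet` and `language_facet` are assumptions chosen for illustration.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, term, facet_fields, filters=None, rows=10):
    """Build a select URL with a fuzzy single-word query, facets, and filters."""
    params = [
        ("q", f"{term}~1"),  # Lucene fuzzy operator: allow one character edit
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
    ]
    # One facet.field parameter per facet shown in the browse sidebar.
    params += [("facet.field", field) for field in facet_fields]
    # Each facet value the user has clicked becomes a filter query.
    params += [
        ("fq", f'{field}:"{value}"') for field, value in (filters or {}).items()
    ]
    return f"{base_url}/select?{urlencode(params)}"


url = build_solr_query(
    "http://localhost:8983/solr/catalog",  # hypothetical core location
    "tolkein",  # misspelling that the fuzzy operator can still match
    facet_fields=["format_facet", "language_facet"],
    filters={"language_facet": "English"},
)
print(url)
```

Keeping facet selections in `fq` rather than in `q` leaves the user's search terms untouched while the result set is narrowed.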

Another Linked Data discovery platform is Collex, a native Linked Data platform maintained by the Advanced Research Consortium (ARC) at the Initiative for Digital Humanities, Media, and Culture (IDHMC).

Figure 33: Collex Linked Data browser

Figure 33 presents a screenshot of the Collex platform as implemented at the Michigan State University Library’s Studies in Radicalism Online. Much like Blacklight, Collex provides a fairly traditional library interface to its Linked Data data-store. But it also includes several features designed to capitalize on the extended web of Linked Data information. Note the “Currently Searching…” at the top of the right-side menu column of the screenshot in Figure 33. Built into the Collex platform is the ability to direct the platform either to query an aggregated triplestore or to query a configured list of SOLR endpoints at other institutions, thereby producing an aggregated search and browse environment. This aggregation-through-linking functionality represents a significant step towards the type of network-expanded search and discovery made possible by Linked Data.
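The idea of querying a configured list of endpoints and merging the results can be sketched as follows. This is a simplified illustration rather than Collex's implementation: the endpoints are stand-in Python functions instead of live SOLR services, and the institution names, record fields, and scores are all hypothetical.

```python
from typing import Callable, Dict, List

# A search function takes a query string and returns a list of hit records.
SearchFn = Callable[[str], List[dict]]


def federated_search(query: str, endpoints: Dict[str, SearchFn]) -> List[dict]:
    """Send one query to every configured endpoint and merge the hits."""
    merged = []
    for name, search in endpoints.items():
        try:
            hits = search(query)
        except Exception:
            continue  # an unavailable institution should not break the search
        for hit in hits:
            merged.append({**hit, "source": name})  # record where each hit came from
    return sorted(merged, key=lambda hit: hit.get("score", 0.0), reverse=True)


# Stand-in endpoints; a real deployment would issue HTTP requests here.
def library_a(query):
    return [{"id": "a1", "title": f"{query} pamphlets", "score": 0.9}]


def library_b(query):
    return [{"id": "b7", "title": f"{query} periodicals", "score": 0.7}]


def offline_library(query):
    raise ConnectionError("endpoint unavailable")


results = federated_search(
    "radicalism",
    {"Library A": library_a, "Library B": library_b, "Offline": offline_library},
)
for hit in results:
    print(hit["source"], hit["title"])
```

Relevance scores from independent indexes are not directly comparable, so sorting on them is only a placeholder; a production aggregator would need its own ranking or would group results by source.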

As noted at the beginning of this section, we are only beginning to explore the potential of Linked Data discovery, and experimentation along these lines is likely to continue for the next several years. Two important initiatives devoted to this area of research are the Mellon-funded Linked Data for Libraries (LD4L) and Linked Data for Production (LD4P) initiatives. These ongoing initiatives bring together Columbia, Cornell, Harvard, the Library of Congress, Princeton, and Stanford University in a combined effort to examine the potential for Linked Data production and discovery, building on products such as Hydra, Blacklight, Fedora, Vivo, and Vitro. LD4L and LD4P have initiated wide engagement with the library community and are actively developing new search and discovery methodologies and platforms based on their own research and engagement with other libraries.

It is difficult to predict exactly what new search and discovery approaches and capabilities will emerge from initiatives like LD4L and LD4P, though we already have enough examples of novel interfaces to begin to see some of the possibilities. Importantly, we currently lack sufficient library Linked Data data-stores to properly test and develop scalable Linked Data search and discovery platforms. We can nonetheless reasonably expect that the functionality of existing systems, which already provide capabilities on par with current library search and discovery, will expand as Linked Data uptake in the library community grows.
