The DCO Data Science team has been working on a new publication search capability in the DCO Portal. This new interface is now online and accessible on the DCO website here. It has more facets (including DCO Community, author, and year) to select, more information displayed in the result set, and more features available to enhance your ability to search for and retrieve publications.
Along the left are expandable facets, which you can use to narrow a search of the database. For example, to search by author, expand the author facet; to search by keyword, expand the keyword facet. Just click on the + next to the facet label to make your selections. The search box at the top queries all of that information at once. For example, if you want to find all the publications related to “molecular biology,” type that in the search box. Or how about all publications authored by Robert Hazen? Just type “Hazen” in the search box.
Why is this new tool orders of magnitude faster than the previous one? The reason lies in the search-and-retrieve mechanism. Every information storage system has advantages and disadvantages. The DCO Knowledge Store, where all DCO metadata are stored, is a triple store: a directed graph that is good at knowledge representation and linked data. It can handle complex queries for discovering concepts and the relationships among them. The disadvantage of a triple store, in most cases, is that the more information it holds, the slower text-based searches become. Although the technology has taken great leaps forward in performance and scalability over the last couple of years, it is still relatively slow for this kind of query. In short, triple stores are great at relationship queries but not good at text-based queries.
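To make the trade-off concrete, here is a minimal sketch of a triple store in Python. It is not how the DCO Knowledge Store is implemented; the identifiers, predicate names (hasTitle, hasAuthor, hasName), and records are all made up for illustration. A real triple store also indexes its triples, whereas this toy version scans a list for brevity.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All identifiers and values are hypothetical, for illustration only.
triples = [
    ("pub1", "hasTitle", "Carbon in deep Earth minerals"),
    ("pub1", "hasAuthor", "person1"),
    ("pub2", "hasTitle", "Molecular biology of deep life"),
    ("pub2", "hasAuthor", "person1"),
    ("pub2", "hasAuthor", "person2"),
    ("person1", "hasName", "A. Author"),
    ("person2", "hasName", "B. Writer"),
]


def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def coauthors(person):
    """Relationship query: person -> their publications -> other authors."""
    pubs = {s for s, _, _ in match(p="hasAuthor", o=person)}
    others = {o for pub in pubs for _, _, o in match(s=pub, p="hasAuthor")}
    return sorted(others - {person})


def text_search(word):
    """Text query: inspects every stored value, so cost grows with the store."""
    word = word.lower()
    return sorted({s for s, _, o in triples if word in o.lower()})
```

Calling coauthors("person1") follows the graph edges to find who shares a publication with that person, which is the kind of question a triple store is built for. By contrast, text_search("biology") has to look inside every stored value, which is why free-text search slows down as the store grows.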
In this new publications browser, the Data Science team ingested information from the triple store into an inverted index, using an open source product called ElasticSearch. An inverted index maps each term to the records that contain it, so a search looks up the query terms directly instead of scanning every record. This allows very fast searching over a great amount of text, including keywords, abstracts, descriptions, author names, and more. As a result, the new faceted publication search displays information much faster than before.
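The idea behind an inverted index can be sketched in a few lines of Python. This is a simplified illustration, not ElasticSearch's actual implementation; the sample records and the function names (tokenize, build_index, search) are assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical sample records, for illustration only.
docs = {
    1: "Carbon in deep Earth minerals",
    2: "Molecular biology of deep life",
    3: "Mineral evolution and deep carbon",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


index = build_index(docs)
```

With this index, search(index, "deep carbon") only looks up the entries for "deep" and "carbon" and intersects them; it never rereads the text of the records. The work of reading every record is done once, at indexing time, which is what keeps queries fast as the collection grows.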
If you have any questions or comments about the new publication search capability, please contact firstname.lastname@example.org.