New DCO Publications Browser Live

The DCO Data Science team has been working on a new publication search capability in the DCO Portal. This new interface is now online and accessible on the DCO website here. It has more facets (including DCO Community, author, and year) to select, more information displayed in the result set, and more features available to enhance your ability to search for and retrieve publications.

Along the left are expandable facets, which you can use to narrow a search of the database. For example, to search by author, expand the author facet; to search by keyword, expand the keyword facet. Just click on the + next to the facet label to make your selections. The search box at the top queries all of that information at once. For example, to find all the publications related to “molecular biology,” type that phrase in the search box. Or how about all publications authored by Robert Hazen? Just type “Hazen” in the search box.
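
For readers curious about how facets narrow a result set, here is a minimal sketch in Python. It is not the browser's actual code: the records, field names, and helper functions are all hypothetical, and the rule it uses (match any selected value within a facet, and satisfy every chosen facet) is a common convention assumed here rather than confirmed behavior of the DCO interface.

```python
from collections import Counter

# Hypothetical records; titles, authors, and communities are placeholders.
records = [
    {"title": "Paper A", "community": "Community X",
     "authors": ["Author One", "Author Two"], "year": 2016},
    {"title": "Paper B", "community": "Community Y",
     "authors": ["Author One"], "year": 2017},
    {"title": "Paper C", "community": "Community X",
     "authors": ["Author Three"], "year": 2017},
]

def values_of(record, field):
    """Return a record's values for a field as a list (authors are multi-valued)."""
    value = record[field]
    return value if isinstance(value, list) else [value]

def facet_counts(records, field):
    """Count how many records carry each value of a facet field."""
    return Counter(v for r in records for v in values_of(r, field))

def apply_facets(records, selections):
    """Keep records matching at least one selected value in every chosen facet."""
    return [
        r for r in records
        if all(set(values_of(r, f)) & set(chosen) for f, chosen in selections.items())
    ]
```

Calling facet_counts(records, "year") gives the numbers a facet panel would show next to each year, and apply_facets(records, {"community": ["Community X"], "year": [2017]}) returns only the records that satisfy both selections.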

Why is this new tool so much faster (orders of magnitude faster) than the previous one? The reason lies in the search and retrieval mechanism. Every information storage system has advantages and disadvantages. The DCO Knowledge Store, where all DCO metadata are stored, is a triple store: a database that holds information as a directed graph. It is good at knowledge representation and linked data, and it can handle complex queries that discover concepts and retrieve the relationships among them. The disadvantage of a triple store, in most cases, is that the more information it holds, the slower text-based searches become. Although the technology has taken great leaps forward in performance and scalability over the last couple of years, it is still relatively slow for this kind of search. In short, triple stores are great at relationship queries but not good at text-based queries.
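
The difference between the two kinds of query can be illustrated with a toy example in Python. This is only a sketch under stated assumptions: the triples, identifiers, and predicate names are invented, and a real triple store indexes its data and is queried with a language such as SPARQL rather than with the linear scans shown here. The sketch mimics the shape of the queries, not their performance.

```python
# Hypothetical triples (subject, predicate, object); all names are invented.
triples = [
    ("pub:1", "hasAuthor", "person:a"),
    ("pub:1", "hasTitle", "Carbon in deep minerals"),
    ("pub:2", "hasAuthor", "person:a"),
    ("pub:2", "hasTitle", "Subsurface microbial life"),
    ("person:a", "memberOf", "community:x"),
]

def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def publications_by_community(triples, community):
    """Relationship query: follow memberOf, then hasAuthor (two hops)."""
    members = {s for s, _, _ in match(triples, p="memberOf", o=community)}
    return sorted(s for s, _, o in match(triples, p="hasAuthor") if o in members)

def text_scan(triples, word):
    """Naive text query: inspect every title literal for the word."""
    word = word.lower()
    return sorted(s for s, _, o in match(triples, p="hasTitle") if word in o.lower())
```

The relationship query follows edges in the graph, which is the kind of work the text above says a triple store does well. The text query has to look inside every title string, so in this sketch its cost grows with the amount of stored text.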

For this new publications browser, the Data Science team ingested information from the triple store into an inverted index, using an open source product called Elasticsearch. An inverted index maps each term to the documents that contain it, so a search looks up the query terms directly instead of scanning every record. This makes it very fast to search over a large amount of text, including keywords, abstracts, descriptions, author names, and more. As a result, the new faceted publication search displays information much faster than before.
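
To make the idea of an inverted index concrete, here is a small sketch in Python. It is not how Elasticsearch is implemented or how the DCO team configured it; the function names and sample records are hypothetical, and the tokenizer is a deliberately simple assumption (lowercase, alphanumeric words only).

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each token to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical records, invented for illustration only.
docs = {
    1: "Deep carbon cycle and minerals",
    2: "Molecular biology of subsurface microbes",
    3: "Carbon minerals and molecular structure",
}
index = build_index(docs)
```

The index is built once, when the records are ingested. After that, a call such as search(index, "molecular biology") only touches the entries for those two words, no matter how many other records exist.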

If you have any questions or comments about the new publication search capability, please contact web@deepcarbon.net.

 
