Deep Carbon, Big Data: Meet the DCO Data Science Team

Finding hidden patterns in big data sets is an intriguing challenge, one that is opening doors to all sorts of new research questions—and answers—in a variety of scientific disciplines. The Deep Carbon Observatory recognized this early on, and devoted resources to a team of dedicated data scientists at Rensselaer Polytechnic Institute (RPI) in Troy, New York.

Tackling everything from large mineral databases to catalogues of data trapped in archived documents, the Data Science Team is exploring Earth in novel, predictive ways for the DCO Science Network.

The Team

Peter Fox, professor of Earth and Environmental Science, Computer Science, and Cognitive Science at RPI, leads the DCO Data Science Team, along with Karyn Rogers (assistant professor, Earth and Environmental Sciences, RPI) and Kathy Fontaine (senior research scientist, adjunct professor, Tetherless World Constellation, RPI). Patrick West (principal software engineer) and Ahmed Eleish (graduate research assistant) round out the primary Data Science Team. In addition, a group of dedicated and creative experts and graduate students helps from time to time. Together, the team supports the DCO Science Network in a variety of ways, helping conduct primary research in deep carbon data science.

Data Science Team 2017

Pictured from top left: Peter, Karyn, Kathy, Ahmed, Fiona, Anirudh, Corey, Feifei, Hao, Sophie


Ongoing Projects Demonstrate Endless Possibilities

The Data Science Team is currently involved in several projects in the DCO portfolio.

Mineral Evolution over Deep Time

Fox and Eleish are collaborating with Robert Hazen and Shaunna Morrison (both of the Carnegie Institution for Science, USA) and Daniel Hummer (Southern Illinois University, USA) to bring novel network analysis to the field of mineralogy. This effort has resulted in a series of high-profile publications and, in many ways, has transformed mineralogy from an observational to a predictive science. Read more about their work identifying a mineralogical signature of the Anthropocene Epoch.

Thermodynamics of Chemicals and Minerals

Ever wonder where that heat capacity input data for your thermodynamic modeling calculation came from? So did Mark Ghiorso (OFM Research, USA) of the Extreme Physics and Chemistry Community and the DCO Data Science Team. Over the past few years, they have launched a systematic effort to rescue a significant amount of thermodynamic data from tables and figures in the published literature. These data were published via the DCO Data Portal, where they are available for community access, and will also become available in the DCO Jupyter Notebooks.

Publishing and Mining Data

Many DCO Synthesis activities are currently underway, with a view to sharing DCO’s accomplishments with a variety of audiences in 2019. Since 2013, one of the Deep Energy Community’s efforts has focused on the global characterization of noble gas isotopes, thanks to the work of Igor Tolstikhin (Kola Science Center of the Russian Academy of Sciences, Russia) and colleagues. The datasets were first published in 2013, and a second version suitable for global analyses was published in 2015.

Integrating Related Projects

A closely related project to DCO Data Science is the NSF-funded ENabling Knowledge Integration (ENKI) for geochemical, thermodynamic, and geodynamic models, led by Mark Ghiorso (OFM Research). ENKI utilizes the Jupyter Notebook environment for integrated modeling applications. This approach is a useful model for DCO initiatives because it allows for collaboration and sharing of data. Users can see how the models were developed, what data were used, what the intermediate results were, and how the output was visualized, all with the goal of reproducible science.

DCO Computer Cluster Available to DCO Network

Making high-end computational services available to DCO collaborators was identified as a critical issue for addressing DCO’s Decadal Goals. Soon after the DCO was launched, a clear consensus emerged that DCO should have its own computation center with a dedicated cluster that would enable it to organize and prioritize computational runs for DCO needs, without the inconveniences of using existing services. From chemical and physical modeling to genomic analyses, the DCO Computer Cluster can run numerous software packages and scientific programs for theoretical calculations of C-bearing phase structures and properties, geodynamics calculations, thermochemical modeling, and other computations.

Access to the cluster is available to all DCO researchers. To request time on the cluster, email your application to Peter Fox, with a copy to EPC chairs Craig Manning and Wendy Mao.

Join the Data Science Team for a Webinar

On 17 May 2017, the Data Science Team will present the first in a new series of DCO webinars focusing on big data modeling and visualization. Called “DCO Webinar Wednesdays,” this webinar series builds on the successful workshop program at the Third DCO International Science Meeting and will take place monthly over the summer. This inaugural webinar, “Making Sense of Jupyter Notebooks,” will help participants use this novel data science tool.

Further Reading

DCO Highlights 4D Collaboration Brings New Dimensions to Earth Sciences

The Deep Time Data-Driven Discovery group is a coalition of researchers seeking to answer questions…

DCO Research New Mineral Classification System Captures Earth’s Complex Past

Could a new classification system that accounts for minerals’ distinct journeys help us better…

DCO Research Early Extinctions Set the Stage for Life as We Know It

By applying data science techniques to the fossil record, researchers have found evidence for two…

DCO Highlights Two New Sloan Foundation Grants for Deep Carbon Science

The Alfred P. Sloan Foundation is supporting two new deep carbon science projects. “Carbon Down…
