Data Science Initiatives
Big data is helping researchers unlock the secrets of how Earth works. DCO's Data Science Team is providing the keys to these secrets by working in collaboration with scientists to uncover the interactions, synergies, and dependencies of the total planetary carbon cycle.
DCO data science combines informatics, data management, library science, network science, computer science, and domain science. This all-encompassing approach enables analysis of all aspects of the data generated or acquired by DCO researchers. Combining state-of-the-art analytics software, cyberinfrastructure, and information technologies, the Data Science Team is helping to amplify the work of DCO researchers and, consequently, advancing understanding of the quantities, movements, forms, and origins of carbon inside Earth.
Projects & Initiatives
Mineral Evolution over Deep Time
Data Science Team members Peter Fox and Ahmed Eleish (Rensselaer Polytechnic Institute, USA) are collaborating with Robert Hazen and Shaunna Morrison (both of the Carnegie Institution for Science, USA) and Daniel Hummer (Southern Illinois University, USA) to bring novel network analysis to the field of mineralogy. This effort has resulted in a series of high-profile publications and, in many ways, has transformed mineralogy from an observational to a predictive science. Read more about their work identifying a mineralogical signature of the Anthropocene Epoch. Watch a webinar about how big data is being applied to advance understanding of network connections.
ENabling Knowledge Integration (ENKI) is a collaborative, web-based model-configuration and testing portal that provides tools in computational thermodynamics and fluid dynamics. Data Science Team Leader Peter Fox, in collaboration with Mark Ghiorso (OFM Research), is advancing the work of this project, which launched in fall 2016 with support from the National Science Foundation. To learn more about ENKI, watch this webinar.
Publishing and Mining Data
Since 2013, the Deep Energy Community has been working on the global characterization of noble gas isotopes. Much progress has been made thanks to Igor Tolstikhin (Kola Science Center of the Russian Academy of Sciences, Russia) and colleagues, who have been compiling datasets suitable for global analyses. The datasets were first published in 2013, with a second version published in 2015.
Ever wonder where that heat capacity input data for your thermodynamic modeling calculation came from? Mark Ghiorso (OFM Research, USA) of the Extreme Physics and Chemistry Community and the DCO Data Science Team did. That curiosity prompted Ghiorso to work with the team to launch a systematic effort to rescue a significant amount of thermodynamic data from tables and figures in the published literature. These data were published via the DCO Data Portal and are available for community access. They will soon also be available in Jupyter Notebooks.
DCO Knowledge Graph
The Data Science Team at Rensselaer Polytechnic Institute has laid the groundwork for a research-focused discovery tool that enables users to visualize the connections among objects across the DCO Science Network. Information on people, departments, institutions, datasets, grants, research, and publications can be browsed, searched, and visualized via the DCO Data Portal.
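As a rough illustration of how a knowledge graph links people, institutions, and projects, here is a minimal Python sketch. It is not the DCO Data Portal's implementation: the triples reuse names mentioned on this page, while the relation labels (`affiliated_with`, `works_on`) and the helper functions are assumptions made for the example.

```python
from collections import defaultdict

# Illustrative (subject, relation, object) triples. The names come from this
# page; the relation labels are hypothetical, not the portal's vocabulary.
TRIPLES = [
    ("Peter Fox", "affiliated_with", "Rensselaer Polytechnic Institute"),
    ("Ahmed Eleish", "affiliated_with", "Rensselaer Polytechnic Institute"),
    ("Mark Ghiorso", "affiliated_with", "OFM Research"),
    ("Peter Fox", "works_on", "ENKI"),
    ("Mark Ghiorso", "works_on", "ENKI"),
]


def build_graph(triples):
    """Index every triple in both directions so any node can be browsed."""
    graph = defaultdict(set)
    for subject, relation, obj in triples:
        graph[subject].add((relation, obj))
        graph[obj].add((relation + "_inverse", subject))
    return graph


def neighbors(graph, node):
    """Return the sorted names of all nodes directly linked to `node`."""
    return sorted({other for _, other in graph.get(node, set())})


graph = build_graph(TRIPLES)
print(neighbors(graph, "ENKI"))
```

Browsing from a project node such as `"ENKI"` returns the people connected to it, which is the kind of traversal a graph-based discovery tool builds on.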
Jupyter notebooks are powerful, open source software that lets users do data science in a single location. Within a typical notebook, a user can import a data set, do statistical modeling, enter code and text, and perform any number of other numerical functions in a variety of languages. The Data Science Team has created a Jupyter notebooks hub specifically for DCO network members. Watch this webinar to learn how you can use Jupyter notebooks to manipulate and visualize your data.
High-end computational services are readily available to DCO collaborators. DCO has its own computation center with a dedicated cluster that enables it to organize and prioritize computational runs for DCO needs, without the inconveniences of using existing services. From chemical and physical modeling to genomic analyses, the DCO Computer Cluster can run numerous software packages and scientific programs for theoretical calculations of C-bearing phase structures and properties, geodynamics calculations, thermochemical modeling, and other computations. To request time on the cluster, visit here.
The Data Science Team has contributed to the progress of Cornell University's VIVO project, which serves as the skeleton for the Deep Carbon Observatory Data Portal. Several customizations were developed in conjunction with the work, including a custom SPARQL module for Drupal, Shibboleth integration, and significant work on the VIVO application itself. Visit Tetherless World's GitHub page.
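To give a sense of what a single notebook cell might look like, the sketch below imports a small data set and computes summary statistics using only the Python standard library. The CSV values are made-up placeholders, not DCO measurements, and the column names and helper functions are assumptions for illustration.

```python
import csv
import io
import statistics

# Placeholder data invented for this example; not real measurements.
RAW_CSV = """sample,temperature_K,value
A,300,1.0
B,400,2.0
C,500,3.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with numeric columns converted."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "sample": row["sample"],
            "temperature_K": float(row["temperature_K"]),
            "value": float(row["value"]),
        }
        for row in reader
    ]


def summarize(rows, column):
    """Return the count, mean, and sample standard deviation of a column."""
    values = [row[column] for row in rows]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


rows = load_rows(RAW_CSV)
summary = summarize(rows, "value")
print(summary)
```

In a real notebook, the inline text would typically be replaced by a file or portal download, and the summary step by whatever modeling the analysis calls for.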
Pictured from top left: Peter Fox, Karyn Rogers, Kathy Fontaine, Ahmed Eleish, Fiona Murphy, Anirudh Prabhu, Congrui (Corey) Li, Feifei Pan, Hao Zhong, and former student Sophie Weinell.
If you have a question or project that you would like to discuss with the Data Science Team, please contact Kathy Fontaine firstname.lastname@example.org.