Webinar Wednesday 2018 Data Series Summer Series




Webinar Archives


Data Science for Geosciences: Data Processing

Fang Huang, Rensselaer Polytechnic Institute (USA)


It is often said that 80% of data analysis time is spent cleaning and preparing the data. Moreover, data cleaning is not a one-time job; it is an ever-present need during data analysis. In this webinar, Fang focuses on data processing. He starts by introducing the rules that define a tidy dataset. Bearing these rules in mind, he shows how to use relatively simple Python code to process and visualize geoscience data. The last part of the webinar highlights an ongoing project on methane experiments. The webinar should be of interest to any researcher working on data science-related projects.


  1. Watch previous DCO webinar videos (posted below on this page) for background information
    Introduction to Jupyter Notebook (Feifei Pan)
    Visual Tools for Big Data Network Analysis (Ahmed Eleish and Shaunna Morrison)
    Data Science for Geosciences: Data Acquisition (Hao Zhong)
  2. DCO Jupyter Notebook login page (registration required) 
    All slides in this talk are directly transformed from a Jupyter notebook file using the notebook extension called RISE
  3. Beautiful Soup package offers support for scraping information from webpages 
  4. Introduction to the Pandas plot function
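
As a rough illustration of the kind of tidying the abstract above describes, the sketch below reshapes a small "wide" table into a long one with pandas. It is not taken from the webinar: the column names, the sample values, and the helper `tidy_measurements` are all made up for illustration, and "tidy" is assumed here to mean the common convention of one variable per column and one observation per row.

```python
import pandas as pd

# Hypothetical "wide" table: one row per sample, one column per oxide (wt%).
# All names and values are invented for illustration only.
wide = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3"],
    "SiO2": [49.2, 51.0, None],   # None marks a missing measurement
    "MgO": [7.8, 6.9, 8.1],
})


def tidy_measurements(df):
    """Reshape to one row per (sample, oxide) observation and drop gaps."""
    long = df.melt(id_vars="sample_id", var_name="oxide", value_name="wt_percent")
    long = long.dropna(subset=["wt_percent"])
    return long.sort_values(["sample_id", "oxide"]).reset_index(drop=True)


tidy = tidy_measurements(wide)

# With one observation per row, grouped summaries become one-liners.
mean_by_oxide = tidy.groupby("oxide")["wt_percent"].mean()
print(tidy)
print(mean_by_oxide)
```

Once the data are in this long form, the same table can be passed to the pandas plot function (which requires matplotlib) without further reshaping.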


Why and How To Cite Data

Mark Parsons, Rensselaer Polytechnic Institute (USA)


Data, software, and other research artifacts are increasingly recognized as first-class scientific objects, crucial both to supporting the arguments in an article and to general transparency and reproducibility. This webinar reviews the latest data citation technologies and how they are being implemented in the DCO Data Portal. It should be of interest to repository managers, publishers, and researchers sharing data.


Data Science for Geosciences: Data Acquisition

Hao Zhong, Rensselaer Polytechnic Institute (USA)


Hao Zhong discusses general data acquisition in geoscience, featuring recent examples of legacy data rescue and management by the Data Science Team.


Wikipedia in Higher Education

Samantha Weald, Wikipedia Education (USA)

During this webinar, we discuss content gaps on Wikipedia, and encourage scientists to help close these gaps, making information more accessible, accurate, and comprehensible to the public.


A Blueprint for a Box Model

Louise Kellogg, University of California Davis (USA)

Modeling and visualization expert Louise Kellogg presents a blueprint and virtual “construction manual” for integrating different types of data into a box model.


Sample Registration Made Easy

Kerstin Lehnert and Megan Carter, Lamont-Doherty Earth Observatory (USA)

Kerstin Lehnert and Megan Carter of the Lamont-Doherty Earth Observatory at Columbia University share the process of sample registration, guiding you in the first step of what is envisioned as a global online catalog of all physical samples collected in the Earth Sciences.


Studying Deep Earth Reactive Transport Using ENKI: A Modeling Primer

Mark Ghiorso, OFM Research Inc (USA)

In this DCO Webinar you will learn how to use Enabling Knowledge Integration (ENKI) tools for modeling deep Earth fluids, chemical reactions, and transport.


Doing Data Science in Jupyter Notebooks - Volcanos and Visualizations

Feifei Pan, Rensselaer Polytechnic Institute (USA)

Feifei Pan demonstrates how to use Jupyter Notebooks as a visualization tool using data compiled by the Global Volcanism Program.


Visual Tools for Big Data Network Analysis

Shaunna Morrison, Carnegie Institution for Science (USA)
Ahmed Eleish, Rensselaer Polytechnic Institute (USA)

Discover how to turn large data sets into dynamic visualizations that show network connections.



Top Image: Depiction of Ninurta (used as a proxy for Enki), by Austen Henry Layard from 'Monuments of Nineveh, Second Series' plate 19/83, London, J. Murray, 1853.
