Environmental Data Science Lunch - Autumn 2019

October 8

Note: Fernando Pérez (UC-Berkeley, Project Jupyter) will give a talk at the CDAC Distinguished Speaker Series:

Scientific Open Source Software: meat and bits but not papers. Is it real work?

Steven Moen, Statistics, UChicago
October 10, 2019

Since the early 2000s, monthly temperature readings taken across the United States seem to be more tightly linked to each other than they were in prior decades. To analyze this trend more carefully, we have created a test statistic that allows for inference about whether distinct time series have a simultaneous change point, which for our purposes means a change in the first derivative of two different series at the same point in time. This statistic will allow us to assess whether two temperature time series are truly changing at the same time or whether the apparent coincidence can be explained by random chance. We will apply it to temperature data gathered from the U.S. Historical Climatology Network (USHCN) at 15 locations across the United States and compare our results to other methods for changepoint detection.
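As a rough illustration of the idea of a shared change in slope (this is not the speaker's test statistic, which the abstract does not specify), the sketch below fits two separate least-squares lines to each series and picks the split point that minimizes the total residual sum of squares, either per series or jointly across series. The function names and the `min_seg` parameter are hypothetical. Because the two lines are fit independently, a split can reflect a jump in level as well as a change in slope, and the sketch makes no attempt to quantify chance agreement.

```python
def line_rss(t, y):
    """Residual sum of squares of an ordinary least-squares line through (t, y)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))


def split_rss(y, k):
    """Total RSS when one line is fit to y[:k] and another to y[k:]."""
    t = list(range(len(y)))
    return line_rss(t[:k], y[:k]) + line_rss(t[k:], y[k:])


def best_slope_change(y, min_seg=5):
    """Split index with the smallest total RSS for a single series."""
    candidates = range(min_seg, len(y) - min_seg + 1)
    return min(candidates, key=lambda k: split_rss(y, k))


def shared_slope_change(series, min_seg=5):
    """Split index with the smallest RSS summed over equal-length series."""
    n = len(series[0])
    candidates = range(min_seg, n - min_seg + 1)
    return min(candidates, key=lambda k: sum(split_rss(y, k) for y in series))
```

Comparing `best_slope_change` for each station with `shared_slope_change` for a pair of stations gives a crude sense of whether a common change point is plausible; judging whether agreement exceeds chance would require a proper null distribution, for example from resampling.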

Mercedes Pascual, Dept. of Ecology and Evolution, UChicago
October 31, 2019

With examples from my work on malaria in highlands and cities, I illustrate how we can address the role of climate variability and climate change in nonlinear stochastic systems by combining process-based models with computational statistical inference methods for time series.

Rahul Subramanian, Dept. of Ecology and Evolution, UChicago
November 7, 2019

Rahul will discuss recent work predicting the re-emergence of dengue epidemics in Rio de Janeiro, Brazil. Using dengue re-emergence in Rio as a motivating scenario, he will also give a brief introduction to the particle filtering methods in the R package pomp, which provides plug-and-play methods and a software framework for fitting and simulating from partially observed Markov processes. Rahul will discuss the formulation and fitting of stochastic mathematical models using this framework, including the incorporation of process and measurement noise. As time permits, he will provide a broad overview of the Sequential Monte Carlo algorithms used to estimate likelihoods and the iterated filtering algorithms used to estimate parameter values. While the examples shown will have an epidemiological focus, anyone who models stochastic processes characterized by nonlinear dynamics, complex likelihood functions, or substantial measurement error may find the methods relevant for their discipline.
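For readers unfamiliar with Sequential Monte Carlo, the following is a minimal Python sketch of a bootstrap particle filter that estimates a log-likelihood. It does not use pomp or reproduce its API. The toy model (a latent AR(1) process observed with Gaussian measurement noise), the function names, and all parameter names are assumptions chosen only for illustration; the dengue model from the talk is not shown here.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def simulate(n, phi, proc_sd, meas_sd, seed=1):
    """Simulate n noisy observations of a latent AR(1) process (toy model)."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, proc_sd)
    obs = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, proc_sd)   # process noise
        obs.append(x + rng.gauss(0.0, meas_sd))  # measurement noise
    return obs


def particle_filter_loglik(obs, phi, proc_sd, meas_sd, n_particles=500, seed=0):
    """Bootstrap particle filter estimate of the log-likelihood of obs."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, proc_sd) for _ in range(n_particles)]
    loglik = 0.0
    for y in obs:
        # Propagate each particle through the process model.
        particles = [phi * x + rng.gauss(0.0, proc_sd) for x in particles]
        # Weight particles by the measurement density of the observation.
        logw = [normal_logpdf(y, x, meas_sd) for x in particles]
        max_logw = max(logw)
        weights = [math.exp(lw - max_logw) for lw in logw]
        # The mean weight estimates the conditional likelihood of y.
        loglik += max_logw + math.log(sum(weights) / n_particles)
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik
```

Evaluating `particle_filter_loglik` over a grid of `phi` values for data produced by `simulate` gives a noisy likelihood profile. Iterated filtering builds on the same propagate, weight, and resample loop but also perturbs the parameters, which this sketch does not attempt.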

Stewart Edie, Geophysical Sciences, UChicago
November 14, 2019

An organism’s external phenotype is its primary interface with the ambient environment. Consequently, extrinsic pressures from a mixture of biotic and abiotic factors are expected to impose strong selection on that phenotype, but intrinsic factors such as evolutionary history can restrict access to morphologies suited for certain ecologies and life habits. The exoskeletons of the Bivalvia (their shells) are a strong model system in which to quantify this evolutionary tradeoff. We are developing a high-throughput pipeline to capture the 3D shape of these shells, applying deep learning to quantify hard-to-characterize aspects of this group's morphology: surface features ranging from spikes to irregular flanges and periodic frills. Distilling these shapes into lower-dimensional features allows us to more readily interrogate the mixed effects of environment and evolutionary history across an entire major animal clade.

Robert Jackson, Argonne National Lab, Environmental Computing, Sensing, Data, and GIS Department
November 21, 2019

The analysis of multi-terabyte datasets is extremely important for climate science given the vast temporal and spatial scales that climate data exhibit. This talk shows how distributed data science toolkits such as dask and Ophidia, combined with multidecadal research weather radar data, can provide insights into long-term precipitation trends. In addition, I will demonstrate how applying deep learning toolkits such as TensorFlow to such datasets can provide new methodologies for short-term precipitation forecasting.

Elise Jennings, Argonne National Laboratory
December 5, 2019

From classifying galaxies and finding gravitational waves, to 3D reconstructions of connections in the brain, to discovering new materials or new particles in high energy physics colliders, collaborations between humans and artificial intelligence are transforming the way science is done and accelerating the pace of scientific progress and discovery. Modern predictive models and algorithms require massive datasets and high-performance computing platforms. We will discuss some key topics in scientific machine learning, such as robustness, interpretability of the black box, reproducibility, and integrating domain knowledge into machine learning. This presentation will also showcase some of the exciting data science research at the Argonne Leadership Computing Facility.