Environmental Data Science Lunch - 2022

Stuart Lynn (Research Director and Senior Principal Software Engineer at the Center for Spatial Data Science) and Dylan Halpern (Senior Software Engineer, US Covid Atlas at the Center for Spatial Data Science)
February 17, 2022

Geospatial data and analysis are more central than ever to data science, research, and policy analyses. This is especially evident in the explosion of tools, both open source and proprietary, that have been developed over the past 5 years to help users manage and gather insights from their data. However, many of these powerful tools, like geopandas (analysis and modeling) and deck.gl (visualization), are technically inaccessible to analysts and researchers without the time or skills for advanced coding. A number of commercial ventures (Carto, ESRI, etc.) attempt to overcome this limitation by bringing these tools together as part of polished platforms driven by graphical user interfaces. While these platforms offer ease of use, they raise concerns about longevity, data ownership, and academic support.

Matico is a new free and open-source platform we are developing at UChicago's Center for Spatial Data Science that seeks to fill the gap between open but technically focused tools and commercial platforms. Consisting of a suite of interoperable components, Matico enables organizations and individuals to manage and visualize their geospatial data while maintaining their own infrastructure with little effort. A backend server allows users to easily load, clean, analyze, and distribute data through APIs, queries, and in-browser data editing tools, while a powerful app builder allows users to develop their own rich applications that target diverse audiences.

This talk will demonstrate the current development build of Matico, present our future roadmap, and walk through relevant use cases. Matico is now and will forever be open through a permissive MIT open-source license. Learn more at https://matico.app/

We will be running the seminar in a hybrid format. Please fill out this form if you are planning to join in person – lunch will be provided.

Ryan Kellogg (Professor and Deputy Dean for Academic Programs, Harris School of Public Policy, UChicago)
February 24, 2022

We examine the extent to which traditional economic efficiency arguments that favor carbon pricing over a clean energy standard or subsidies for zero-emission electricity generation are weakened when one considers: (1) policies that are sufficiently stringent that nearly complete decarbonization is achieved; and (2) the fact that there already exist large markups of retail electricity prices over private marginal cost. The project is joint with Severin Borenstein at UC Berkeley.

Tiffany Shaw (Associate Professor, Geophysical Sciences Department, UChicago)
March 3, 2022

Understanding and preparing for the impacts of anthropogenic climate change are some of the most urgent scientific challenges of our time. Remarkable progress has been made in the last half century in our ability to simulate Earth’s climate using state-of-the-art models and project future climate changes under specific emission scenarios. However, simulation does not equal understanding. Confidence in climate model projections requires an understanding of the mechanisms underlying the modeled responses, which helps to ensure the responses are not the result of structural and parameter uncertainty. In this talk I will discuss progress toward closing the gap between simulating and understanding climate change. I will present examples from the pioneering work of Nobel Prize-winning scientist Dr. Syukuro Manabe and from our more recent work on the response of the atmospheric circulation to climate change.

Ian Foster (Arthur Holly Compton Distinguished Service Professor, Department of Computer Science / Division Director, Distinguished Fellow, and Senior Scientist, Argonne)
March 10, 2022

I review several areas in which big data, big computing, and AI are being applied to problems in environmental data science, with a particular focus on activities at Argonne National Laboratory that may present opportunities for collaboration.

Tim Wootton (Professor, Department of Ecology and Evolution, UChicago)
March 17, 2022

Understanding and predicting responses to environmental impacts in ecosystems is challenging, requiring an integration of modeling and data to account for multiple concurrently acting processes. While ecological modeling typically describes ecological dynamics, empirical data collection traditionally takes a static snapshot approach. To explore the uses of ecological dynamics data, I have spent the past 29 years at the U of C developing a comprehensive time series of the rocky intertidal community of Tatoosh Island, Washington, a model ecosystem for ecology that allows for experimental manipulation and direct observation of interactions among species. I will provide an overview of this work, illustrating how I have applied and tested different ecological modeling approaches in this system. The work has yielded insights into the effects of extinction, ecological disturbance, and landscape pattern formation, and into the implications of global change, especially ocean acidification, for ecological communities. I will also discuss some future directions to take advantage of ecological dynamics data.

Nico Marchio (Research Data Scientist, Mansueto Institute)
April 7, 2022

According to the UN, approximately one billion people currently live in 'slums' or informal settlements, which typically lack access to public services such as clean water, toilets, sanitation, drainage, waste collection, and safe shelter. At the Mansueto Institute, researchers have developed computational methods to automate the detection of these critically under-serviced communities and estimate their population down to the street block. In this presentation, Nico Marchio reviews these efforts in the context of Sub-Saharan Africa and how geospatial computing techniques and large-scale GIS vector data were used to reveal new insights into the scale of informality. Nico is a staff Data Scientist at the Mansueto Institute for Urban Innovation.

Rahul Subramanian (Data Science Fellow, NIH)
April 14, 2022

I will be discussing my experience during the first year of a data science fellowship at NIAID (National Institute of Allergy and Infectious Diseases). I will touch on some of the broader goals of the fellowship, the kinds of projects that fellows work on, what the day-to-day experience is like, as well as some of the different divisions/roles/offices in NIAID and how they use data science.

Maureen Coleman (Associate Professor, Geophysical Sciences Department, UChicago)
April 28, 2022

The Laurentian Great Lakes hold 20% of Earth's surface freshwater and provide essential ecosystem services. As an interconnected waterway that spans strong environmental gradients, the Great Lakes also represent a unique natural laboratory for understanding how physical, chemical, and biological forces interact to shape microbial communities and biogeochemistry. In this talk, I will explore the drivers of microbial diversity and activity across the Great Lakes, using data collected as part of an ongoing multi-year time series.

Becca Willett (Professor of Statistics and Computer Science & Director of AI at the Data Science Institute, UChicago)
May 5, 2022

In this talk, I will describe two recent efforts aimed at using machine learning to improve the use of physical models in climate forecasting. First, we will consider estimating the parameters of a physics-based climate simulator when the computational cost of running the simulator is very high. A novel framework for training an emulator of the simulator yields substantial computational gains over conventional methods as well as more robustness than naïve learned models. Second, we will consider learning to use ensembles of simulator outputs to perform subseasonal forecasting, yielding substantial improvements in accuracy relative to standard methods based on ensemble means or quantiles.

John Bates (Curator and Section Head, Life Sciences, Field Museum)
May 12, 2022

John Bates is curator of birds at the Field Museum and a member of UChicago's Committee on Evolutionary Biology. His talk will cover ongoing research into avian evolution and biogeography with a focus on climate, and will also address national and international efforts to harness biodiversity data for this research, along with the challenges they face.

Daniel Holz (Professor, Departments of Physics / Astronomy & Astrophysics, UChicago)
May 19, 2022

Gravitational waves are ripples in the fabric of spacetime. For the past few years we have been listening to gravitational waves from the collisions of black holes. These are teaching us a lot about the universe, including providing clues as to how the first stars were born, providing insight into where all the gold and platinum in the universe comes from, and helping us to measure the age of the universe.

Greg Dwyer (Professor, Ecology & Evolution Department)
May 26, 2022

Insects that defoliate forest trees can cause widespread damage that costs millions of dollars annually. Insect defoliation would be far worse if not for fatal pathogens that decimate insect populations, but knowing when pathogens can be relied on requires mathematical models of pathogen spread, which in turn must be parameterized using high-performance statistical computing. In this seminar I will explain how we use our knowledge of pathogen ecology to project the effects of pathogens on insects.

Jess Kunke (PhD student, Statistics Department, University of Washington)
June 2, 2022

Correlation does not imply causation, and we have been taught to be wary of statistical analyses that attempt to infer causality, yet this is often a primary research objective. The field of causal inference works to address this gap by studying questions such as the following: How much can we tell about the causal relationships in our system of interest from observational data alone? How can we combine observations and domain knowledge to further constrain the causal graph? If we have the ability to collect data on natural experiments or conduct experiments ourselves, which possible experiments will most help us constrain these causal relationships? I will share what I have been learning about causal inference theory, methods, misconceptions, and applications. I am particularly interested in exploring the intersections between causal inference and systems science or dynamical models. I can also briefly discuss two projects I have been working on in other areas, namely network survey methods/population size estimation and environmental metagenomics model development.