Environmental Data Science Lunch - Winter & Spring 2019
Liz Moyer, Julia Koschinsky & Marynia Kolak
February 7, 2019
Noon - 1:30pm, Searle 240A
In the first of this series, Liz Moyer (RDCEP) and Julia Koschinsky & Marynia Kolak (Center for Spatial Data Science) will introduce their programs and their core research themes and show work currently underway by affiliated students and postdocs. We hope to connect young researchers with relevant interests. Liz will also give a brief update on the research traineeship program, upcoming classes and training options, and how to get involved.
Jim Franke, Geophysical Science, UChicago
February 14, 2019
This talk will introduce the different methods for predicting climate-related damages to agriculture, discuss the challenges and limitations of process-based crop models versus statistical models, and present results from a research project targeting a better understanding of yield sensitivity to the major dynamic drivers: temperature, water, nitrogen, and CO2. We will also discuss the role of synthetic nitrogen fertilizers in food production, investigate the interaction between nitrogen and growing season temperature, and pose some potential implications for global food security in a warmer climate.
Eamon Duede, Knowledge Lab, UChicago
February 21, 2019
Macroeconomic theories of growth and wealth distribution have an outsized influence on national and international social and economic policies. Yet, due to a relative lack of reliable, system-wide data, many such theories remain, at best, unvalidated and, at worst, misleading. In this paper, we introduce a novel economic observatory and framework enabling high-resolution comparisons and assessments of the distributional impact of economic development through the remote sensing of planet Earth’s surface. Striking visual and empirical validation is observed for a broad, global macroeconomic σ-convergence in the period immediately following the end of the Cold War. What is more, we observe strong empirical evidence that the mechanisms driving σ-convergence failed immediately after the financial crisis and the start of the Great Recession. Nevertheless, analysis of both cross-country and cross-state samples indicates that, globally, disproportionately high growth levels and excessively high decay levels have become rarer over time. We also see that urban areas, especially those concentrated within short distances of major capital cities, were more likely than rural or suburban areas to see relatively high growth in the aftermath of the financial crisis. Observed changes in growth polarity can plausibly be attributed to post-crisis government intervention and subsidy policies introduced around the world. Overall, the data and techniques we present here make economic evidence for the rise of China, the decline of US manufacturing, the euro crisis, the Arab Spring, and various recent Middle East conflicts visually evident for the first time.
Read the paper here
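For readers unfamiliar with the term, σ-convergence usually refers to a decline over time in the cross-sectional dispersion of (log) income or output across economies. The following is a minimal sketch of that check, not the method used in the paper: the function name `sigma_series` and the numbers in `levels` are hypothetical, standing in for any per-region activity measure observed over several periods.

```python
import numpy as np

def sigma_series(levels):
    """Cross-sectional standard deviation of log levels, one value per period.

    `levels` has one row per period and one column per region.
    """
    logs = np.log(np.asarray(levels, dtype=float))
    return logs.std(axis=1, ddof=1)

# Hypothetical, made-up activity levels for three regions over three periods.
levels = np.array([
    [10.0, 40.0, 160.0],
    [15.0, 45.0, 135.0],
    [20.0, 50.0, 125.0],
])

sigma = sigma_series(levels)
# Sigma-convergence in this toy sample: dispersion shrinks in every period.
converging = bool(np.all(np.diff(sigma) < 0))
print(sigma, converging)
```

In this toy sample the dispersion falls each period, which is the pattern the abstract describes for the post-Cold War years; a rise in `sigma` would indicate divergence.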
Suzanne Pierce, Texas Advanced Computing Center and Environmental Science Institute
February 22, 2019
Research on Intelligent Systems and Geosciences (IS-GEO) is an emerging area of interest for advanced research that crosses topical expertise from the Geosciences with Computer Sciences. Advanced computing and artificial intelligence are paving new pathways for geoscientific discoveries. Data volume and computational processing demand within the geosciences have grown exponentially over the last few decades. Leveraging knowledge from across computational and geoscience fields is achieving new insight at a rapid pace. AI and knowledge-based reasoning promise to accelerate discovery in the next century as adaptive sensing, machine learning, deep learning, knowledge graphing and other techniques are increasingly applied across the geosciences. Despite the demonstrated power of these techniques, their application to the geosciences is limited to date. The EarthCube NSF-sponsored Research Coordination Network for Intelligent Systems in Geosciences, or IS-GEO, is an emerging community of researchers with active working groups that support methodology and technique sharing, encourage cross-disciplinary research collaborations, and develop data benchmarks to accelerate progress. IS-GEO collaborations are opening new approaches and instigating transitions in the most basic knowledge of geoscience processes through improvements in the process, tools, and approaches to understanding the Earth. This presentation will introduce ways to engage with the community (www.is-geo.org), highlight advances from IS-GEO research and discuss how future applications and theoretical advances at the intersection between Geosciences and Intelligent Systems will enable novel forms of reasoning and learning about Earth.
Spencer Carran, Ecology and Evolution, UChicago
February 28, 2019
Disease dynamics occur on a heterogeneous background of susceptible individuals, leading naturally to an unequal distribution of risk and disease burden, with some individuals and subpopulations at higher risk of disease. Public health management efforts seek to reduce risk across a population, but failing to account for underlying variation in background risk can lead to interventions that allocate resources inefficiently in terms of relative reductions in risk. In order to develop equitable health policy that works to reduce disparities in disease burden, we will consider how public health programs may allocate vaccination resources to the geographic regions and age groups at greatest need.
Considering first the problem of outbreak response, we use geo-tagged case surveillance reports to generate high-resolution maps of remaining susceptible individuals and epidemic speed, allowing rapid (but approximate) identification of regions with greater or lesser need for additional vaccination. Accordingly, we propose a method for assigning distinct outbreak response policies across a spatial gradient. Second, we consider the case of prophylactic supplemental vaccination campaigns (i.e., before an outbreak has started). We develop a novel catalytic model that integrates a region's disease exposure history (as recorded through passive surveillance) and vaccination coverage (as reported by Demographic and Health Surveys) in a Bayesian framework to estimate the overall population profile of immunity, and compare these estimates against direct serological measures of immune status. This enables not only identification of regions with greater need for vaccine reinforcement, but also estimation of biases in standard disease surveillance systems, suggesting that traditional surveillance may benefit from supplementation with serological data.
Rebecca Willett, Statistics and Computer Science, UChicago
March 7, 2019
Sparse models for machine learning and statistics have received substantial attention over the past two decades. Model selection, or determining which features are the best explanatory variables, is critical to the interpretability of a learned model. Much of this work assumes that features are only mildly correlated. However, in modern applications ranging from functional MRI to seasonal climate forecasting, we observe highly correlated features that do not exhibit key properties (such as the restricted eigenvalue condition). In this talk, I will describe novel methods for robust sparse linear regression in these settings. Using side information about the strength of correlations among features, we form a graph with edge weights corresponding to pairwise correlations. This graph is used to define a graph total variation regularizer that promotes similar weights for highly correlated features. I will show how the graph structure encapsulated by this regularizer helps precondition correlated features to yield provably accurate estimates. The proposed approach outperforms several previous approaches for seasonal climate forecasting using a combination of observed and LENS sea surface temperatures and precipitation data.
This is joint work with Yuan Li, Ben Mark, Abby Stevens, and Garvesh Raskutti.
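To make the idea of a graph total variation regularizer concrete, here is a minimal sketch that only evaluates such a penalty for a given weight vector. It is an illustration under stated assumptions, not the estimator from the talk: the function name `graph_tv_penalty`, the correlation `threshold` used to decide which edges exist, and the sign handling for negatively correlated features are all hypothetical choices.

```python
import numpy as np

def graph_tv_penalty(w, corr, threshold=0.5):
    """Weighted graph total variation of a weight vector.

    Sums |c_jk| * |w_j - sign(c_jk) * w_k| over feature pairs whose
    absolute correlation is at least `threshold` (an assumed edge rule).
    """
    w = np.asarray(w, dtype=float)
    corr = np.asarray(corr, dtype=float)
    total = 0.0
    for j in range(w.size):
        for k in range(j + 1, w.size):
            c = corr[j, k]
            if abs(c) >= threshold:
                total += abs(c) * abs(w[j] - np.sign(c) * w[k])
    return total

# Hypothetical correlations: features 0 and 1 are highly correlated, feature 2 is not.
corr = np.array([
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

similar = graph_tv_penalty([1.0, 1.0, 0.0], corr)  # correlated pair shares weight
split = graph_tv_penalty([2.0, 0.0, 0.0], corr)    # all weight on one of the pair
print(similar, split)
```

Adding a term like this to a least-squares loss penalizes solutions that put very different weights on strongly correlated features, which is the behavior the abstract describes.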
Luis Bettencourt, Mansueto Institute
March 21, 2019
I will present a range of recent and ongoing projects that map various social aspects of cities and their spatial distribution. I will start with work that maps neighborhood effects in US Metropolitan Areas in terms of selection and shows how income and educational attainment distributions deviate locally from citywide statistics and how these deviations should be measured in terms of information. Second, I will present some results on a recent effort to map the spatial inequality of service delivery in cities of Brazil and South Africa. I will show an analysis of spatial inequality for a composite index of sustainable development and preview issues of expanding inequity even under scenarios where service delivery improves over time. Finally, I will present recent research on the spatial analysis of maps of slums (neighborhoods with incipient infrastructure) and how these analyses lead naturally to new types of urban planning tools that identify infrastructure deficits and suggest additional street plan expansions. I will discuss challenges and opportunities for taking this kind of analysis to scale, neighborhood by neighborhood, in entire developing cities. In all three cases, I will point to research ongoing at the Mansueto Institute for Urban Innovation and opportunities for discussions and collaborations.
Mihai Anitescu, Mathematics and Computer Science Division, Argonne National Laboratory
April 4, 2019
I will present several ongoing projects in space time statistics at Argonne, in the context of the “Multifaceted Mathematics for Rare, High Impact Events in Complex Energy and Environmental Systems” project, which I lead. I will discuss high resolution data assimilation of LIDAR data, Gaussian Process/Spectral Modeling of high-resolution space-time fields, hybrid measurement-numerical weather prediction models, and rare event computations for dynamical models. The work is joint with Ahmed Attia, Julie Bessac, Emil Constantinescu, Charlotte Haley, and Vishwas Rao from Argonne.
Amir Jina, Harris School of Public Policy, UChicago
April 11, 2019
Estimates of climate change damage are central to the design of climate policies. Here, we develop a flexible architecture for computing damages that integrates climate science, econometric analyses, and process models. We use this approach to construct spatially explicit, probabilistic, and empirically derived estimates of economic damage in the United States from climate change. The combined value of market and nonmarket damage across analyzed sectors—agriculture, crime, coastal storms, energy, human mortality, and labor—increases quadratically in global mean temperature, costing roughly 1.2% of gross domestic product per +1°C on average. Importantly, risk is distributed unequally across locations, generating a large transfer of value northward and westward that increases economic inequality. By the late 21st century, the poorest third of counties are projected to experience damages between 2 and 20% of county income (90% chance) under business-as-usual emissions (Representative Concentration Pathway 8.5).
Dave Jablonski, Geophysical Sciences, UChicago
Stewart Edie, Geophysical Sciences, UChicago
April 19, 2019
Understanding the drivers of large-scale patterns in biodiversity requires commensurate datasets and analytical approaches that are comparative and hierarchical. Our work on the drivers of marine biodiversity gradients is rooted in the compilation, maintenance, and analysis of a database of marine bivalves spanning ~6,000 species and 80,000 geographic occurrences. We examine temporal and spatial drivers of the most prominent global feature of biodiversity -- the latitudinal and longitudinal gradient in species richness, ecological variety, and morphological disparity -- by coupling this spatial dataset with temporal diversity dynamics from the fossil record and with a set of key abiotic factors such as sea surface temperature.
James Osborne, Oriental Institute
April 25, 2019
Archaeology of the ancient Middle East took a major methodological leap in the late twentieth century with the introduction of satellite imagery into its repertoire of tools, driven largely by the University of Chicago’s Oriental Institute and its Center for Ancient Middle Eastern Landscapes (CAMEL). Although extensive regional surveys of archaeological sites across entire landscapes as well as intensive surveys of single sites had already been taking place for some time, the introduction of satellite imagery, especially declassified images taken by the CORONA mission of the 1960s and 1970s, had a transformative impact on the types of results that could be obtained. I show the two types of archaeological survey that occur in Near Eastern archaeology, one an extensive multiperiod landscape survey in the Kurdistan region of northern Iraq, and one an intensive survey of a single Iron Age city in southern Turkey. Both of these case studies provide a wealth of information about ancient urbanism that would take generations to obtain through traditional excavation, but both have their own methodological drawbacks. I close by presenting two survey projects in central Turkey that will (permits pending) begin this summer.
Rayid Ghani, Center for Data Science and Public Policy and Computer Science
May 2, 2019
Anthony Lauricella, Center for Ancient Middle Eastern Landscapes & the Oriental Institute
May 16, 2019
This talk will give a brief introduction to the work of the Center for Ancient Middle Eastern Landscapes (CAMEL) at the Oriental Institute. The CAMEL lab has been a leader in the field of landscape archaeology since its founding by Tony Wilkinson in the 1990s. I will provide some background on past research into Middle Eastern landscapes in both our lab and the Oriental Institute, and then turn to a description of recent landscape-scale projects using both historical and contemporary satellite imagery. This work, part of the Afghan Heritage Mapping Partnership, involves a systematic attempt to remotely survey important archaeological landscapes in Afghanistan for previously undocumented archaeological sites in otherwise inaccessible areas.
Daniel Sanz-Alonso, Department of Statistics
May 23, 2019
Tamma Carleton, Energy Policy Institute
May 30, 2019
Combining satellite imagery with machine learning presents an opportunity for assembling globally comprehensive observations of many variables simultaneously. However, current approaches require custom systems, expert knowledge, access to imagery, and extensive computational resources. We develop a generalized system that enables researchers with basic statistical training to use satellite imagery and machine learning to study any variable visible from space at low computational cost. We demonstrate the generalizability of our system by constructing high resolution estimates for seven domains across the globe; we find comparable performance to a state-of-the-art convolutional neural network at a fraction of the cost.
Jessica Kunke, Department of Statistics
June 6, 2019
Bootstrapping is not a single method but a variety of methods that can be used to help estimate uncertainties and test hypotheses in a wide array of contexts. We’ll introduce the basic principles of bootstrapping and the types of questions or goals that bootstrapping can help address. We’ll also go through some concrete examples and discuss the theory and assumptions behind the methods we examine.
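As one concrete instance of the idea, the sketch below computes a percentile bootstrap confidence interval for a sample mean. It is only one of the many bootstrap variants the abstract alludes to and is not taken from the talk: the function name `bootstrap_ci`, the number of resamples, and the simulated `data` are hypothetical.

```python
import numpy as np

def bootstrap_ci(sample, statistic=np.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic of a 1-D sample."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Each row of `idx` selects n observations with replacement.
    idx = rng.integers(0, n, size=(n_resamples, n))
    replicates = np.array([statistic(sample[row]) for row in idx])
    lo, hi = np.quantile(replicates, [alpha / 2, 1 - alpha / 2])
    return lo, hi, replicates

# Simulated data for illustration only.
data = np.random.default_rng(1).normal(loc=10.0, scale=2.0, size=200)
lo, hi, reps = bootstrap_ci(data)
print(f"mean = {data.mean():.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The percentile interval assumes the observations are independent and identically distributed; dependent data such as time series or spatial fields generally call for block or other structured resampling schemes.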
Kevin Schwarzwald, Energy Policy Institute
June 13, 2019
Climate data interacts with the uncertainty in future societal impacts of climate change in complex ways. I'll touch on how climate data is used and misused to project future economic damages from climate change, and talk about how climate uncertainties affect uncertainties in impacts. I'll focus specifically on how changes in climate variability are projected into the future and introduce some of our upcoming research, which quantifies the impact of variability changes on a well-known economic damage function relating temperature and mortality.

