CSDS Study Group Presentations: 2018

Winter Quarter 2018

Alessandro Araldi
January 9, 2018

Read the paper here

January 26, 2018

Stuart will highlight some of our recent efforts to build a spatial data science community (see the recent conference we co-hosted), including project examples at the intersection of spatial analytics and machine learning. 

January 16, 2018

Robert Manduca, Sociology and Social Policy PhD student at Harvard

Marynia Kolak
February 6, 2018

Social determinants of health relate to the dimensions where people work, live, and play, and are thought to account for the majority of an individual’s health outcomes. While chronic health outcomes are strongly correlated with social determinants (like education or food accessibility), causal links remain unclear because of challenges in testability and complicated causal pathways. At the same time, physical determinants of health like pollution are known to impact health but likewise interact with social determinants. Social determinants impact both exposure and susceptibility, further complicating testability and underlying assumptions.

To address these challenges, a research agenda is proposed to quantify the marginal effects of the social and physical, or structural, determinants of health. I first propose to abstract structural determinants for a case-study urban environment, the City of Chicago, that represents a complex fabric of multiple components. Social Determinants of Health, Physical Environment Components, and Urban Fabric Typologies will be refined in a multi-phase mixed methods approach including factor analysis, regionalization clustering, remote sensing, Bayesian networks, and computational processing and storage techniques. Physical determinants will include, at a minimum, the development of a particle pollution (PM2.5) index using MODIS and ground sensor data (including the Array of Things) and a lead (Pb) soil contamination index using historical geological data and hyperspectral imagery. The urban fabric typology will be developed using building footprint data, building heights extracted from LiDAR data, and Bayesian networks.

Then, a spatially-explicit quasi-experimental approach will test conditional correlations using a matching framework to distill marginal effects of each component across different health conditions. Two options are being considered to evaluate change over time for specific health outcomes, both incorporating clinical data from research partners.

Yen-Tyng Chen
February 13, 2018

Young black men who have sex with men (YBMSM) are the only group where HIV incidence has increased in the United States. Pre-Exposure Prophylaxis (PrEP) is an effective medication for preventing HIV infection in at-risk groups, including YBMSM. The higher rates of infection among YBMSM are due to a combination of increased exposure to HIV risks and structural barriers that limit HIV prevention and care. However, no research has examined how social contexts influence PrEP awareness. This study aims to examine the relationship of place and network with PrEP awareness among YBMSM.

Haozhi Pan
February 27, 2018

Urban land use structure is shaped by a preference for production externalities by firms and for shorter commuting time and proximity to amenities by residents. Drawing on previous literature on urban spatial equilibrium and complex urban systems, this research explores how the spatio-temporal evolution of land use structure in Chicago (down to the 30x30-meter resolution) is shaped by connectivity to population and job centers, amenities, and transportation infrastructure. We further examine the land use structure in Chicago by differentiating residential land use by household characteristics and commercial land use by economic sectors. For households, we analyze 8-year address changes to find mobility patterns of families with different income levels in Chicago. Also, the spatial allocation of various economic sectors is modeled to complement long-run forecasts created with the Chicago Regional Economic Input-Output Model. We find strong spatio-temporal heterogeneity and departure from previous literature on how firms and residents choose to locate and relocate in Chicago.

March 6, 2018

Marynia Kolak, Center for Spatial Data Science, University of Chicago

Northwestern patient data and AHA hospital data

Jamie Saxon, Center for Spatial Data Science, University of Chicago

AMA physicians data (national)

Liz Tung, UChicago Medicine

Patient data from EHR or survey

Puneet Chehal, Assistant Professor, Emory University

Medicare data



Mike Powe & Carson Hartmann
March 27, 2018

Over the past 9 years, the Preservation Green Lab has applied a variety of rigorous research methodologies, including spatial data analysis, to the study of a central question: What is the value of old buildings? Past projects have demonstrated the environmental value of building reuse and the connections between older, smaller buildings and economic vitality. Currently, Green Lab researchers are launching a study focused on the influence of the built environment on affordability and risk of displacement in historically African American neighborhoods and in commercial corridors with concentrations of black-owned businesses. How do building characteristics and changes to the built environment affect affordability and risk of displacement? Do all neighborhoods follow similar patterns over time? This presentation will include a quick recap of past Preservation Green Lab research and a conversation about potential best paths forward for the PGL’s new project.

Spring Quarter 2018

Olivier Scaillet
April 10, 2018

We develop new higher-order asymptotic techniques for the Gaussian maximum likelihood estimator (henceforth, MLE) of the parameters in a spatial panel data model, with fixed effects, time-varying covariates, and spatially correlated errors. To improve on the accuracy of the extant asymptotics, we introduce a new saddlepoint density approximation, which features relative error of order O(1/m) for m = n(T − 1), with n being the cross-sectional dimension and T the time-series dimension. The main theoretical tool is the tilted-Edgeworth technique, which yields a density approximation that is always non-negative, does not need resampling, and is accurate in the tails. We provide an algorithm to implement our saddlepoint approximation and we illustrate the good performance of our method via numerical examples. Monte Carlo experiments show that, for the spatial panel data model with fixed effects and T = 2, the saddlepoint approximation yields accuracy improvements over the routinely applied first-order asymptotics and Edgeworth expansions, in small to moderate sample sizes, while preserving analytical tractability.

Read the paper here.

Nick Mader
April 24, 2018

The perpetual challenge for city planners is to maximize accessibility of programming by populations with diverse needs, given constraints of resources and complex logistics. For example, in the early childcare service sector, city planners have scarce funding resources (e.g. through Head Start) to distribute among providers of different types, locations, and histories. For any given resource allocation, it is hard for planners to assess whether they could increase enrollment, especially among priority populations, by shifting resources to providers in different areas, since it is hard to gauge the quantity, types, and distribution of program demand throughout the city, and how it might re-equilibrate to a new supply scheme.

This presentation will describe the motivation for the CANOPY project, and lay out the technical details around each component of a tool for planners to use as a guide in making allocation decisions for human services. This includes the use of discrete choice methods, drawn from economics, to infer from historical data the parameters governing the diversity of household tastes for uptake of program services.

It also includes contact with city decision-makers to articulate (1) the components of an objective function and (2) constraints related to limited resources and logistics, and optimization methods drawn from operations research and engineering to generate solution recommendations. Potential applications to early childhood funding allocation and to the spatial/type distribution of youth afterschool programs will be discussed.
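As a rough illustration of the discrete choice idea mentioned above, the following sketch computes multinomial logit choice probabilities and the expected enrollment they imply for a set of providers. It is not the CANOPY implementation: the utility specification (distance plus a quality score), the coefficient values, and all function names are hypothetical assumptions made for the example.

```python
import math

def choice_probabilities(utilities):
    """Multinomial logit: P(k) = exp(u_k) / sum_j exp(u_j)."""
    top = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def provider_utilities(distances_km, qualities, beta_distance, beta_quality):
    """Hypothetical linear utility of each provider for one household."""
    return [beta_distance * d + beta_quality * q
            for d, q in zip(distances_km, qualities)]

def expected_enrollment(households, qualities, beta_distance, beta_quality):
    """Sum choice probabilities over households to get expected counts."""
    totals = [0.0] * len(qualities)
    for distances_km in households:
        utils = provider_utilities(distances_km, qualities,
                                   beta_distance, beta_quality)
        for k, p in enumerate(choice_probabilities(utils)):
            totals[k] += p
    return totals

# Made-up data: three households, two providers of equal quality.
# Each inner list holds one household's distance (km) to each provider.
households = [[1.0, 4.0], [2.0, 2.0], [5.0, 1.0]]
qualities = [1.0, 1.0]

# Assumed taste parameters: households dislike distance, like quality.
enrollment = expected_enrollment(households, qualities,
                                 beta_distance=-0.5, beta_quality=1.0)
print(enrollment)
```

In practice the taste parameters would be estimated from historical enrollment data rather than assumed, and a planner could re-run the expected enrollment step under alternative supply schemes to compare allocations.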

Learn more about the CANOPY project.

May 1, 2018

Coro Chasco, Associate Professor in Applied Economics, Universidad Autónoma de Madrid

Angela Li
May 8, 2018

Angela Li's presentation is available here.

May 15, 2018
Alex C. Engler, Program Director, Computational Analysis and Public Policy

See the Observable Notebook from the presentation here

May 29, 2018
Meru Bhanot, PhD student in Economics, University of Chicago


Autumn Quarter 2018

Caglar Koylu
October 23, 2018

Richárd Farkas
October 30, 2018

A large body of the industrial organization literature demonstrates asymmetric cost pass-through in markets, which is related to whether the market has price-taker or price-maker firms. Many studies try to shed light on the relation between the cost pass-through behavior of firms and the level of competition, usually through some measure of concentration. However, research evidence suggests that market concentration is not a reliable proxy for the level of competition in every case. The present research tries to extend cost pass-through analysis with spatial methods to capture the linkages between how firms pass on costs and the level of competition in pricing.

Read the paper here.

Corey Tabit
November 6, 2018

Cardiovascular disease is the #1 killer of American adults. Current risk-prediction tools, which focus on traditional cardiovascular risk factors, are imperfect, tending to underestimate or overestimate risk in specific socio-demographic groups. These tools ignore social determinants of health completely. If healthcare providers had a better way to understand and quantify social determinants of cardiovascular health, and to measure their effects in real time, new interventions could be developed and targeted directly to the highest-risk patients during their periods of peak risk. This presentation will discuss the current state of cardiovascular risk prediction, summarize known social determinants of cardiovascular health, highlight several interventions with specific efficacy in patients of low socioeconomic status, and suggest ways in which spatio-temporal analytics could be used to develop more accurate risk-prediction tools.

Dr. Babak Mahdavi Ardestani
November 13, 2018

Social and urban systems and phenomena typically involve multidimensional factors and contain many interdependent constituents, which often interact nonlinearly within structures spanning several scales. It can be argued that every complex social or urban system is the result of a particular configuration of individuals, their dispositions, beliefs and behaviors, and environment. This cornerstone of methodological individualism can be operationalized using agent-based modeling to study the emerging behaviors of these complex systems. These ideas are discussed using two data-driven models (with spatial elements) of residential segregation and social care systems.

Vassilis Tselios
November 27, 2018

It is widely known that the spatial distribution of economic activities is highly clustered. Localities with relatively high (low) economic activity are, and remain, located close to other localities with relatively high (low) economic activity. Scholars usually use census and statistical datasets to measure economic activities and therefore to explore urban and regional clusters of economic activities, but these datasets suffer from some limitations, e.g. they are not available at a very disaggregated level (spatial limitation), they are not available on a monthly, weekly or daily basis (temporal limitation), and they do not include the informal economy (missing information). Images of nighttime lights acquired by the United States Air Force Defense Meteorological Satellite Program (DMSP) Operational Linescan System (OLS) sensor have been linked to economic activities. Taking into account the limitations of the census and statistical datasets on the one hand, and the easy access, low cost and completeness of spatial and temporal coverage of nighttime data on the other, we show that the nighttime data can provide reliable information on the spatial distribution of economic activities at any spatial and temporal level. Using satellite images of nighttime lights and by employing Exploratory Spatial Data Analysis (ESDA), a) we explore urban and regional clusters of economic activities at different spatial levels, and b) we investigate whether and to what extent these clusters change during and after a shock to an economy, such as a natural or technological disaster.
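One common ESDA statistic for the kind of clustering described above is global Moran's I. The sketch below computes it for a small grid of brightness values using binary rook-contiguity weights. The grids are made-up illustrations rather than DMSP-OLS data, the function names are hypothetical, and the abstract does not say which ESDA statistics the authors actually used.

```python
def rook_weights(rows, cols):
    """Binary rook-contiguity weights for a rows x cols grid (row-major)."""
    n = rows * cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w[i][rr * cols + cc] = 1.0
    return w

def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    s0 = sum(sum(row) for row in weights)   # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in z))

w = rook_weights(4, 4)

# Made-up 4x4 brightness grids (row-major): a bright cluster on the
# left half versus a checkerboard of alternating bright and dark cells.
clustered = [9, 9, 1, 1,
             9, 9, 1, 1,
             9, 9, 1, 1,
             9, 9, 1, 1]
checkerboard = [9, 1, 9, 1,
                1, 9, 1, 9,
                9, 1, 9, 1,
                1, 9, 1, 9]

i_clustered = morans_i(clustered, w)        # positive: similar values adjoin
i_checkerboard = morans_i(checkerboard, w)  # negative: unlike values adjoin
print(i_clustered, i_checkerboard)
```

Comparing the statistic for images taken before and after a shock is one simple way to ask whether clustering has changed, though inference would also need a permutation test or similar reference distribution.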

Read the paper here.