CSDS Study Group Presentations: 2021
October 26, 2021
There is a recent surge in research focused on urban transformations via empirical analysis of neighborhood sequences. Alignment-based sequence analysis methods originated in biology and computer science, where they were used for matching DNA sequences and analyzing strings, and were later adopted in sociology for studying life courses. They have recently seen many applications in urban neighborhood change research. However, it is unclear whether these methods are suitable for this task or to what extent they are robust in terms of producing consistent and converging neighborhood sequence typologies. The presentation will focus on a comparative analysis of four sequence analysis methods. These methods are applied to the same data set: the 50 largest Metropolitan Statistical Areas (MSAs) of the United States from 1970 to 2010. The methods are found to produce different neighborhood sequence typologies, and their behavior varies across MSAs, which prevents meaningful comparisons across similar studies. One method leverages the socioeconomic similarities of neighborhoods and is suggested as a building block for designing a meaningful sequence analysis method for neighborhood change research.
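As a rough illustration of what an alignment-based method computes, the sketch below scores the dissimilarity of two neighborhood sequences with a generic edit distance. It is not one of the four methods compared in the talk. The neighborhood type labels ("A", "B"), the five decennial observations, and the insertion/deletion and substitution costs are all assumptions made for the example.

```python
from typing import Sequence


def alignment_distance(a: Sequence[str], b: Sequence[str],
                       indel: float = 1.0, sub: float = 2.0) -> float:
    """Edit distance between two sequences of neighborhood types.

    indel and sub are illustrative costs (assumptions), not values
    taken from the presentation.
    """
    n, m = len(a), len(b)
    # prev[j] holds the cost of aligning the first i-1 items of a
    # with the first j items of b.
    prev = [j * indel for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel] + [0.0] * m
        for j in range(1, m + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub
            curr[j] = min(prev[j] + indel,      # delete from a
                          curr[j - 1] + indel,  # insert into a
                          prev[j - 1] + cost)   # match or substitute
        prev = curr
    return prev[m]


# Hypothetical sequences: one neighborhood type per census year, 1970-2010.
tract_1 = "AABBB"
tract_2 = "ABBBB"
distance = alignment_distance(tract_1, tract_2)
```

A typology is usually obtained by clustering the matrix of such pairwise distances, so the choice of costs can change which neighborhoods end up grouped together.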
November 16, 2021
Public health researchers increasingly rely on geospatial data from diverse sources, including volunteered data, “found” data, and real-time data from sensor systems. Characterized by the 4 V’s, volume, variety, velocity, and veracity, the big data revolution has spurred significant advances in spatial and spatiotemporal analysis methodologies; however, less attention has been paid to issues of data quality, validity, and bias. Drawing on examples from research on COVID-19 and urban pests, I highlight the importance of “placing” geospatial health data – thinking critically about how social processes and technologies selectively structure quantitative health-related data and in turn affect our understandings of contextual and neighborhood effects on health and well-being.
November 17, 2021
I briefly review the state-of-the-art spatial regression discontinuity design (RDD) framework, which is suited to isolating causal effects when there are sharp spatial breaks. I then propose several improvements. First, I illustrate that some commonly used specifications are prone to type-I errors, especially when there are spatial trends in the data. Second, I propose a way to estimate heterogeneous treatment effects along the RD cutoff. Third, I introduce randomization inference to the spatial RD framework by creating a set of functions that randomly shift borders. These tools might be interesting for other identification strategies that rely on the shift of boundaries. A companion R package called "SpatialRDD" includes all the tools necessary to carry out spatial RD estimation, including the proposed improvements.
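The idea of randomization inference with shifted borders can be sketched in a simplified one-dimensional setting. The example below does not use the SpatialRDD package or its API. The function names, the synthetic data, the bandwidth, and the simple difference-in-means estimator are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def local_diff(xs, ys, border, bandwidth):
    """Difference in mean outcome just right vs. just left of a border."""
    left = [y for x, y in zip(xs, ys) if border - bandwidth <= x < border]
    right = [y for x, y in zip(xs, ys) if border <= x < border + bandwidth]
    if not left or not right:
        return None  # no observations on one side of this border
    return mean(right) - mean(left)


def randomization_p_value(xs, ys, border, bandwidth, shifts):
    """Compare the estimate at the true border with placebo borders."""
    observed = local_diff(xs, ys, border, bandwidth)
    if observed is None:
        raise ValueError("no observations near the true border")
    placebo = [local_diff(xs, ys, border + s, bandwidth) for s in shifts]
    placebo = [d for d in placebo if d is not None]
    extreme = sum(abs(d) >= abs(observed) for d in placebo)
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (len(placebo) + 1)


# Synthetic data (assumption): units on a line, true border at x = 0,
# and a jump of 2.0 in the outcome on the treated side.
rng = random.Random(0)
xs = [rng.uniform(-10, 10) for _ in range(2000)]
ys = [(2.0 if x >= 0 else 0.0) + rng.gauss(0, 1) for x in xs]
shifts = [rng.uniform(-8, 8) for _ in range(500)]

observed, p_value = randomization_p_value(xs, ys, 0.0, 1.0, shifts)
```

If the jump at the true border is real, few randomly shifted borders should produce an estimate as large, so the resulting p-value is small. In two dimensions the shift would be applied to the border geometry rather than to a single cutoff value.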
December 1, 2021
Geospatial visualization is even more complex, and researchers need to consider a wide range of spatial scales and complex polygon boundaries, the underlying uni- or bivariate nature of maps, the difficulties of diverse and unfamiliar file formats (or databases), statistical and visual pitfalls like the ecological problem (MAUP), lack of error bars, etc. To attempt to resolve these issues, two domains have opened up over the past 10 years: open source development and start-up ventures. Recent efforts in both spaces provide a hopeful outlook. Open source development in analysis and visualization has provided powerful tools like GeoDa (analysis and modeling) and deck.gl (visualization), but such tools are sometimes technically inaccessible to analysts. Start-up ventures offer clear insights and low-code tools for geospatial data, but longevity, data ownership, and academic support may be uncertain.
In this talk we will present the current state of, and future vision for, a suite of open source tools that we are developing to try to bridge this divide. Over the course of the next year our goal is to develop an extensible, modifiable suite of tools that can be used alone or together as part of a cohesive platform to make finding and sharing geospatial insights easier. In particular, we will focus on:
- Matico Dashboards: A quick and easy way to build interactive dashboards that can pull data from multiple sources to produce maps, charts and explanatory text with zero coding.
- GeoJay: A service to easily identify and join tabular data to geospatial administrative boundaries.
- Matico Server: A federated data management tool which allows individuals and communities to collectively generate, edit and share geospatial data while retaining ownership of that data.
We will contrast our approach with existing proprietary platforms, with particular emphasis on community management, data ownership and stewardship, collaborative investigation, and extensibility.