This webinar introduces basic tools for spatial data analysis and modelling that are available as open-source Python libraries. It consists of two parts: an introduction to GeoPandas and the parallelization of spatial joins.
The first part of the talk will introduce the core API of GeoPandas and emphasize how it can be used to link relational data structures to geometric objects that represent spatial subsets. It will include a brief introduction to loading data from geodatabases, mapping, aggregating, spatial filtering, and simple spatial joins. The second part will focus on how to implement parallel versions of spatial join operations. It will examine approaches developed in the data science community, namely more efficient handling of GEOS pointers and subclassing Dask DataFrames.
The first part of this talk assumes considerable familiarity with Python syntax and some exposure to advanced data structures (specifically DataFrames). The second part assumes some additional knowledge of Dask and NumPy. Exposure to concepts from remote sensing or Geographic Information Systems will be useful for understanding the examples but is not required.
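
As a rough preview of the kind of operation both parts deal with, the sketch below runs a simple spatial join and aggregation in GeoPandas, then repeats the join over independent chunks of the point data. It is illustrative only: the zones, points, column names, and the `join_chunk` helper are made up for this example, the `predicate` keyword assumes a reasonably recent GeoPandas release, and the standard-library thread pool is a stand-in for the Dask-based approaches the talk discusses, not a reproduction of them.

```python
from concurrent.futures import ThreadPoolExecutor

import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, box

# Hypothetical data: two adjacent square zones and four points with a value.
zones = gpd.GeoDataFrame(
    {"zone": ["west", "east"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)
points = gpd.GeoDataFrame(
    {"value": [10, 20, 30, 40]},
    geometry=[Point(0.2, 0.5), Point(0.7, 0.3), Point(1.5, 0.5), Point(3.0, 3.0)],
    crs="EPSG:4326",
)

# Simple spatial join: keep each point that lies within a zone and attach
# that zone's attributes. The point at (3, 3) falls outside both zones and
# is dropped by the inner join.
joined = gpd.sjoin(points, zones, how="inner", predicate="within")

# Aggregate the joined attribute per zone.
totals = joined.groupby("zone")["value"].sum()


def join_chunk(chunk):
    """Join one chunk of points against the full set of zones."""
    return gpd.sjoin(chunk, zones, how="inner", predicate="within")


# Build the zones' spatial index once, before any worker touches it.
zones.sindex

# Each chunk of points can be joined independently, so the work can be
# split across workers and the partial results concatenated afterwards.
chunks = [points.iloc[i:i + 2] for i in range(0, len(points), 2)]
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel_joined = pd.concat(list(pool.map(join_chunk, chunks)))

print(totals)
```

Splitting only the point table works here because every chunk is compared against the complete set of zones; partitioning both tables would require some form of spatial partitioning so that matching geometries end up in the same chunk.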