Building Geospatial Visualisations In Python
Tags: data science
Author: Ndamulelo Nemakhavhani (@ndamulelonemakh)
There are many tools that you can use to draw geographic charts in Python. This short guide is meant to show you the fundamental concepts that apply to most of these tools.
Pre-requisites
The three most important things you need to plot a geographic map are:
Geometry - Describes the geographical boundaries of the location(s)
Values - The information to be displayed per location, e.g. population
- Typically available as CSV or other structured data formats
A geo-visualisation library
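To make the first two ingredients concrete, the sketch below builds a tiny, made-up example of each using only the Python standard library: a GeoJSON feature holding a geometry, and a CSV table holding values. The province name, coordinates and figure are illustrative placeholders, not real data.

```python
import csv
import io
import json

# Geometry: a GeoJSON FeatureCollection with one (fake) rectangular "province".
# GeoJSON stores each coordinate as [longitude, latitude].
geometry_json = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"location": "example province"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[24.0, -29.0], [25.0, -29.0], [25.0, -28.0],
                         [24.0, -28.0], [24.0, -29.0]]]
      }
    }
  ]
}
"""
features = json.loads(geometry_json)["features"]

# Values: one row per location, as it might appear in a CSV file.
values_csv = "location,age\nexample province,60.5\n"
rows = list(csv.DictReader(io.StringIO(values_csv)))
values = {row["location"]: float(row["age"]) for row in rows}

# The shared "location" key is what lets a library colour each shape by its value.
for feature in features:
    name = feature["properties"]["location"]
    print(name, feature["geometry"]["type"], values[name])
```

The third ingredient, the visualisation library, is what turns this pairing of shapes and numbers into a drawn map.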
Common Geo-Visualisation Libraries
- Useful for quickly creating static geo maps
- Under the hood Folium uses leaflet.js to create interactive maps
- An easy-to-use visualisation library based on Dash
- A Python wrapper for the Google Earth Engine platform
- Probably the most scalable option, since the processing runs on Google's infrastructure
In this short guide, we will only show an example using Folium. However, the core concepts should be similar for the other libraries.
We will use the following datasets:
zaf_states.zip - Contains shapefiles that define provincial boundaries in South Africa
zaf_female_life_expectency_2011.csv - Contains the life expectancy values that we will display per province
Step 1: Preparing the data
- Before we proceed, we need to install geopandas to be able to read geographic files in Python
pip install geopandas
- Next we download the location data from the link provided above
# Download the zip file to the current directory
wget https://github.com/endeesa/peculia-blog-meta/raw/main/data/gis/zaf_states.zip
# Also download the life expectancy csv file
wget https://raw.githubusercontent.com/endeesa/peculia-blog-meta/main/data/gis/zaf_female_life_expectency_2011.csv
# Unzip the file
unzip zaf_states.zip
Note: These commands are written for a bash shell. Alternatively, you can prefix each command with ! to run it from a Jupyter code cell.
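If wget or unzip is not available (for example on Windows), the same steps can be sketched with the Python standard library. The helper names below are hypothetical; the URLs are the ones used in the commands above.

```python
import urllib.request
import zipfile
from pathlib import Path

ZIP_URL = "https://github.com/endeesa/peculia-blog-meta/raw/main/data/gis/zaf_states.zip"
CSV_URL = "https://raw.githubusercontent.com/endeesa/peculia-blog-meta/main/data/gis/zaf_female_life_expectency_2011.csv"


def download(url: str, dest_dir: str = ".") -> Path:
    """Download url into dest_dir, keeping the file name from the URL."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, dest)
    return dest


def extract(zip_path, dest_dir: str = ".") -> list:
    """Extract every member of the archive into dest_dir and return their names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Usage (requires network access):
# download(CSV_URL)
# print(extract(download(ZIP_URL)))
```

Printing the extracted names is a convenient way to confirm the exact folder and file names before loading the shapefile.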
- Load the data into Python
import pandas as pd
import geopandas
# Load the life expectancy data into a normal Pandas dataframe
df = pd.read_csv('zaf_female_life_expectency_2011.csv')
print("Preview first 2 rows:", df.head(2), sep="\n")
# Load the geometry data into a GeoPandas dataframe
gdf = geopandas.read_file('zaf_state/ZAF_STATES.shp')
print("Preview first 2 rows:", gdf.head(2), sep="\n")
Notice that the names of the provinces are provided in the NAME_1 column of the GeoPandas dataframe, whereas the life expectancy dataset uses a column named location.
Let's change this so that both datasets use the name location to describe the provinces. In addition, we will discard all the other columns of the GeoPandas dataframe except the geometry column.
# rename expects a mapping of old name -> new name; [[...]] then selects columns
zaf_boundries = gdf.rename(columns={'NAME_1': 'location'})[['location', 'geometry']]
# Also make sure we are using the correct CRS (coordinate reference system)
# Folium expects our data to use EPSG:4326, aka latitude/longitude
zaf_boundries = zaf_boundries.to_crs(epsg=4326)
# Finally make sure the locations in both datasets are lower case
# It is also recommended to replace dashes with forward slashes
zaf_boundries.location = zaf_boundries.location.apply(lambda s: s.lower().replace('-', '/'))
df.location = df.location.apply(lambda s: s.lower().replace('-', '/'))
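The CRS comment above can be made more concrete. EPSG:4326 expresses positions as longitude and latitude in degrees, while web maps are usually drawn in EPSG:3857 (Web Mercator), which uses metres. The standalone sketch below applies the standard spherical Mercator formula to show the difference; the function name is made up for illustration, and with real data GeoPandas' to_crs handles such conversions for you.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by Web Mercator


def lonlat_to_web_mercator(lon_deg: float, lat_deg: float) -> tuple:
    """Convert EPSG:4326 degrees to EPSG:3857 metres (spherical Mercator)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# A point roughly in the middle of South Africa, as (longitude, latitude)
print(lonlat_to_web_mercator(24.6727, -28.4792))
```

The same place is described by small degree values in one system and by values in the millions of metres in the other, which is why mixing the two up tends to put shapes in the wrong place or off the map entirely.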
We finally have a matching key between the quantity we want to display (i.e. life expectancy) and the corresponding geometry (or shape) of the locations.
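Since the map depends entirely on this key, a quick sanity check can be worthwhile. The sketch below uses plain Python and made-up province names to show the idea; the helper names are hypothetical. With the real data, the two lists would come from zaf_boundries.location and df.location.

```python
def normalise(name: str) -> str:
    """Apply the same clean-up as above: lower case, dashes to slashes."""
    return name.lower().replace("-", "/")


def unmatched_keys(geometry_names, value_names):
    """Return the names that appear in only one of the two collections."""
    geo = {normalise(n) for n in geometry_names}
    val = {normalise(n) for n in value_names}
    return sorted(geo - val), sorted(val - geo)


# Illustrative names only, not the real contents of either file
missing_values, missing_shapes = unmatched_keys(
    ["Alpha-Beta", "Gamma"],
    ["alpha/beta", "Delta"],
)
print(missing_values)  # shapes with no value: ['gamma']
print(missing_shapes)  # values with no shape: ['delta']
```

Two empty lists would mean every shape has a value and every value has a shape.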
We will now pass this information directly to Folium to plot a choropleth map.
Step 2: Creating a Choropleth map
import folium
# First save our geometry data as a geojson file
zaf_boundries.to_file('zaf_boundries.geojson', driver='GeoJSON')
# The default location that will show on the map
zaf_coords = [-28.4792, 24.6727]
map_obj = folium.Map(location=zaf_coords, zoom_start=6, crs='EPSG3857')
folium.Choropleth(
geo_data='zaf_boundries.geojson',
name="choropleth",
data=df,
columns=["location", "age"],
key_on='feature.properties.location', # json path to our linking key i.e. province
fill_color="BuPu",
fill_opacity=0.7,
line_opacity=0.2,
legend_name="Folium Choropleth starter example",
).add_to(map_obj)
# Use this if you have more than one quantity to display e.g. life expectancy, population
folium.LayerControl().add_to(map_obj)
# In a Jupyter notebook, display the map by making it the last expression in a cell
map_obj
# Alternatively save the map to an html file
map_obj.save("index.html")
- That's all it takes. You can view the full script to reproduce this guide on GitHub.
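The key_on argument is easy to get wrong, so it may help to see what it refers to. The sketch below is not Folium's implementation; it is a simplified, standard-library illustration of how a dotted path such as feature.properties.location points at a field inside each GeoJSON feature. The helper name and the sample feature are invented.

```python
def resolve_key(feature: dict, key_on: str):
    """Follow a dotted path like 'feature.properties.location' into a feature."""
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on is expected to start with 'feature'")
    node = feature
    for part in parts[1:]:
        node = node[part]  # a misspelled segment raises KeyError here
    return node


sample_feature = {
    "type": "Feature",
    "properties": {"location": "example province"},
    "geometry": None,  # geometry omitted to keep the example short
}

print(resolve_key(sample_feature, "feature.properties.location"))  # example province
```

The value found this way is what gets matched against the first column named in columns, which is why both sides were normalised to the same spelling in Step 1.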
Closing Remarks
This was a brief guide to demonstrate how to visualise geographic data in Python using the Folium library.
To learn more, we encourage you to continue exploring more options through the hyperlinks provided.
Thanks for reading!