00

Setup & Choose Your City

Configure your environment and select any Indonesian city. Every code example, dataset download, and analysis in this course will adapt to your choice.

Select Your Study City

All tutorials, coordinates, EPSG codes, and data downloads will automatically adapt.

Working with: SemarangJawa Tengah

📦 Download Starter Kit for Semarang

Everything you need in one ZIP — pre-configured for your selected city. Includes 37 Jupyter Notebooks, 37 source scripts, SQL files, requirements, and a ready-to-use city_config.json.

📦
Complete Course Kit (.zip)
All 76 files: notebooks (.ipynb), scripts (.py/.sql/.sh), city_config.json for Semarang, requirements.txt, and README. Just unzip, install, and run.
⚙️
City Config Only
Just the city_config.json for Semarang — swap this file to re-run the entire pipeline for a different city.
📄
Requirements Only
Python dependency list. Install with pip install -r requirements.txt inside your conda environment.

💻 Quick Start After Download

unzip urban_analytics_semarang.zip
cd urban_analytics
conda create -n urban-geo python=3.11 -y && conda activate urban-geo
pip install -r requirements.txt
jupyter lab

📚 Contents by Module

The ZIP includes these notebooks and scripts:

End-to-End Pipeline

Each module produces artifacts the next module consumes. By the end, you'll have a live dashboard integrating every layer.

M0 Setup M1 Python M2 Data M3 GeoPandas M4 Network M5 PostGIS M6 GEE M7 Dashboard

Why This Course Exists

Indonesian cities are growing faster than the tools planners use to understand them. Manual GIS workflows can't keep pace with the scale of land-use change, transport demand, and coastal risk across Java, Sulawesi, Kalimantan, and Sumatra.

This course teaches you to build an automated, reproducible urban analytics pipeline — from raw open data to an interactive dashboard — using only free, open-source tools. Pick any city; the same code works everywhere.

🔄
Reproducible
Change the city name; the entire pipeline re-runs. No manual steps.
🆓
Open Source
Python, PostGIS, GEE, OSM — no license fees, no vendor lock-in.
🇮🇩
Indonesia-First
BPS codes, RDTR zoning, KDB/KLB, ojol transport — built for local context.
📊
Decision-Ready
Ends with a Streamlit dashboard that non-technical stakeholders can explore.

Course Structure: Diátaxis

Every module is organized into four modes, based on the Diátaxis framework:

💡
Understand (Explanation)
The "why" — concepts, mental models, and design decisions. Read this when you want to deepen understanding.
🎯
Follow Along (Tutorial)
Step-by-step hands-on. Each module's tutorial picks up exactly where the last one left off. This is the spine of the course.
📋
Recipes (How-to)
Task-oriented snippets: "How do I reproject a shapefile?" "How do I create an H3 grid?" Quick answers when you know what you need.
📖
Reference
Function signatures, API parameters, EPSG lookup tables. The facts, without narrative.

Step-by-Step: Environment Setup

1
Install Miniconda

Download and install Miniconda from the official site. This gives you the conda package manager without bloat.

bash
# Linux / macOS
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

# Or macOS ARM (Apple Silicon)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
bash Miniconda3-latest-MacOSX-arm64.sh
2
Create the project environment
bash
conda create -n urban-geo python=3.11 -y
conda activate urban-geo
3
Install Python packages
bash
pip install pandas geopandas matplotlib seaborn shapely fiona pyproj \
    osmnx networkx h3 requests beautifulsoup4 \
    psycopg2-binary sqlalchemy geoalchemy2 \
    earthengine-api geemap \
    streamlit pydeck plotly folium
Expected
Successfully installed 50+ packages. No errors.
4
Install PostGIS

PostGIS needs PostgreSQL as a base. Install both:

bash
# Ubuntu/Debian
sudo apt install postgresql postgis -y

# macOS (Homebrew)
brew install postgresql@16 postgis

# Start the service
sudo systemctl start postgresql   # Linux
brew services start postgresql@16 # macOS
5
Create the spatial database
bash
sudo -u postgres createdb urban_semarang
sudo -u postgres psql -d urban_semarang -c "CREATE EXTENSION postgis; CREATE EXTENSION h3;"
Expected
CREATE EXTENSION
CREATE EXTENSION
Note

If h3 extension is not available, install h3-pg from source or via pgxn install h3. The h3 extension is optional — you can compute H3 in Python instead.

6
Create project folder structure
bash
mkdir -p urban_analytics/{data/raw,data/processed,output,notebooks,scripts}
cd urban_analytics
Your folder tree
urban_analytics/
├── data/
│   ├── raw/          # Downloaded shapefiles, CSVs, GeoJSON
│   └── processed/    # Cleaned, reprojected, H3-indexed
├── output/           # GeoPackage, GeoTIFF, exports
├── notebooks/        # Jupyter notebooks (one per module)
└── scripts/          # Reusable .py and .sql files
7
Authenticate Google Earth Engine
bash
earthengine authenticate

Follow the browser prompt to authorize. This is needed for Modules 6-7.

8
Verify everything works
python
import pandas as pd
import geopandas as gpd
import osmnx as ox
import h3
import ee

ee.Initialize()

print("pandas", pd.__version__)
print("geopandas", gpd.__version__)
print("osmnx", ox.__version__)
print("h3", h3.__version__)
print("GEE OK")
print("\n✓ All systems go. Ready for Module 1.")
Expected
pandas 2.x.x
geopandas 1.x.x
osmnx 2.x.x
h3 4.x.x
GEE OK

✓ All systems go. Ready for Module 1.
Artifacts produced

urban_analytics/ folder structure • urban_semarang PostGIS database • urban-geo conda environment

City Configuration Reference

These values update automatically when you change the city selector above.

ParameterValue
City NameSemarang
ProvinceJawa Tengah
OSMnx QuerySemarang, Indonesia
Center Latitude-6.97
Center Longitude110.42
UTM EPSGEPSG:32749
UTM Zone49S
BPS Kota Code3374
CoastalYes
Bounding Box[110.28, -7.12, 110.53, -6.88]

Indonesian Data Sources

SourceData TypesCoverageAccess
OpenStreetMap / GeofabrikRoads, buildings, POIs, landuseAll citiesFree, bulk download or API
BPS (bps.go.id)Population, economics, housing censusAll kabupaten/kotaFree, web + API
InaGeoportalAdmin boundaries, RDTR zoning, land parcelsVaries by cityFree, download
Google Earth EngineLandsat, Sentinel-2, SRTM DEM, land coverGlobalFree (research/education)
HERE Traffic APIReal-time traffic flow, incidents, routingMajor roads globallyFreemium (250k calls/mo free)
BMKGWeather, climate, earthquake, tsunamiNationalFree, web + API
01

Python Foundations

Core Python, pandas, and matplotlib — building the data-handling toolkit you'll use in every subsequent module.

M0 Setup M1 Python M2 Data M3 GeoPandas

Why Python for Urban Analytics?

Python is the glue language. It connects spatial libraries (GeoPandas), network analysis (NetworkX), databases (psycopg2), satellite imagery (GEE API), and dashboards (Streamlit) in a single workflow. No other language has this breadth.

In this course, Python isn't the destination — it's the vehicle. You'll learn just enough to drive it confidently through 7 modules of increasingly powerful geospatial tools.

Three Concepts That Matter Most

1
DataFrames are the universal container
Every dataset — census, buildings, traffic — becomes a pandas DataFrame (table with labeled columns). Master .groupby(), .merge(), and .apply() and you can handle 90% of urban data tasks.
2
Functions make analysis reproducible
Wrap your analysis in functions with parameters (city name, year, threshold). Run for Semarang today, switch to Surabaya tomorrow — same code, different inputs.
3
Dictionaries are configuration objects
Store your city parameters (coordinates, EPSG, BPS code) in a dict. Pass it to every function. This is how you make the pipeline city-agnostic.
Prerequisites
  • Module 0 completed: urban-geo conda environment active
  • urban_analytics/ folder structure created

Build Your City Configuration

1
Create the city config notebook

Open JupyterLab and create notebooks/01_python_foundations.ipynb. This notebook will produce a reusable city config that every future module imports.

python
# Cell 1 — City Configuration
# Change this ONE dict to switch the entire pipeline to a new city

CITY = {
    "name": "Semarang",
    "province": "Jawa Tengah",
    "osm_query": "Semarang, Indonesia",
    "lat": -6.97,
    "lon": 110.42,
    "epsg_utm": 32749,
    "bps_code": "3374",
    "coastal": True,
    "bbox": [110.28, -7.12, 110.53, -6.88],
}

print(f"Working with: {CITY['name']}, {CITY['province']}")
Expected
Working with: Semarang, Jawa Tengah
2
Practice with district data

Build a DataFrame of sample districts. We'll replace this with real BPS data in Module 2.

python
# Cell 2 — Practice DataFrames
import pandas as pd

# Sample district data — we'll fetch real data in Module 2
districts = pd.DataFrame({
    "kecamatan": ["Kec A", "Kec B", "Kec C", "Kec D", "Kec E"],
    "population": [85000, 120000, 65000, 95000, 110000],
    "area_km2": [12.5, 8.3, 22.1, 15.6, 9.8],
    "elevation_m": [3, 85, 210, 45, 5],
})

# Derived columns
districts["density"] = (districts["population"] / districts["area_km2"]).round(0)
districts["flood_risk"] = districts["elevation_m"].apply(
    lambda e: "high" if e < 5 else ("medium" if e < 20 else "low")
)

print(districts.to_string(index=False))
Expected
kecamatan  population  area_km2  elevation_m  density flood_risk
    Kec A       85000      12.5            3   6800.0       high
    Kec B      120000       8.3           85  14458.0        low
    Kec C       65000      22.1          210   2941.0        low
    Kec D       95000      15.6           45   6090.0        low
    Kec E      110000       9.8            5   11224.0    medium
3
Groupby and aggregation
python
# Cell 3 — Summary by risk category
summary = districts.groupby("flood_risk").agg(
    n_kecamatan=("kecamatan", "count"),
    total_pop=("population", "sum"),
    avg_density=("density", "mean"),
).round(0)

print(summary)
4
Create reusable helper functions
python
# Cell 4 — Functions we'll reuse throughout the course
import math

def haversine(lat1, lon1, lat2, lon2):
    """Distance in km between two GPS points."""
    R = 6371
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat/2)**2 + math.cos(math.radians(lat1)) * \
        math.cos(math.radians(lat2)) * math.sin(dlon/2)**2
    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))

def calculate_kdb(footprint_m2, land_area_m2):
    """KDB = building footprint / land parcel area."""
    return round(footprint_m2 / land_area_m2, 3) if land_area_m2 > 0 else 0

def classify_density(pop_per_km2):
    """Indonesian urban density classification."""
    if pop_per_km2 > 15000: return "very_high"
    if pop_per_km2 > 10000: return "high"
    if pop_per_km2 > 5000: return "medium"
    return "low"

# Test
print(f"Distance city center to bbox corner: "
      f"{haversine(CITY['lat'], CITY['lon'], CITY['bbox'][1], CITY['bbox'][0]):.1f} km")
print(f"KDB of 250m² on 400m² lot: {calculate_kdb(250, 400)}")
print(f"14,000 pop/km²: {classify_density(14000)}")
5
Save the city config for reuse

Export the CITY dict as JSON so every future notebook can import it.

python
# Cell 5 — Save config
import json
from pathlib import Path

config_path = Path("../data/city_config.json")
config_path.write_text(json.dumps(CITY, indent=2, ensure_ascii=False))
print(f"Saved to {config_path.resolve()}")
Artifacts produced → used by Module 2

data/city_config.jsonnotebooks/01_python_foundations.ipynb • Helper functions: haversine(), calculate_kdb(), classify_density()

Quick Recipes

Read CSV with Indonesian encoding

python
df = pd.read_csv("data.csv", encoding="utf-8", sep=";")  # BPS often uses semicolons

Filter rows by multiple conditions

python
high_risk = df[(df["elevation_m"] < 5) & (df["population"] > 50000)]

Pivot table for cross-tabulation

python
pivot = pd.pivot_table(df, values="population", index="kecamatan",
                       columns="year", aggfunc="sum")

Quick bar chart

python
import matplotlib.pyplot as plt

districts.sort_values("density").plot.barh(x="kecamatan", y="density", color="#2A7B9B")
plt.xlabel("Population Density (per km²)")
plt.title(f"Population Density by Kecamatan — {CITY['name']}")
plt.tight_layout()
plt.savefig("../output/density_chart.png", dpi=150)
plt.show()

pandas Essentials Reference

OperationCodeReturns
Load CSVpd.read_csv(path)DataFrame
Shapedf.shape(rows, cols) tuple
Column typesdf.dtypesSeries of dtypes
Summary statsdf.describe()count, mean, std, min, max
Filter rowsdf[df["col"] > val]Filtered DataFrame
New columndf["new"] = exprModified in place
Group + aggregatedf.groupby("col").agg(...)Aggregated DataFrame
Merge tablespd.merge(a, b, on="key")Joined DataFrame
Apply functiondf["col"].apply(fn)Transformed Series
Sortdf.sort_values("col")Sorted DataFrame
Drop NaNdf.dropna(subset=["col"])Cleaned DataFrame
Save CSVdf.to_csv(path, index=False)File on disk
Module 1 Check-In
Question 1 of 3
Why do we store city parameters in a dictionary rather than as separate variables?
Question 2 of 3
You have a DataFrame with 400,000 building records. You need the average footprint per kecamatan. What's the most efficient approach?
Question 3 of 3
What does df["density"] = df["population"] / df["area_km2"] do that a for-loop doesn't?
02

Acquiring City Data

Download real open data — administrative boundaries, buildings, roads, and census statistics — for Semarang using APIs, web scraping, and bulk downloads.

M0 Setup M1 Python M2 Data M3 GeoPandas

The Indonesian Open Data Landscape

Indonesia has rich — but scattered — geospatial data. The trick is knowing where to look and how to extract it programmatically so your pipeline stays reproducible.

🗺️
OpenStreetMap
Crowd-sourced map: buildings, roads, POIs. Excellent coverage in Javanese cities. API (Overpass) or bulk download (Geofabrik).
📊
BPS Statistics
National statistics bureau. Population by kelurahan, economic indicators, housing census. API available at webapi.bps.go.id.
🌐
InaGeoportal
Government geospatial portal. Admin boundaries, RDTR zoning, land parcels. Manual download (GeoJSON/SHP). Coverage varies.
🛰️
Earth Engine
Satellite imagery: Landsat, Sentinel-2, SRTM DEM. Global coverage, cloud-processed. Used in Module 6.
Prerequisites
  • Module 1 completed: data/city_config.json exists
  • urban-geo environment active

Download Real Data for Your City

1
Load your city config

Create notebooks/02_data_acquisition.ipynb.

python
import json, requests
import pandas as pd
import geopandas as gpd
from pathlib import Path

CITY = json.loads(Path("../data/city_config.json").read_text())
print(f"Acquiring data for: {CITY['name']}")
2
Download admin boundaries from OSM

OSMnx can download the boundary polygon of any named place in OpenStreetMap.

python
import osmnx as ox

# Download city boundary
boundary = ox.geocode_to_gdf(CITY["osm_query"])
boundary.to_file("../data/raw/admin_boundary.geojson", driver="GeoJSON")
print(f"Boundary area: {boundary.to_crs(epsg=CITY['epsg_utm']).area.iloc[0] / 1e6:.1f} km²")
Expected
Boundary area: ~373.7 km²  (varies by city)
3
Download buildings from Overpass API

The Overpass API lets you query OpenStreetMap for any feature type. We fetch all building footprints within the city's bounding box.

python
# Download building footprints via OSMnx
tags = {"building": True}
buildings = ox.features_from_place(CITY["osm_query"], tags=tags)

# Keep only polygons (discard nodes)
buildings = buildings[buildings.geometry.type.isin(["Polygon", "MultiPolygon"])].copy()

# Calculate footprint area in m²
buildings_utm = buildings.to_crs(epsg=CITY["epsg_utm"])
buildings["footprint_m2"] = buildings_utm.geometry.area.round(1)

# Save
cols = ["name", "building", "building:levels", "footprint_m2", "geometry"]
cols = [c for c in cols if c in buildings.columns]
buildings[cols].to_file("../data/raw/buildings.geojson", driver="GeoJSON")
print(f"Downloaded {len(buildings):,} building footprints")
Expected
Downloaded 45,000–200,000 building footprints (varies by city OSM coverage)
Large cities

For Jakarta or Surabaya, this may take several minutes and produce large files. Consider filtering by a smaller area first (pass bbox instead of place name).

4
Download road network
python
# Street network — we'll analyze this deeply in Module 4
G = ox.graph_from_place(CITY["osm_query"], network_type="drive")
ox.save_graphml(G, "../data/raw/street_network.graphml")

nodes, edges = ox.graph_to_gdfs(G)
print(f"Network: {len(nodes):,} intersections, {len(edges):,} road segments")
5
Fetch BPS population data

BPS provides an API for statistical tables. We fetch population by kecamatan.

python
# BPS API — population by sub-district
# Note: you need a free API key from webapi.bps.go.id
BPS_KEY = "YOUR_BPS_API_KEY"  # Register at webapi.bps.go.id

# Alternative: scrape the BPS website table
url = f"https://semarangkota.bps.go.id/indicator/12/71/1/jumlah-penduduk.html"
try:
    tables = pd.read_html(url)
    population = tables[0]
    population.to_csv("../data/raw/population_bps.csv", index=False)
    print(f"Scraped {len(population)} rows from BPS")
except Exception as e:
    print(f"BPS scraping failed: {e}")
    print("Tip: manually download from the BPS website and save as CSV")
BPS data tips

BPS website structure varies by city. If scraping fails, visit semarangkota.bps.go.id manually, find the population table, and save as CSV to data/raw/population_bps.csv. The pipeline continues either way.

6
Download POIs (points of interest)
python
# Key urban facilities
poi_tags = {
    "amenity": ["hospital", "school", "marketplace"],
    "shop": True,
}
pois = ox.features_from_place(CITY["osm_query"], tags=poi_tags)
pois_points = pois.copy()
pois_points["geometry"] = pois_points.geometry.centroid
pois_points[["name", "amenity", "shop", "geometry"]].dropna(
    subset=["name"]
).to_file("../data/raw/pois.geojson", driver="GeoJSON")
print(f"Downloaded {len(pois_points):,} POIs")
7
Verify your data inventory
python
# Quick inventory check
raw = Path("../data/raw")
for f in sorted(raw.glob("*")):
    size = f.stat().st_size / 1024
    unit = "KB"
    if size > 1024:
        size /= 1024; unit = "MB"
    print(f"  {f.name:<35} {size:>8.1f} {unit}")
Expected files
  admin_boundary.geojson                 45.2 KB
  buildings.geojson                      12.8 MB
  pois.geojson                          285.3 KB
  population_bps.csv                      3.2 KB
  street_network.graphml                  8.5 MB
Artifacts produced → used by Module 3

data/raw/admin_boundary.geojsondata/raw/buildings.geojsondata/raw/street_network.graphmldata/raw/pois.geojsondata/raw/population_bps.csv

Data Acquisition Recipes

Download OSM features by bounding box (faster for large cities)

python
buildings = ox.features_from_bbox(
    bbox=(north, south, east, west),  # use CITY["bbox"] values
    tags={"building": True}
)

Custom Overpass query (advanced filtering)

python
query = f"""
[out:json][timeout:120];
area["name"="{CITY['name']}"]["admin_level"="5"]->.city;
(way["building"="commercial"](area.city););
out body; >; out skel qt;
"""
response = requests.get("https://overpass-api.de/api/interpreter",
                        params={"data": query})

Download Geofabrik Indonesia extract

bash
wget https://download.geofabrik.de/asia/indonesia-latest-free.shp.zip
unzip indonesia-latest-free.shp.zip -d data/raw/geofabrik/

Data Sources Reference

SourceFunction / URLReturnsCoverage
OSMnx geocodeox.geocode_to_gdf(query)City boundary GeoDataFrameAll cities
OSMnx featuresox.features_from_place(query, tags)GeoDataFrame of tagged featuresAll cities
OSMnx graphox.graph_from_place(query)NetworkX MultiDiGraphAll cities
BPS APIwebapi.bps.go.id/v1/api/listJSON statistical tablesAll kabupaten/kota
InaGeoportaltanahair.indonesia.go.idSHP / GeoJSON downloadVaries
Geofabrikdownload.geofabrik.de/asia/indonesiaSHP / PBF bulk extractAll Indonesia
SRTM DEMvia GEE: USGS/SRTMGL1_00330m elevation rasterGlobal
Module 2 Check-In
Question 1 of 3
You want to count how many buildings are within each kecamatan. You have buildings as points and kecamatan as polygons. What operation is needed?
Question 2 of 3
Your downloaded buildings are in EPSG:4326 (GPS degrees) but you need footprint areas in square meters. What must you do first?
Question 3 of 3
Why do we save the city config as JSON rather than hardcoding values in each notebook?
03

Drawing the City

GeoPandas spatial analysis, KDB/KLB zoning compliance, and Uber H3 hexagonal grids — turning raw downloads into structured urban layers for Semarang.

M1 Python M2 Data M3 GeoPandas M4 Network

GeoDataFrame = Spreadsheet + Map

A GeoDataFrame is a pandas DataFrame with a geometry column. Every row has both tabular attributes (name, area, population) and a spatial shape (point, line, polygon). This means you can filter, group, and merge spatially — not just by column values.

📍
Points
Bus stops, building centroids, flood gauges
〰️
Lines
Roads, rivers, power lines, train routes
Polygons
Building footprints, zoning zones, flood zones
🗂️
CRS
Coordinate Reference System — maps Earth's curve onto flat numbers

KDB/KLB: Indonesian Zoning Decoded

Building
Footprint
Land Parcel (100%)The entire lot
KDB (footprint/land)Max ground coverage — e.g. 0.60 = 60%
KDH (green space)Min permeable surface
KLB (floor area/land)Total floors ÷ plot area

H3: The Universal Join Key

Uber's H3 divides Earth into hexagons at 16 resolution levels. Assign every dataset an H3 cell ID and you can join anything to anything — no complex spatial intersection needed.

ResolutionAvg Hex AreaBest For
7~5.16 km²District-level summaries
8~0.74 km²Neighborhood / OD flows
9~0.11 km²Block-level traffic analysis
10~0.015 km²Individual parcel scale
Prerequisites
  • Module 2 completed: data/raw/buildings.geojson and data/raw/admin_boundary.geojson exist
  • data/city_config.json available

Spatial Analysis & H3 Indexing

1
Load and inspect spatial data

Create notebooks/03_geopandas_h3.ipynb.

python
import json
import geopandas as gpd
import pandas as pd
import h3
import matplotlib.pyplot as plt
from shapely.geometry import Polygon
from pathlib import Path

CITY = json.loads(Path("../data/city_config.json").read_text())

# Load Module 2 outputs
buildings = gpd.read_file("../data/raw/buildings.geojson")
boundary = gpd.read_file("../data/raw/admin_boundary.geojson")

print(f"Buildings: {len(buildings):,} features, CRS: {buildings.crs}")
print(f"Boundary CRS: {boundary.crs}")
2
Reproject and compute building metrics
python
# Reproject to UTM for metric calculations
buildings_utm = buildings.to_crs(epsg=CITY["epsg_utm"])

# Building footprint area (m²)
buildings["footprint_m2"] = buildings_utm.geometry.area.round(1)

# Estimate floors if available from OSM tags
if "building:levels" in buildings.columns:
    buildings["floors"] = pd.to_numeric(
        buildings["building:levels"], errors="coerce"
    ).fillna(1).astype(int)
else:
    buildings["floors"] = 1

buildings["floor_area_m2"] = buildings["footprint_m2"] * buildings["floors"]

print(buildings[["footprint_m2", "floors", "floor_area_m2"]].describe().round(1))
3
Assign H3 cells to every building
python
# Convert building centroids to H3 cells at resolution 9
H3_RES = 9

centroids = buildings.geometry.centroid.to_crs(epsg=4326)
buildings["h3_cell"] = [
    h3.latlng_to_cell(pt.y, pt.x, H3_RES) for pt in centroids
]

print(f"Unique H3 cells: {buildings['h3_cell'].nunique():,}")
print(f"Sample cell: {buildings['h3_cell'].iloc[0]}")
Expected
Unique H3 cells: ~1,200–3,000
Sample cell: 89600c1a2ffffff
4
Aggregate building metrics per H3 cell
python
# Aggregate per hexagon
h3_stats = buildings.groupby("h3_cell").agg(
    building_count=("footprint_m2", "count"),
    total_footprint=("footprint_m2", "sum"),
    avg_footprint=("footprint_m2", "mean"),
    max_floors=("floors", "max"),
    total_floor_area=("floor_area_m2", "sum"),
).round(1)

# Create H3 hexagon geometries for mapping
def h3_to_polygon(cell_id):
    boundary = h3.cell_to_boundary(cell_id)
    return Polygon([(lng, lat) for lat, lng in boundary])

h3_grid = gpd.GeoDataFrame(
    h3_stats,
    geometry=[h3_to_polygon(c) for c in h3_stats.index],
    crs="EPSG:4326"
)
h3_grid.index.name = "h3_cell"

print(f"H3 grid: {len(h3_grid)} hexagons")
print(h3_grid.head())
5
Simulate KDB compliance check

Without official parcel boundaries, we approximate KDB at the H3 cell level: total building footprint / hexagon area.

python
# H3 res-9 hex area in m² (approximate)
HEX_AREA_M2 = 105000  # ~0.105 km²

h3_grid["kdb_approx"] = (h3_grid["total_footprint"] / HEX_AREA_M2).round(3)
h3_grid["density_class"] = pd.cut(
    h3_grid["kdb_approx"],
    bins=[0, 0.1, 0.3, 0.5, 1.0, float("inf")],
    labels=["sparse", "low", "medium", "high", "very_high"]
)

print(h3_grid["density_class"].value_counts())
6
Plot and save outputs
python
# Map: building density by H3 cell
fig, ax = plt.subplots(figsize=(12, 10))
boundary.to_crs(epsg=4326).boundary.plot(ax=ax, color="black", linewidth=1)
h3_grid.plot(column="kdb_approx", cmap="YlOrRd", legend=True,
             ax=ax, alpha=0.7, edgecolor="grey", linewidth=0.2)
ax.set_title(f"Building Density (KDB approx) — {CITY['name']}")
plt.savefig("../output/h3_building_density.png", dpi=150, bbox_inches="tight")
plt.show()

# Save processed data
buildings.to_file("../data/processed/buildings_processed.gpkg", driver="GPKG")
h3_grid.reset_index().to_file("../data/processed/h3_grid.gpkg", driver="GPKG")
print("Saved: buildings_processed.gpkg, h3_grid.gpkg")
Artifacts produced → used by Modules 4, 5, 7

data/processed/buildings_processed.gpkgdata/processed/h3_grid.gpkgoutput/h3_building_density.png

GeoPandas & H3 Recipes

Reproject to local UTM

python
gdf_utm = gdf.to_crs(epsg=CITY["epsg_utm"])  # meters for area/distance

Spatial join: points in polygons

python
joined = gpd.sjoin(buildings, zoning, how="left", predicate="within")

Buffer around coastline

python
coast_buffer = coastline.to_crs(epsg=CITY["epsg_utm"]).buffer(500)  # 500m

H3: k-ring smoothing

python
# Average metric over a cell and its 6 neighbors
def kring_smooth(row, df, col, k=1):
    neighbors = h3.grid_disk(row.name, k)
    vals = df.loc[df.index.isin(neighbors), col]
    return vals.mean()

h3_grid["density_smooth"] = h3_grid.apply(
    kring_smooth, axis=1, df=h3_grid, col="building_count"
)

GeoPandas & H3 Reference

OperationCode
Read filegpd.read_file("path.gpkg")
Reprojectgdf.to_crs(epsg=32749)
Area (m²)gdf_utm.geometry.area
Centroidgdf.geometry.centroid
Buffergdf_utm.buffer(500) (meters)
Spatial joingpd.sjoin(left, right, predicate="within")
Overlaygpd.overlay(a, b, how="intersection")
Save GeoPackagegdf.to_file("out.gpkg", driver="GPKG")
H3: point to cellh3.latlng_to_cell(lat, lon, res)
H3: cell boundaryh3.cell_to_boundary(cell_id)
H3: neighborsh3.grid_disk(cell_id, k)
H3: resolutionh3.get_resolution(cell_id)

KDB Calculator

KDB Actual
KLB Actual
Green Space %
Module 3 Check-In
Question 1 of 3
A K-1 commercial zone has KDB max 0.80. A building occupies 500 m² on a 600 m² lot. Is it compliant?
Question 2 of 3
Traffic data and flood risk data have different polygon boundaries. How do you combine them?
Question 3 of 3
For neighborhood-block-scale analysis (~100-300m), which H3 resolution?
04

City as a Network

OSMnx and NetworkX turn street maps into mathematical graphs — revealing chokepoints, accessibility gaps, and optimal facility locations in Semarang.

M2 Data M3 GeoPandas M4 Network M5 PostGIS

Streets as Social Networks

Think of your city's street network like a social network. Intersections are people (nodes). Roads are friendships (edges). Some intersections are on the shortest path between everyone — they're the bottlenecks. Others sit in dead-end corners.

HIGH BC bottleneck
🔴
Betweenness Centrality
How often does traffic pass through this intersection? High = critical chokepoint.
🔵
Closeness Centrality
How quickly can you reach everywhere else? High = well-connected center.
⏱️
Isochrones
All points reachable within N minutes — service coverage halos.
Prerequisites
  • Module 2: data/raw/street_network.graphml exists
  • Module 3: data/processed/h3_grid.gpkg exists

Network Analysis for Your City

1
Load the street network

Create notebooks/04_network_analysis.ipynb.

python
import json
import osmnx as ox
import networkx as nx
import geopandas as gpd
import matplotlib.pyplot as plt
from pathlib import Path

CITY = json.loads(Path("../data/city_config.json").read_text())

# Load the graph saved in Module 2
G = ox.load_graphml("../data/raw/street_network.graphml")
G_undir = ox.convert.to_undirected(G)

nodes, edges = ox.graph_to_gdfs(G_undir)
print(f"Network: {len(nodes):,} nodes, {len(edges):,} edges")
2
Calculate centrality metrics

This takes 2-10 minutes depending on city size. For large networks, sample a subgraph first.

python
# Betweenness centrality (may take several minutes for large cities)
print("Calculating betweenness centrality...")
bc = nx.betweenness_centrality(G_undir, weight="length")

print("Calculating closeness centrality...")
cc = nx.closeness_centrality(G_undir, distance="length")

# Attach to nodes GeoDataFrame
nodes["betweenness"] = nodes.index.map(bc)
nodes["closeness"] = nodes.index.map(cc)

print(f"Top betweenness: {nodes['betweenness'].max():.6f}")
print(f"Top closeness: {nodes['closeness'].max():.6f}")
Large cities (Jakarta, Surabaya)

For 50k+ nodes, use nx.betweenness_centrality(G, k=500) to sample 500 random source nodes. This gives a good approximation in seconds instead of hours.

3
Generate isochrones

An isochrone shows all locations reachable within a time limit from a starting point. We'll generate 5, 10, and 15-minute walking isochrones from the city center.

python
# Walk network for isochrones
G_walk = ox.graph_from_place(CITY["osm_query"], network_type="walk")

# Find nearest node to city center
center_node = ox.nearest_nodes(G_walk, CITY["lon"], CITY["lat"])

# Walking speed ~ 4.5 km/h = 1.25 m/s
WALK_SPEED = 1.25
for u, v, data in G_walk.edges(data=True):
    data["time"] = data["length"] / WALK_SPEED

# Isochrone polygons
isochrones = []
for minutes in [5, 10, 15]:
    subgraph = nx.ego_graph(G_walk, center_node,
                            radius=minutes * 60, distance="time")
    iso_nodes = ox.graph_to_gdfs(subgraph, edges=False)
    hull = iso_nodes.unary_union.convex_hull
    isochrones.append({"minutes": minutes, "geometry": hull})

iso_gdf = gpd.GeoDataFrame(isochrones, crs="EPSG:4326")
iso_gdf.to_file("../data/processed/isochrones.gpkg", driver="GPKG")
print(f"Generated {len(iso_gdf)} isochrones")
4
Plot centrality map and save
python
# Betweenness centrality map
fig, ax = plt.subplots(figsize=(14, 10))
edges_gdf = ox.graph_to_gdfs(G_undir, nodes=False)
edges_gdf.plot(ax=ax, color="#ddd", linewidth=0.3)
nodes.plot(ax=ax, column="betweenness", cmap="hot_r",
           markersize=nodes["betweenness"] * 50000,
           alpha=0.7, legend=True)
ax.set_title(f"Betweenness Centrality — {CITY['name']}")
ax.set_axis_off()
plt.savefig("../output/betweenness_map.png", dpi=150, bbox_inches="tight")
plt.show()

# Save centrality nodes
nodes.to_file("../data/processed/centrality_nodes.gpkg", driver="GPKG")
print("Saved: centrality_nodes.gpkg")
Artifacts produced → used by Modules 5, 7

data/processed/centrality_nodes.gpkgdata/processed/isochrones.gpkgoutput/betweenness_map.png

Network Recipes

Shortest path between two points

python
orig = ox.nearest_nodes(G, lon1, lat1)
dest = ox.nearest_nodes(G, lon2, lat2)
route = ox.shortest_path(G, orig, dest, weight="length")
ox.plot_graph_route(G, route)

Network stats summary

python
stats = ox.basic_stats(G, area=boundary_area_m2)
print(f"Street density: {stats['street_density_km']:.1f} km/km²")
print(f"Avg edge length: {stats['street_length_avg']:.0f} m")

Sample centrality for large networks

python
bc_approx = nx.betweenness_centrality(G, k=500, weight="length")  # fast approximation

Network Analysis Reference

FunctionPurpose
ox.graph_from_place(q)Download street graph for a named place
ox.graph_from_bbox(n,s,e,w)Download by bounding box
ox.save_graphml(G, path)Save graph to disk
ox.load_graphml(path)Reload saved graph
ox.graph_to_gdfs(G)Convert to GeoDataFrames (nodes, edges)
ox.nearest_nodes(G, lon, lat)Find nearest node to a coordinate
ox.shortest_path(G, o, d)Find shortest path between two nodes
nx.betweenness_centrality(G)How often a node lies on shortest paths
nx.closeness_centrality(G)Average inverse distance to all other nodes
nx.ego_graph(G, n, radius=r)Subgraph within radius of node n
Module 4 Check-In
Question 1 of 3
You need to place a new bus terminal. Should you optimize for betweenness centrality or closeness centrality?
Question 2 of 3
One ring road has betweenness 10x higher than any other road. What does this mean?
Question 3 of 3
How do you measure 15-minute walkability for residential areas?
05

The City's Long Memory

PostGIS turns PostgreSQL into a spatial database — permanent storage for millions of records with lightning-fast spatial queries via SQL.

M3 GeoPandas M4 Network M5 PostGIS M6 GEE

Why a Database Instead of Files?

GeoPandas loads everything into RAM. Fine for 50,000 buildings. But when you have 400,000 buildings across 30 years, multiple team members, and queries like "find every building within 500m of the coast that violates KDB" — you need a database.

📦
Persistent Storage
Millions of records on disk. No reloading CSVs on every restart.
GIST Index
Spatial index makes "find within polygon" run in milliseconds.
🔗
Multi-User
Entire team queries the same database simultaneously.
🗄️
Materialized Views
Pre-compute expensive spatial joins once, refresh on demand.

Urban Scaling Laws

When a city doubles in population, some metrics scale sublinearly (infrastructure — economies of scale), linearly (housing), or superlinearly (innovation, congestion). The exponent β reveals the pattern.

β < 1
Sublinear
Roads, gas stations, power cables
β ≈ 1
Linear
Housing, water use, total jobs
β > 1
Superlinear
Innovation, wealth, congestion, crime
Prerequisites
  • Module 0: PostGIS database urban_semarang created
  • Modules 2-4: processed GeoPackage files in data/processed/

Load Everything Into PostGIS

1
Load spatial data using ogr2ogr

ogr2ogr (part of GDAL) bulk-loads spatial files into PostGIS tables.

bash
# Load buildings
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/processed/buildings_processed.gpkg \
  -nln buildings -overwrite -lco GEOMETRY_NAME=geom

# Load H3 grid
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/processed/h3_grid.gpkg \
  -nln h3_grid -overwrite -lco GEOMETRY_NAME=geom

# Load centrality nodes
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/processed/centrality_nodes.gpkg \
  -nln centrality_nodes -overwrite -lco GEOMETRY_NAME=geom

# Load admin boundary
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/raw/admin_boundary.geojson \
  -nln admin_boundary -overwrite -lco GEOMETRY_NAME=geom
2
Create spatial indexes
sql
-- Connect: psql -d urban_semarang

CREATE INDEX IF NOT EXISTS idx_buildings_geom ON buildings USING GIST(geom);
CREATE INDEX IF NOT EXISTS idx_h3_grid_geom ON h3_grid USING GIST(geom);
CREATE INDEX IF NOT EXISTS idx_centrality_geom ON centrality_nodes USING GIST(geom);

-- Verify tables loaded
SELECT tablename, pg_size_pretty(pg_total_relation_size(tablename::text))
FROM pg_tables
WHERE schemaname = 'public';
Expected
 tablename        | pg_size_pretty
------------------+----------------
 buildings        | 45 MB
 h3_grid          | 512 KB
 centrality_nodes | 8 MB
 admin_boundary   | 64 KB
3
Spatial queries: buildings near the coast
sql
-- Find buildings within 500m of a coastal point
SELECT count(*) AS coastal_buildings,
       round(avg(footprint_m2)::numeric, 1) AS avg_footprint
FROM buildings
WHERE ST_DWithin(
    geom::geography,
    ST_SetSRID(ST_MakePoint(110.42, -6.97), 4326)::geography,
    500  -- meters
);
4
Building density per H3 cell (SQL version)
sql
-- Join buildings to H3 hexagons and aggregate
CREATE MATERIALIZED VIEW h3_building_summary AS
SELECT
    h.h3_cell,
    h.geom,
    count(b.*) AS building_count,
    coalesce(sum(b.footprint_m2), 0) AS total_footprint,
    coalesce(avg(b.footprint_m2), 0)::numeric(10,1) AS avg_footprint
FROM h3_grid h
LEFT JOIN buildings b ON ST_Contains(h.geom, b.geom)
GROUP BY h.h3_cell, h.geom;

CREATE INDEX idx_h3_summary_geom ON h3_building_summary USING GIST(geom);

-- Check result
SELECT count(*) AS hexagons,
       sum(building_count) AS total_buildings
FROM h3_building_summary;
5
Query from Python (for the dashboard)

Create notebooks/05_postgis.ipynb and test querying from Python.

python
import geopandas as gpd
from sqlalchemy import create_engine

engine = create_engine("postgresql://postgres@localhost/urban_semarang")

# Load H3 building summary as GeoDataFrame
h3_summary = gpd.read_postgis(
    "SELECT * FROM h3_building_summary",
    engine, geom_col="geom"
)
print(f"Loaded {len(h3_summary)} hexagons from PostGIS")
print(h3_summary[["h3_cell", "building_count", "total_footprint"]].head())
Artifacts produced → used by Module 7

PostGIS database urban_semarang with tables: buildings, h3_grid, centrality_nodes, admin_boundary • Materialized view: h3_building_summary

PostGIS Recipes

Find all features within a polygon

sql
SELECT * FROM buildings WHERE ST_Within(geom, (SELECT geom FROM zoning WHERE zone_code = 'K-1'));

Buffer + count (coastal exposure)

sql
SELECT count(*) FROM buildings
WHERE ST_DWithin(geom::geography,
  (SELECT ST_Boundary(geom)::geography FROM admin_boundary LIMIT 1), 1000);

Refresh a materialized view

sql
REFRESH MATERIALIZED VIEW h3_building_summary;

Urban scaling: β exponent via SQL

sql
SELECT REGR_SLOPE(ln(built_area), ln(population)) AS beta
FROM city_stats
WHERE population > 100000;

PostGIS Function Reference

FunctionPurpose
ST_Contains(a, b)True if geometry a fully contains b
ST_Within(a, b)True if a is fully within b
ST_Intersects(a, b)True if a and b share any space
ST_DWithin(a, b, d)True if a and b are within distance d
ST_Buffer(geom, d)Create buffer zone of distance d around geometry
ST_Area(geom)Area in CRS units (use geography for m²)
ST_Distance(a, b)Distance between geometries
ST_Transform(geom, srid)Reproject to a different CRS
ST_SetSRID(geom, srid)Assign CRS without transforming coordinates
ST_MakePoint(lon, lat)Create point from coordinates
Module 5 Check-In
Question 1 of 2
2M building records spanning 20 years. "Which kelurahan had the most green-to-built conversion?" GeoPandas or PostGIS?
Question 2 of 2
What does ST_Contains(z.geom, b.geom) check?
06

Eyes in the Sky

Google Earth Engine processes petabytes of satellite imagery in the cloud — land-use classification, change detection, and coastal risk modeling for Semarang.

M4 Network M5 PostGIS M6 GEE M7 Dashboard

A DVR for the Planet

GEE gives you access to decades of Landsat and Sentinel imagery, processed on Google's servers. The course uses it to answer two questions: "How has land use changed?" and "Which areas face the highest flood risk?"

1
Filter Image Collection
Select Landsat 8 scenes from 2023, over your city, with <20% cloud cover.
2
Compute Spectral Indices
NDVI (vegetation), NDBI (built-up), MNDWI (water) — band-math that reveals what's on the ground.
3
Classify Land Use
Train a Random Forest on labeled samples, classify every pixel.
4
Detect Change
Compare 2015 vs 2023 classifications pixel-by-pixel.

Risk Layer Mixer

Toggle risk dimensions to see how the composite score changes.

w = 0.40
w = 0.35
w = 0.25
Composite Risk100%
0.73

Simulated score for a low-elevation coastal kelurahan

Prerequisites
  • Module 0: earthengine authenticate completed
  • Module 2: data/raw/admin_boundary.geojson for city extent

Satellite Analysis & Risk Modeling

1
Initialize GEE and define study area

Create notebooks/06_earth_engine.ipynb.

python
import json, ee
import geemap
from pathlib import Path

ee.Initialize()
CITY = json.loads(Path("../data/city_config.json").read_text())

# Define study area from bounding box
bbox = CITY["bbox"]
aoi = ee.Geometry.Rectangle([bbox[0], bbox[1], bbox[2], bbox[3]])

print(f"Study area: {CITY['name']}")
print(f"Bounding box: {bbox}")
2
Create cloud-free composites for two years
python
def get_composite(year, aoi):
    """Cloud-masked median composite from Landsat 8/9."""
    collection = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
        .merge(ee.ImageCollection("LANDSAT/LC09/C02/T1_L2"))
        .filterBounds(aoi)
        .filterDate(f"{year}-01-01", f"{year}-12-31")
        .filter(ee.Filter.lt("CLOUD_COVER", 20)))

    def mask_clouds(img):
        qa = img.select("QA_PIXEL")
        mask = qa.bitwiseAnd(1 << 3).eq(0).And(qa.bitwiseAnd(1 << 4).eq(0))
        return img.updateMask(mask)

    return collection.map(mask_clouds).median().clip(aoi)

composite_2015 = get_composite(2015, aoi)
composite_2023 = get_composite(2023, aoi)
print("Composites ready")
3
Compute spectral indices
python
def add_indices(img):
    """Add NDVI, NDBI, MNDWI to a Landsat image."""
    ndvi = img.normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI")
    ndbi = img.normalizedDifference(["SR_B6", "SR_B5"]).rename("NDBI")
    mndwi = img.normalizedDifference(["SR_B3", "SR_B6"]).rename("MNDWI")
    return img.addBands([ndvi, ndbi, mndwi])

img_2015 = add_indices(composite_2015)
img_2023 = add_indices(composite_2023)

# Visualize NDVI
Map = geemap.Map(center=[CITY["lat"], CITY["lon"]], zoom=11)
Map.addLayer(img_2023.select("NDVI"), {"min": -0.2, "max": 0.8, "palette": ["red", "yellow", "green"]}, "NDVI 2023")
Map
4
Classify land use with Random Forest
python
# Training points (replace with your own labeled samples)
# Classes: 0=water, 1=vegetation, 2=built-up, 3=bare_soil
bands = ["SR_B2","SR_B3","SR_B4","SR_B5","SR_B6","SR_B7","NDVI","NDBI","MNDWI"]

# Use a GEE training dataset or create your own FeatureCollection
# For reproducibility, use ESA WorldCover as pseudo-labels
worldcover = ee.Image("ESA/WorldCover/v200").clip(aoi)

# Sample training points from WorldCover
training = img_2023.select(bands).addBands(worldcover.rename("label")).sample(
    region=aoi, scale=30, numPixels=5000, seed=42
)

# Train Random Forest
classifier = ee.Classifier.smileRandomForest(50).train(
    features=training, classProperty="label", inputProperties=bands
)

# Classify both years
classified_2015 = img_2015.select(bands).classify(classifier)
classified_2023 = img_2023.select(bands).classify(classifier)
print("Classification complete")
5
Build the risk composite
python
# Elevation from SRTM
dem = ee.Image("USGS/SRTMGL1_003").clip(aoi)
elevation = dem.select("elevation")
slope = ee.Terrain.slope(dem)

# Normalize to 0-1 (higher value = more hazardous)
h_elev = elevation.multiply(-1).add(50).divide(50).clamp(0, 1)  # low elevation = high risk
h_slope = slope.multiply(-1).add(30).divide(30).clamp(0, 1)

# Built-up density as exposure proxy
ndbi_norm = img_2023.select("NDBI").add(1).divide(2).clamp(0, 1)

# Green cover as vulnerability proxy (less green = more vulnerable)
ndvi_inv = img_2023.select("NDVI").multiply(-1).add(1).divide(2).clamp(0, 1)

# Composite
risk = (h_elev.multiply(0.40)
    .add(ndbi_norm.multiply(0.35))
    .add(ndvi_inv.multiply(0.25)))

# Visualize
Map = geemap.Map(center=[CITY["lat"], CITY["lon"]], zoom=11)
Map.addLayer(risk, {"min": 0, "max": 1, "palette": ["green", "yellow", "red"]}, "Risk")
Map
6
Export results for the dashboard
python
# Export risk raster to Google Drive (then download to output/)
task = ee.batch.Export.image.toDrive(
    image=risk,
    description=f"risk_composite_{CITY['name'].lower()}",
    folder="urban_analytics",
    region=aoi,
    scale=30,
    maxPixels=1e9
)
task.start()
print(f"Export started. Check Google Drive > urban_analytics/")

# Also export the change detection
change = classified_2023.subtract(classified_2015).rename("change")
task2 = ee.batch.Export.image.toDrive(
    image=change, description=f"land_change_{CITY['name'].lower()}",
    folder="urban_analytics", region=aoi, scale=30, maxPixels=1e9
)
task2.start()
print("Change detection export started")
After export completes

Download the GeoTIFFs from Google Drive to output/risk_composite.tif and output/land_change.tif. These feed into the dashboard in Module 7.

Artifacts produced → used by Module 7

output/risk_composite.tifoutput/land_change.tif (via Google Drive)

GEE Recipes

NDVI time series for a point

python
point = ee.Geometry.Point([CITY["lon"], CITY["lat"]])
ts = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
    .filterBounds(point).select("SR_B5", "SR_B4")
    .map(lambda img: img.normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI")
         .set("date", img.date().format())))

Bathtub flood model

python
flood_level = 2  # meters above sea level
flooded = dem.select("elevation").lte(flood_level)
Map.addLayer(flooded.selfMask(), {"palette": ["blue"]}, f"Flood {flood_level}m")

Export to local file (via geemap)

python
geemap.ee_export_image(risk, filename="../output/risk.tif", scale=30, region=aoi)

GEE Reference

IndexFormulaMeasures
NDVI(NIR − Red) / (NIR + Red)Vegetation density/health
NDBI(SWIR1 − NIR) / (SWIR1 + NIR)Built-up/impervious surface
MNDWI(Green − SWIR1) / (Green + SWIR1)Water bodies
CollectionIDResolution
Landsat 8/9 SRLANDSAT/LC08/C02/T1_L230m, 16-day
Sentinel-2 SRCOPERNICUS/S2_SR_HARMONIZED10m, 5-day
SRTM DEMUSGS/SRTMGL1_00330m, static
ESA WorldCoverESA/WorldCover/v20010m, 2021
Module 6 Check-In
Question 1 of 2
How do you map vegetation loss between 2015 and 2023?
Question 2 of 2
What does NDVI measure?
07

The Living Dashboard

Streamlit, pydeck, and HERE Traffic — assembling everything into an interactive digital twin of Semarang that decision-makers can explore.

M3 GeoPandas M5 PostGIS M6 GEE M7 Dashboard

The City's Shadow

A digital twin is a living, data-driven mirror of a physical system. The dashboard brings together everything from the previous 6 modules into one explorable interface.

🖥️
Streamlit
Python scripts become web apps. No HTML/CSS needed — st.map(), st.slider().
🗺️
pydeck / deck.gl
GPU-accelerated 3D maps. H3HexagonLayer, ArcLayer for OD flows.
🚗
HERE Traffic API
Real-time traffic speed, incidents, routing. Freemium (250k calls/mo).
🏍️
Ojol Flows
Ride-hailing OD flows as arc layers — demand corridors and transit gaps.

Dashboard Architecture

1
Data Layer: PostGIS + GEE exports
Buildings, zoning, risk rasters, census data in PostgreSQL.
psycopg2
2
Live Feed: HERE Traffic API
Real-time traffic flow for major corridors, assigned to H3 cells.
requests
3
H3 Aggregation
All metrics unified on a single hex grid. One ID joins everything.
h3-py
4
Visualization: pydeck 3D Layers
H3HexagonLayer for density, ArcLayer for OD flows.
pydeck
5
Interaction: Streamlit Widgets
Sliders, date pickers, zone selectors. Live map updates.
streamlit
Prerequisites
  • Module 5: PostGIS database urban_semarang with all tables loaded
  • Module 6: output/risk_composite.tif downloaded from GEE
  • Module 3: data/processed/h3_grid.gpkg

Build the Dashboard

1
Create the Streamlit app file

Create scripts/dashboard.py.

python
# scripts/dashboard.py
import json
import streamlit as st
import pandas as pd
import geopandas as gpd
import pydeck as pdk
import h3
from sqlalchemy import create_engine
from pathlib import Path

# ─── Config ───
CITY = json.loads(Path("data/city_config.json").read_text())
DB_URL = f"postgresql://postgres@localhost/urban_{CITY['name'].lower()}"
engine = create_engine(DB_URL)

st.set_page_config(page_title=f"{CITY['name']} Urban Dashboard", layout="wide")
st.title(f"🏙️ {CITY['name']} Urban Analytics Dashboard")
2
Load H3 data from PostGIS
python
# ─── Data Loading (cached) ───
@st.cache_data
def load_h3_data():
    df = gpd.read_postgis(
        "SELECT * FROM h3_building_summary",
        engine, geom_col="geom"
    )
    # Convert H3 geometry to lat/lon for pydeck
    df["lat"] = df.geometry.centroid.y
    df["lon"] = df.geometry.centroid.x
    return df

h3_data = load_h3_data()
st.sidebar.metric("Total Hexagons", len(h3_data))
st.sidebar.metric("Total Buildings", f"{h3_data['building_count'].sum():,}")
3
Add the H3 hexagon map
python
# ─── H3 Building Density Map ───
st.subheader("Building Density by H3 Cell")

metric = st.selectbox("Color by", ["building_count", "total_footprint", "avg_footprint"])

layer = pdk.Layer(
    "H3HexagonLayer",
    data=h3_data,
    get_hexagon="h3_cell",
    get_fill_color=f"[255, (1 - {metric} / {h3_data[metric].quantile(0.95)}) * 255, 0, 160]",
    get_elevation=metric,
    elevation_scale=2,
    extruded=True,
    pickable=True,
)

view = pdk.ViewState(
    latitude=CITY["lat"], longitude=CITY["lon"],
    zoom=11, pitch=45
)

st.pydeck_chart(pdk.Deck(layers=[layer], initial_view_state=view,
    tooltip={"text": f"H3: {{h3_cell}}\\n{metric}: {{{metric}}}"}))
4
Add HERE Traffic layer (optional)
python
# ─── HERE Traffic (optional — needs API key) ───
import requests

HERE_KEY = st.sidebar.text_input("HERE API Key", type="password")

if HERE_KEY:
    bbox_str = f"{CITY['bbox'][1]},{CITY['bbox'][0]},{CITY['bbox'][3]},{CITY['bbox'][2]}"
    url = f"https://data.traffic.hereapi.com/v7/flow?in=bbox:{bbox_str}&apiKey={HERE_KEY}"

    @st.cache_data(ttl=300)
    def fetch_traffic():
        resp = requests.get(url)
        if resp.status_code == 200:
            results = resp.json().get("results", [])
            rows = []
            for r in results:
                loc = r.get("location", {})
                flow = r.get("currentFlow", {})
                if "shape" in loc and "speed" in flow:
                    coords = loc["shape"]["links"][0]["points"]
                    mid = coords[len(coords)//2]
                    rows.append({
                        "lat": mid["lat"], "lon": mid["lng"],
                        "speed": flow["speed"],
                        "jam_factor": flow.get("jamFactor", 0),
                    })
            return pd.DataFrame(rows)
        return pd.DataFrame()

    traffic = fetch_traffic()
    if not traffic.empty:
        st.subheader(f"Live Traffic — {len(traffic)} segments")
        traffic["h3_cell"] = [h3.latlng_to_cell(r.lat, r.lon, 9) for _, r in traffic.iterrows()]
        st.dataframe(traffic.head(10))
5
Run the dashboard
bash
cd urban_analytics
streamlit run scripts/dashboard.py
Expected
  Local URL: http://localhost:8501
  Network URL: http://192.168.x.x:8501
Final output

Running Streamlit dashboard at localhost:8501 integrating PostGIS, H3, GEE risk maps, and (optionally) live HERE Traffic data for Semarang.

Dashboard Recipes

pydeck ArcLayer for OD flows

python
arc_layer = pdk.Layer("ArcLayer", data=od_df,
    get_source_position=["origin_lon", "origin_lat"],
    get_target_position=["dest_lon", "dest_lat"],
    get_source_color=[0, 128, 200],
    get_target_color=[200, 0, 80],
    get_width="trip_count", width_scale=0.5)

Streamlit sidebar filters

python
min_buildings = st.sidebar.slider("Min buildings per hex", 0, 500, 10)
filtered = h3_data[h3_data["building_count"] >= min_buildings]

Cache PostGIS queries

python
@st.cache_data(ttl=600)  # cache for 10 minutes
def query_buildings(zone_code):
    return gpd.read_postgis(f"SELECT * FROM buildings WHERE zone = '{zone_code}'", engine)

Streamlit & pydeck Reference

ComponentCode
Textst.title() / st.header() / st.write()
DataFramest.dataframe(df)
Chartst.bar_chart(df) / st.line_chart(df)
Mapst.pydeck_chart(deck)
Sliderst.slider("Label", min, max, default)
Selectst.selectbox("Label", options)
Sidebarst.sidebar.slider(...)
Caching@st.cache_data(ttl=N)
H3 Layerpdk.Layer("H3HexagonLayer", ...)
Arc Layerpdk.Layer("ArcLayer", ...)
Scatter Layerpdk.Layer("ScatterplotLayer", ...)

HERE Traffic API

EndpointURL PatternReturns
Flow/v7/flow?in=bbox:S,W,N,ESpeed, jamFactor per road segment
Incidents/v7/incidents?in=bbox:S,W,N,EAccidents, road closures, construction
Final Check-In
Question 1 of 2
Which H3 hexagons have both high flood risk AND high ojol demand? Which layers do you combine?
Question 2 of 2
Why is H3 the common spatial index across all dashboard layers?
🏆

Course Complete

You've built an end-to-end urban analytics pipeline — from raw open data to a living dashboard — reproducible for any Indonesian city. Change the city in city_config.json and re-run every notebook. The same code works everywhere.