Urban Geospatial Analytics — End-to-End Course for Indonesian Cities

00

Setup & Choose Your City

Configure your environment and select any Indonesian city. Every code example, dataset download, and analysis in this course will adapt to your choice.

Select Your Study City

All tutorials, coordinates, EPSG codes, and data downloads will automatically adapt.

Working with: Semarang — Jawa Tengah

📦 Download Starter Kit for Semarang

Everything you need in one ZIP — pre-configured for your selected city. Includes 37 Jupyter Notebooks, 37 source scripts, SQL files, requirements, and a ready-to-use city_config.json.

📦

Complete Course Kit (.zip)

All 76 files: notebooks (.ipynb), scripts (.py/.sql/.sh), city_config.json for Semarang, requirements.txt, and README. Just unzip, install, and run.

⚙️

City Config Only

Just the city_config.json for Semarang — swap this file to re-run the entire pipeline for a different city.

📄

Requirements Only

Python dependency list. Install with pip install -r requirements.txt inside your conda environment.

💻 Quick Start After Download

unzip urban_analytics_semarang.zip
cd urban_analytics
conda create -n urban-geo python=3.11 -y && conda activate urban-geo
pip install -r requirements.txt
jupyter lab

📚 Contents by Module

The ZIP includes these notebooks and scripts:

End-to-End Pipeline

Each module produces artifacts the next module consumes. By the end, you'll have a live dashboard integrating every layer.

M0 Setup→ M1 Python→ M2 Data→ M3 GeoPandas→ M4 Network→ M5 PostGIS→ M6 GEE→ M7 Dashboard

Why This Course Exists

Indonesian cities are growing faster than the tools planners use to understand them. Manual GIS workflows can't keep pace with the scale of land-use change, transport demand, and coastal risk across Java, Sulawesi, Kalimantan, and Sumatra.

This course teaches you to build an automated, reproducible urban analytics pipeline — from raw open data to an interactive dashboard — using only free, open-source tools. Pick any city; the same code works everywhere.

🔄

Reproducible

Change the city name; the entire pipeline re-runs. No manual steps.

🆓

Open Source

Python, PostGIS, GEE, OSM — no license fees, no vendor lock-in.

🇮🇩

Indonesia-First

BPS codes, RDTR zoning, KDB/KLB, ojol transport — built for local context.

📊

Decision-Ready

Ends with a Streamlit dashboard that non-technical stakeholders can explore.

Course Structure: Diátaxis

Every module is organized into four modes, based on the Diátaxis framework:

💡

Understand (Explanation)

The "why" — concepts, mental models, and design decisions. Read this when you want to deepen understanding.

🎯

Follow Along (Tutorial)

Step-by-step hands-on. Each module's tutorial picks up exactly where the last one left off. This is the spine of the course.

📋

Recipes (How-to)

Task-oriented snippets: "How do I reproject a shapefile?" "How do I create an H3 grid?" Quick answers when you know what you need.

📖

Reference

Function signatures, API parameters, EPSG lookup tables. The facts, without narrative.

Step-by-Step: Environment Setup

1

Install Miniconda

Download and install Miniconda from the official site. This gives you the conda package manager without bloat.

bash

# Linux / macOS
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

# Or macOS ARM (Apple Silicon)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
bash Miniconda3-latest-MacOSX-arm64.sh

2

Create the project environment

bash

conda create -n urban-geo python=3.11 -y
conda activate urban-geo

3

Install Python packages

bash

pip install pandas geopandas matplotlib seaborn shapely fiona pyproj \
    osmnx networkx h3 requests beautifulsoup4 \
    psycopg2-binary sqlalchemy geoalchemy2 \
    earthengine-api geemap \
    streamlit pydeck plotly folium

Expected

Successfully installed 50+ packages. No errors.

4

Install PostGIS

PostGIS needs PostgreSQL as a base. Install both:

bash

# Ubuntu/Debian
sudo apt install postgresql postgis -y

# macOS (Homebrew)
brew install postgresql@16 postgis

# Start the service
sudo systemctl start postgresql   # Linux
brew services start postgresql@16 # macOS

5

Create the spatial database

bash

sudo -u postgres createdb urban_semarang
sudo -u postgres psql -d urban_semarang -c "CREATE EXTENSION postgis; CREATE EXTENSION h3;"

Expected

CREATE EXTENSION
CREATE EXTENSION

Note

If h3 extension is not available, install h3-pg from source or via pgxn install h3. The h3 extension is optional — you can compute H3 in Python instead.

6

Create project folder structure

bash

mkdir -p urban_analytics/{data/raw,data/processed,output,notebooks,scripts}
cd urban_analytics

Your folder tree

urban_analytics/
├── data/
│   ├── raw/          # Downloaded shapefiles, CSVs, GeoJSON
│   └── processed/    # Cleaned, reprojected, H3-indexed
├── output/           # GeoPackage, GeoTIFF, exports
├── notebooks/        # Jupyter notebooks (one per module)
└── scripts/          # Reusable .py and .sql files

7

Authenticate Google Earth Engine

bash

earthengine authenticate

Follow the browser prompt to authorize. This is needed for Modules 6-7.

8

Verify everything works

python

import pandas as pd
import geopandas as gpd
import osmnx as ox
import h3
import ee

ee.Initialize()

print("pandas", pd.__version__)
print("geopandas", gpd.__version__)
print("osmnx", ox.__version__)
print("h3", h3.__version__)
print("GEE OK")
print("\n✓ All systems go. Ready for Module 1.")

Expected

pandas 2.x.x
geopandas 1.x.x
osmnx 2.x.x
h3 4.x.x
GEE OK

✓ All systems go. Ready for Module 1.

Artifacts produced

urban_analytics/ folder structure • urban_semarang PostGIS database • urban-geo conda environment

City Configuration Reference

These values update automatically when you change the city selector above.

Parameter	Value
City Name	Semarang
Province	Jawa Tengah
OSMnx Query	`Semarang, Indonesia`
Center Latitude	-6.97
Center Longitude	110.42
UTM EPSG	`EPSG:32749`
UTM Zone	49S
BPS Kota Code	3374
Coastal	Yes
Bounding Box	`[110.28, -7.12, 110.53, -6.88]`

Indonesian Data Sources

Source	Data Types	Coverage	Access
OpenStreetMap / Geofabrik	Roads, buildings, POIs, landuse	All cities	Free, bulk download or API
BPS (bps.go.id)	Population, economics, housing census	All kabupaten/kota	Free, web + API
InaGeoportal	Admin boundaries, RDTR zoning, land parcels	Varies by city	Free, download
Google Earth Engine	Landsat, Sentinel-2, SRTM DEM, land cover	Global	Free (research/education)
HERE Traffic API	Real-time traffic flow, incidents, routing	Major roads globally	Freemium (250k calls/mo free)
BMKG	Weather, climate, earthquake, tsunami	National	Free, web + API

01

Python Foundations

Core Python, pandas, and matplotlib — building the data-handling toolkit you'll use in every subsequent module.

M0 Setup→ M1 Python→ M2 Data→ M3 GeoPandas

Why Python for Urban Analytics?

Python is the glue language. It connects spatial libraries (GeoPandas), network analysis (NetworkX), databases (psycopg2), satellite imagery (GEE API), and dashboards (Streamlit) in a single workflow. No other language has this breadth.

In this course, Python isn't the destination — it's the vehicle. You'll learn just enough to drive it confidently through 7 modules of increasingly powerful geospatial tools.

Three Concepts That Matter Most

1

DataFrames are the universal container

Every dataset — census, buildings, traffic — becomes a pandas DataFrame (table with labeled columns). Master .groupby(), .merge(), and .apply() and you can handle 90% of urban data tasks.

2

Functions make analysis reproducible

Wrap your analysis in functions with parameters (city name, year, threshold). Run for Semarang today, switch to Surabaya tomorrow — same code, different inputs.

3

Dictionaries are configuration objects

Store your city parameters (coordinates, EPSG, BPS code) in a dict. Pass it to every function. This is how you make the pipeline city-agnostic.

Prerequisites

Module 0 completed: urban-geo conda environment active
urban_analytics/ folder structure created

Build Your City Configuration

1

Create the city config notebook

Open JupyterLab and create notebooks/01_python_foundations.ipynb. This notebook will produce a reusable city config that every future module imports.

python

# Cell 1 — City Configuration
# Change this ONE dict to switch the entire pipeline to a new city

CITY = {
    "name": "Semarang",
    "province": "Jawa Tengah",
    "osm_query": "Semarang, Indonesia",
    "lat": -6.97,
    "lon": 110.42,
    "epsg_utm": 32749,
    "bps_code": "3374",
    "coastal": True,
    "bbox": [110.28, -7.12, 110.53, -6.88],
}

print(f"Working with: {CITY['name']}, {CITY['province']}")

Expected

Working with: Semarang, Jawa Tengah

2

Practice with district data

Build a DataFrame of sample districts. We'll replace this with real BPS data in Module 2.

python

# Cell 2 — Practice DataFrames
import pandas as pd

# Sample district data — we'll fetch real data in Module 2
districts = pd.DataFrame({
    "kecamatan": ["Kec A", "Kec B", "Kec C", "Kec D", "Kec E"],
    "population": [85000, 120000, 65000, 95000, 110000],
    "area_km2": [12.5, 8.3, 22.1, 15.6, 9.8],
    "elevation_m": [3, 85, 210, 45, 5],
})

# Derived columns
districts["density"] = (districts["population"] / districts["area_km2"]).round(0)
districts["flood_risk"] = districts["elevation_m"].apply(
    lambda e: "high" if e < 5 else ("medium" if e < 20 else "low")
)

print(districts.to_string(index=False))

Expected

kecamatan  population  area_km2  elevation_m  density flood_risk
    Kec A       85000      12.5            3   6800.0       high
    Kec B      120000       8.3           85  14458.0        low
    Kec C       65000      22.1          210   2941.0        low
    Kec D       95000      15.6           45   6090.0        low
    Kec E      110000       9.8            5   11224.0    medium

3

Groupby and aggregation

python

# Cell 3 — Summary by risk category
summary = districts.groupby("flood_risk").agg(
    n_kecamatan=("kecamatan", "count"),
    total_pop=("population", "sum"),
    avg_density=("density", "mean"),
).round(0)

print(summary)

4

Create reusable helper functions

python

# Cell 4 — Functions we'll reuse throughout the course
import math

def haversine(lat1, lon1, lat2, lon2):
    """Distance in km between two GPS points."""
    R = 6371
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat/2)**2 + math.cos(math.radians(lat1)) * \
        math.cos(math.radians(lat2)) * math.sin(dlon/2)**2
    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))

def calculate_kdb(footprint_m2, land_area_m2):
    """KDB = building footprint / land parcel area."""
    return round(footprint_m2 / land_area_m2, 3) if land_area_m2 > 0 else 0

def classify_density(pop_per_km2):
    """Indonesian urban density classification."""
    if pop_per_km2 > 15000: return "very_high"
    if pop_per_km2 > 10000: return "high"
    if pop_per_km2 > 5000: return "medium"
    return "low"

# Test
print(f"Distance city center to bbox corner: "
      f"{haversine(CITY['lat'], CITY['lon'], CITY['bbox'][1], CITY['bbox'][0]):.1f} km")
print(f"KDB of 250m² on 400m² lot: {calculate_kdb(250, 400)}")
print(f"14,000 pop/km²: {classify_density(14000)}")

5

Save the city config for reuse

Export the CITY dict as JSON so every future notebook can import it.

python

# Cell 5 — Save config
import json
from pathlib import Path

config_path = Path("../data/city_config.json")
config_path.write_text(json.dumps(CITY, indent=2, ensure_ascii=False))
print(f"Saved to {config_path.resolve()}")

Artifacts produced → used by Module 2

data/city_config.json • notebooks/01_python_foundations.ipynb • Helper functions: haversine(), calculate_kdb(), classify_density()

Quick Recipes

Read CSV with Indonesian encoding

python

df = pd.read_csv("data.csv", encoding="utf-8", sep=";")  # BPS often uses semicolons

Filter rows by multiple conditions

python

high_risk = df[(df["elevation_m"] < 5) & (df["population"] > 50000)]

Pivot table for cross-tabulation

python

pivot = pd.pivot_table(df, values="population", index="kecamatan",
                       columns="year", aggfunc="sum")

Quick bar chart

python

import matplotlib.pyplot as plt

districts.sort_values("density").plot.barh(x="kecamatan", y="density", color="#2A7B9B")
plt.xlabel("Population Density (per km²)")
plt.title(f"Population Density by Kecamatan — {CITY['name']}")
plt.tight_layout()
plt.savefig("../output/density_chart.png", dpi=150)
plt.show()

pandas Essentials Reference

Operation	Code	Returns
Load CSV	`pd.read_csv(path)`	DataFrame
Shape	`df.shape`	(rows, cols) tuple
Column types	`df.dtypes`	Series of dtypes
Summary stats	`df.describe()`	count, mean, std, min, max
Filter rows	`df[df["col"] > val]`	Filtered DataFrame
New column	`df["new"] = expr`	Modified in place
Group + aggregate	`df.groupby("col").agg(...)`	Aggregated DataFrame
Merge tables	`pd.merge(a, b, on="key")`	Joined DataFrame
Apply function	`df["col"].apply(fn)`	Transformed Series
Sort	`df.sort_values("col")`	Sorted DataFrame
Drop NaN	`df.dropna(subset=["col"])`	Cleaned DataFrame
Save CSV	`df.to_csv(path, index=False)`	File on disk

Module 1 Check-In

Question 1 of 3

Why do we store city parameters in a dictionary rather than as separate variables?

Question 2 of 3

You have a DataFrame with 400,000 building records. You need the average footprint per kecamatan. What's the most efficient approach?

Question 3 of 3

What does df["density"] = df["population"] / df["area_km2"] do that a for-loop doesn't?

02

Acquiring City Data

Download real open data — administrative boundaries, buildings, roads, and census statistics — for Semarang using APIs, web scraping, and bulk downloads.

M0 Setup→ M1 Python→ M2 Data→ M3 GeoPandas

The Indonesian Open Data Landscape

Indonesia has rich — but scattered — geospatial data. The trick is knowing where to look and how to extract it programmatically so your pipeline stays reproducible.

🗺️

OpenStreetMap

Crowd-sourced map: buildings, roads, POIs. Excellent coverage in Javanese cities. API (Overpass) or bulk download (Geofabrik).

📊

BPS Statistics

National statistics bureau. Population by kelurahan, economic indicators, housing census. API available at webapi.bps.go.id.

🌐

InaGeoportal

Government geospatial portal. Admin boundaries, RDTR zoning, land parcels. Manual download (GeoJSON/SHP). Coverage varies.

🛰️

Earth Engine

Satellite imagery: Landsat, Sentinel-2, SRTM DEM. Global coverage, cloud-processed. Used in Module 6.

Prerequisites

Module 1 completed: data/city_config.json exists
urban-geo environment active

Download Real Data for Your City

1

Load your city config

Create notebooks/02_data_acquisition.ipynb.

python

import json, requests
import pandas as pd
import geopandas as gpd
from pathlib import Path

CITY = json.loads(Path("../data/city_config.json").read_text())
print(f"Acquiring data for: {CITY['name']}")

2

Download admin boundaries from OSM

OSMnx can download the boundary polygon of any named place in OpenStreetMap.

python

import osmnx as ox

# Download city boundary
boundary = ox.geocode_to_gdf(CITY["osm_query"])
boundary.to_file("../data/raw/admin_boundary.geojson", driver="GeoJSON")
print(f"Boundary area: {boundary.to_crs(epsg=CITY['epsg_utm']).area.iloc[0] / 1e6:.1f} km²")

Expected

Boundary area: ~373.7 km²  (varies by city)

3

Download buildings from Overpass API

The Overpass API lets you query OpenStreetMap for any feature type. We fetch all building footprints within the city's bounding box.

python

# Download building footprints via OSMnx
tags = {"building": True}
buildings = ox.features_from_place(CITY["osm_query"], tags=tags)

# Keep only polygons (discard nodes)
buildings = buildings[buildings.geometry.type.isin(["Polygon", "MultiPolygon"])].copy()

# Calculate footprint area in m²
buildings_utm = buildings.to_crs(epsg=CITY["epsg_utm"])
buildings["footprint_m2"] = buildings_utm.geometry.area.round(1)

# Save
cols = ["name", "building", "building:levels", "footprint_m2", "geometry"]
cols = [c for c in cols if c in buildings.columns]
buildings[cols].to_file("../data/raw/buildings.geojson", driver="GeoJSON")
print(f"Downloaded {len(buildings):,} building footprints")

Expected

Downloaded 45,000–200,000 building footprints (varies by city OSM coverage)

Large cities

For Jakarta or Surabaya, this may take several minutes and produce large files. Consider filtering by a smaller area first (pass bbox instead of place name).

4

Download road network

python

# Street network — we'll analyze this deeply in Module 4
G = ox.graph_from_place(CITY["osm_query"], network_type="drive")
ox.save_graphml(G, "../data/raw/street_network.graphml")

nodes, edges = ox.graph_to_gdfs(G)
print(f"Network: {len(nodes):,} intersections, {len(edges):,} road segments")

5

Fetch BPS population data

BPS provides an API for statistical tables. We fetch population by kecamatan.

python

# BPS API — population by sub-district
# Note: you need a free API key from webapi.bps.go.id
BPS_KEY = "YOUR_BPS_API_KEY"  # Register at webapi.bps.go.id

# Alternative: scrape the BPS website table
url = f"https://semarangkota.bps.go.id/indicator/12/71/1/jumlah-penduduk.html"
try:
    tables = pd.read_html(url)
    population = tables[0]
    population.to_csv("../data/raw/population_bps.csv", index=False)
    print(f"Scraped {len(population)} rows from BPS")
except Exception as e:
    print(f"BPS scraping failed: {e}")
    print("Tip: manually download from the BPS website and save as CSV")

BPS data tips

BPS website structure varies by city. If scraping fails, visit semarangkota.bps.go.id manually, find the population table, and save as CSV to data/raw/population_bps.csv. The pipeline continues either way.

6

Download POIs (points of interest)

python

# Key urban facilities
poi_tags = {
    "amenity": ["hospital", "school", "marketplace"],
    "shop": True,
}
pois = ox.features_from_place(CITY["osm_query"], tags=poi_tags)
pois_points = pois.copy()
pois_points["geometry"] = pois_points.geometry.centroid
pois_points[["name", "amenity", "shop", "geometry"]].dropna(
    subset=["name"]
).to_file("../data/raw/pois.geojson", driver="GeoJSON")
print(f"Downloaded {len(pois_points):,} POIs")

7

Verify your data inventory

python

# Quick inventory check
raw = Path("../data/raw")
for f in sorted(raw.glob("*")):
    size = f.stat().st_size / 1024
    unit = "KB"
    if size > 1024:
        size /= 1024; unit = "MB"
    print(f"  {f.name:<35} {size:>8.1f} {unit}")

Expected files

  admin_boundary.geojson                 45.2 KB
  buildings.geojson                      12.8 MB
  pois.geojson                          285.3 KB
  population_bps.csv                      3.2 KB
  street_network.graphml                  8.5 MB

Artifacts produced → used by Module 3

data/raw/admin_boundary.geojson • data/raw/buildings.geojson • data/raw/street_network.graphml • data/raw/pois.geojson • data/raw/population_bps.csv

Data Acquisition Recipes

Download OSM features by bounding box (faster for large cities)

python

buildings = ox.features_from_bbox(
    bbox=(north, south, east, west),  # use CITY["bbox"] values
    tags={"building": True}
)

Custom Overpass query (advanced filtering)

python

query = f"""
[out:json][timeout:120];
area["name"="{CITY['name']}"]["admin_level"="5"]->.city;
(way["building"="commercial"](area.city););
out body; >; out skel qt;
"""
response = requests.get("https://overpass-api.de/api/interpreter",
                        params={"data": query})

Download Geofabrik Indonesia extract

bash

wget https://download.geofabrik.de/asia/indonesia-latest-free.shp.zip
unzip indonesia-latest-free.shp.zip -d data/raw/geofabrik/

Data Sources Reference

Source	Function / URL	Returns	Coverage
OSMnx geocode	`ox.geocode_to_gdf(query)`	City boundary GeoDataFrame	All cities
OSMnx features	`ox.features_from_place(query, tags)`	GeoDataFrame of tagged features	All cities
OSMnx graph	`ox.graph_from_place(query)`	NetworkX MultiDiGraph	All cities
BPS API	`webapi.bps.go.id/v1/api/list`	JSON statistical tables	All kabupaten/kota
InaGeoportal	tanahair.indonesia.go.id	SHP / GeoJSON download	Varies
Geofabrik	download.geofabrik.de/asia/indonesia	SHP / PBF bulk extract	All Indonesia
SRTM DEM	via GEE: `USGS/SRTMGL1_003`	30m elevation raster	Global

Module 2 Check-In

Question 1 of 3

You want to count how many buildings are within each kecamatan. You have buildings as points and kecamatan as polygons. What operation is needed?

Question 2 of 3

Your downloaded buildings are in EPSG:4326 (GPS degrees) but you need footprint areas in square meters. What must you do first?

Question 3 of 3

Why do we save the city config as JSON rather than hardcoding values in each notebook?

03

Drawing the City

GeoPandas spatial analysis, KDB/KLB zoning compliance, and Uber H3 hexagonal grids — turning raw downloads into structured urban layers for Semarang.

M1 Python→ M2 Data→ M3 GeoPandas→ M4 Network

GeoDataFrame = Spreadsheet + Map

A GeoDataFrame is a pandas DataFrame with a geometry column. Every row has both tabular attributes (name, area, population) and a spatial shape (point, line, polygon). This means you can filter, group, and merge spatially — not just by column values.

📍

Points

Bus stops, building centroids, flood gauges

〰️

Lines

Roads, rivers, power lines, train routes

⬡

Polygons

Building footprints, zoning zones, flood zones

🗂️

CRS

Coordinate Reference System — maps Earth's curve onto flat numbers

KDB/KLB: Indonesian Zoning Decoded

Building
Footprint

Land Parcel (100%)The entire lot

KDB (footprint/land)Max ground coverage — e.g. 0.60 = 60%

KDH (green space)Min permeable surface

KLB (floor area/land)Total floors ÷ plot area

H3: The Universal Join Key

Uber's H3 divides Earth into hexagons at 16 resolution levels. Assign every dataset an H3 cell ID and you can join anything to anything — no complex spatial intersection needed.

Resolution	Avg Hex Area	Best For
`7`	~5.16 km²	District-level summaries
`8`	~0.74 km²	Neighborhood / OD flows
`9`	~0.11 km²	Block-level traffic analysis
`10`	~0.015 km²	Individual parcel scale

Prerequisites

Module 2 completed: data/raw/buildings.geojson and data/raw/admin_boundary.geojson exist
data/city_config.json available

Spatial Analysis & H3 Indexing

1

Load and inspect spatial data

Create notebooks/03_geopandas_h3.ipynb.

python

import json
import geopandas as gpd
import pandas as pd
import h3
import matplotlib.pyplot as plt
from shapely.geometry import Polygon
from pathlib import Path

CITY = json.loads(Path("../data/city_config.json").read_text())

# Load Module 2 outputs
buildings = gpd.read_file("../data/raw/buildings.geojson")
boundary = gpd.read_file("../data/raw/admin_boundary.geojson")

print(f"Buildings: {len(buildings):,} features, CRS: {buildings.crs}")
print(f"Boundary CRS: {boundary.crs}")

2

Reproject and compute building metrics

python

# Reproject to UTM for metric calculations
buildings_utm = buildings.to_crs(epsg=CITY["epsg_utm"])

# Building footprint area (m²)
buildings["footprint_m2"] = buildings_utm.geometry.area.round(1)

# Estimate floors if available from OSM tags
if "building:levels" in buildings.columns:
    buildings["floors"] = pd.to_numeric(
        buildings["building:levels"], errors="coerce"
    ).fillna(1).astype(int)
else:
    buildings["floors"] = 1

buildings["floor_area_m2"] = buildings["footprint_m2"] * buildings["floors"]

print(buildings[["footprint_m2", "floors", "floor_area_m2"]].describe().round(1))

3

Assign H3 cells to every building

python

# Convert building centroids to H3 cells at resolution 9
H3_RES = 9

centroids = buildings.geometry.centroid.to_crs(epsg=4326)
buildings["h3_cell"] = [
    h3.latlng_to_cell(pt.y, pt.x, H3_RES) for pt in centroids
]

print(f"Unique H3 cells: {buildings['h3_cell'].nunique():,}")
print(f"Sample cell: {buildings['h3_cell'].iloc[0]}")

Expected

Unique H3 cells: ~1,200–3,000
Sample cell: 89600c1a2ffffff

4

Aggregate building metrics per H3 cell

python

# Aggregate per hexagon
h3_stats = buildings.groupby("h3_cell").agg(
    building_count=("footprint_m2", "count"),
    total_footprint=("footprint_m2", "sum"),
    avg_footprint=("footprint_m2", "mean"),
    max_floors=("floors", "max"),
    total_floor_area=("floor_area_m2", "sum"),
).round(1)

# Create H3 hexagon geometries for mapping
def h3_to_polygon(cell_id):
    boundary = h3.cell_to_boundary(cell_id)
    return Polygon([(lng, lat) for lat, lng in boundary])

h3_grid = gpd.GeoDataFrame(
    h3_stats,
    geometry=[h3_to_polygon(c) for c in h3_stats.index],
    crs="EPSG:4326"
)
h3_grid.index.name = "h3_cell"

print(f"H3 grid: {len(h3_grid)} hexagons")
print(h3_grid.head())

5

Simulate KDB compliance check

Without official parcel boundaries, we approximate KDB at the H3 cell level: total building footprint / hexagon area.

python

# H3 res-9 hex area in m² (approximate)
HEX_AREA_M2 = 105000  # ~0.105 km²

h3_grid["kdb_approx"] = (h3_grid["total_footprint"] / HEX_AREA_M2).round(3)
h3_grid["density_class"] = pd.cut(
    h3_grid["kdb_approx"],
    bins=[0, 0.1, 0.3, 0.5, 1.0, float("inf")],
    labels=["sparse", "low", "medium", "high", "very_high"]
)

print(h3_grid["density_class"].value_counts())

6

Plot and save outputs

python

# Map: building density by H3 cell
fig, ax = plt.subplots(figsize=(12, 10))
boundary.to_crs(epsg=4326).boundary.plot(ax=ax, color="black", linewidth=1)
h3_grid.plot(column="kdb_approx", cmap="YlOrRd", legend=True,
             ax=ax, alpha=0.7, edgecolor="grey", linewidth=0.2)
ax.set_title(f"Building Density (KDB approx) — {CITY['name']}")
plt.savefig("../output/h3_building_density.png", dpi=150, bbox_inches="tight")
plt.show()

# Save processed data
buildings.to_file("../data/processed/buildings_processed.gpkg", driver="GPKG")
h3_grid.reset_index().to_file("../data/processed/h3_grid.gpkg", driver="GPKG")
print("Saved: buildings_processed.gpkg, h3_grid.gpkg")

Artifacts produced → used by Modules 4, 5, 7

data/processed/buildings_processed.gpkg • data/processed/h3_grid.gpkg • output/h3_building_density.png

GeoPandas & H3 Recipes

Reproject to local UTM

python

gdf_utm = gdf.to_crs(epsg=CITY["epsg_utm"])  # meters for area/distance

Spatial join: points in polygons

python

joined = gpd.sjoin(buildings, zoning, how="left", predicate="within")

Buffer around coastline

python

coast_buffer = coastline.to_crs(epsg=CITY["epsg_utm"]).buffer(500)  # 500m

H3: k-ring smoothing

python

# Average metric over a cell and its 6 neighbors
def kring_smooth(row, df, col, k=1):
    neighbors = h3.grid_disk(row.name, k)
    vals = df.loc[df.index.isin(neighbors), col]
    return vals.mean()

h3_grid["density_smooth"] = h3_grid.apply(
    kring_smooth, axis=1, df=h3_grid, col="building_count"
)

GeoPandas & H3 Reference

Operation	Code
Read file	`gpd.read_file("path.gpkg")`
Reproject	`gdf.to_crs(epsg=32749)`
Area (m²)	`gdf_utm.geometry.area`
Centroid	`gdf.geometry.centroid`
Buffer	`gdf_utm.buffer(500)` (meters)
Spatial join	`gpd.sjoin(left, right, predicate="within")`
Overlay	`gpd.overlay(a, b, how="intersection")`
Save GeoPackage	`gdf.to_file("out.gpkg", driver="GPKG")`
H3: point to cell	`h3.latlng_to_cell(lat, lon, res)`
H3: cell boundary	`h3.cell_to_boundary(cell_id)`
H3: neighbors	`h3.grid_disk(cell_id, k)`
H3: resolution	`h3.get_resolution(cell_id)`

KDB Calculator

Land Area (m²)

Building Footprint (m²)

Total Floor Area (m²)

Zone Type

—

KDB Actual

—

KLB Actual

—

Green Space %

Module 3 Check-In

Question 1 of 3

A K-1 commercial zone has KDB max 0.80. A building occupies 500 m² on a 600 m² lot. Is it compliant?

Question 2 of 3

Traffic data and flood risk data have different polygon boundaries. How do you combine them?

Question 3 of 3

For neighborhood-block-scale analysis (~100-300m), which H3 resolution?

04

City as a Network

OSMnx and NetworkX turn street maps into mathematical graphs — revealing chokepoints, accessibility gaps, and optimal facility locations in Semarang.

M2 Data→ M3 GeoPandas→ M4 Network→ M5 PostGIS

Streets as Social Networks

Think of your city's street network like a social network. Intersections are people (nodes). Roads are friendships (edges). Some intersections are on the shortest path between everyone — they're the bottlenecks. Others sit in dead-end corners.

🔴

Betweenness Centrality

How often does traffic pass through this intersection? High = critical chokepoint.

🔵

Closeness Centrality

How quickly can you reach everywhere else? High = well-connected center.

⏱️

Isochrones

All points reachable within N minutes — service coverage halos.

Prerequisites

Module 2: data/raw/street_network.graphml exists
Module 3: data/processed/h3_grid.gpkg exists

Network Analysis for Your City

1

Load the street network

Create notebooks/04_network_analysis.ipynb.

python

import json
import osmnx as ox
import networkx as nx
import geopandas as gpd
import matplotlib.pyplot as plt
from pathlib import Path

CITY = json.loads(Path("../data/city_config.json").read_text())

# Load the graph saved in Module 2
G = ox.load_graphml("../data/raw/street_network.graphml")
G_undir = ox.convert.to_undirected(G)

nodes, edges = ox.graph_to_gdfs(G_undir)
print(f"Network: {len(nodes):,} nodes, {len(edges):,} edges")

2

Calculate centrality metrics

This takes 2-10 minutes depending on city size. For large networks, sample a subgraph first.

python

# Betweenness centrality (may take several minutes for large cities)
print("Calculating betweenness centrality...")
bc = nx.betweenness_centrality(G_undir, weight="length")

print("Calculating closeness centrality...")
cc = nx.closeness_centrality(G_undir, distance="length")

# Attach to nodes GeoDataFrame
nodes["betweenness"] = nodes.index.map(bc)
nodes["closeness"] = nodes.index.map(cc)

print(f"Top betweenness: {nodes['betweenness'].max():.6f}")
print(f"Top closeness: {nodes['closeness'].max():.6f}")

Large cities (Jakarta, Surabaya)

For 50k+ nodes, use nx.betweenness_centrality(G, k=500) to sample 500 random source nodes. This gives a good approximation in seconds instead of hours.

3

Generate isochrones

An isochrone shows all locations reachable within a time limit from a starting point. We'll generate 5, 10, and 15-minute walking isochrones from the city center.

python

# Walk network for isochrones
G_walk = ox.graph_from_place(CITY["osm_query"], network_type="walk")

# Find nearest node to city center
center_node = ox.nearest_nodes(G_walk, CITY["lon"], CITY["lat"])

# Walking speed ~ 4.5 km/h = 1.25 m/s
WALK_SPEED = 1.25
for u, v, data in G_walk.edges(data=True):
    data["time"] = data["length"] / WALK_SPEED

# Isochrone polygons
isochrones = []
for minutes in [5, 10, 15]:
    subgraph = nx.ego_graph(G_walk, center_node,
                            radius=minutes * 60, distance="time")
    iso_nodes = ox.graph_to_gdfs(subgraph, edges=False)
    hull = iso_nodes.unary_union.convex_hull
    isochrones.append({"minutes": minutes, "geometry": hull})

iso_gdf = gpd.GeoDataFrame(isochrones, crs="EPSG:4326")
iso_gdf.to_file("../data/processed/isochrones.gpkg", driver="GPKG")
print(f"Generated {len(iso_gdf)} isochrones")

4

Plot centrality map and save

python

# Betweenness centrality map
fig, ax = plt.subplots(figsize=(14, 10))
edges_gdf = ox.graph_to_gdfs(G_undir, nodes=False)
edges_gdf.plot(ax=ax, color="#ddd", linewidth=0.3)
nodes.plot(ax=ax, column="betweenness", cmap="hot_r",
           markersize=nodes["betweenness"] * 50000,
           alpha=0.7, legend=True)
ax.set_title(f"Betweenness Centrality — {CITY['name']}")
ax.set_axis_off()
plt.savefig("../output/betweenness_map.png", dpi=150, bbox_inches="tight")
plt.show()

# Save centrality nodes
nodes.to_file("../data/processed/centrality_nodes.gpkg", driver="GPKG")
print("Saved: centrality_nodes.gpkg")

Artifacts produced → used by Modules 5, 7

data/processed/centrality_nodes.gpkg • data/processed/isochrones.gpkg • output/betweenness_map.png

Network Recipes

Shortest path between two points

python

orig = ox.nearest_nodes(G, lon1, lat1)
dest = ox.nearest_nodes(G, lon2, lat2)
route = ox.shortest_path(G, orig, dest, weight="length")
ox.plot_graph_route(G, route)

Network stats summary

python

stats = ox.basic_stats(G, area=boundary_area_m2)
print(f"Street density: {stats['street_density_km']:.1f} km/km²")
print(f"Avg edge length: {stats['street_length_avg']:.0f} m")

Sample centrality for large networks

python

bc_approx = nx.betweenness_centrality(G, k=500, weight="length")  # fast approximation

Network Analysis Reference

Function	Purpose
`ox.graph_from_place(q)`	Download street graph for a named place
`ox.graph_from_bbox(n,s,e,w)`	Download by bounding box
`ox.save_graphml(G, path)`	Save graph to disk
`ox.load_graphml(path)`	Reload saved graph
`ox.graph_to_gdfs(G)`	Convert to GeoDataFrames (nodes, edges)
`ox.nearest_nodes(G, lon, lat)`	Find nearest node to a coordinate
`ox.shortest_path(G, o, d)`	Find shortest path between two nodes
`nx.betweenness_centrality(G)`	How often a node lies on shortest paths
`nx.closeness_centrality(G)`	Average inverse distance to all other nodes
`nx.ego_graph(G, n, radius=r)`	Subgraph within radius of node n

Module 4 Check-In

Question 1 of 3

You need to place a new bus terminal. Should you optimize for betweenness centrality or closeness centrality?

Question 2 of 3

One ring road has betweenness 10x higher than any other road. What does this mean?

Question 3 of 3

How do you measure 15-minute walkability for residential areas?

05

The City's Long Memory

PostGIS turns PostgreSQL into a spatial database — permanent storage for millions of records with lightning-fast spatial queries via SQL.

M3 GeoPandas→ M4 Network→ M5 PostGIS→ M6 GEE

Why a Database Instead of Files?

GeoPandas loads everything into RAM. Fine for 50,000 buildings. But when you have 400,000 buildings across 30 years, multiple team members, and queries like "find every building within 500m of the coast that violates KDB" — you need a database.

📦

Persistent Storage

Millions of records on disk. No reloading CSVs on every restart.

⚡

GIST Index

Spatial index makes "find within polygon" run in milliseconds.

🔗

Multi-User

Entire team queries the same database simultaneously.

🗄️

Materialized Views

Pre-compute expensive spatial joins once, refresh on demand.

Urban Scaling Laws

When a city doubles in population, some metrics scale sublinearly (infrastructure — economies of scale), linearly (housing), or superlinearly (innovation, congestion). The exponent β reveals the pattern.

β < 1

Sublinear

Roads, gas stations, power cables

β ≈ 1

Linear

Housing, water use, total jobs

β > 1

Superlinear

Innovation, wealth, congestion, crime

Prerequisites

Module 0: PostGIS database urban_semarang created
Modules 2-4: processed GeoPackage files in data/processed/

Load Everything Into PostGIS

1

Load spatial data using ogr2ogr

ogr2ogr (part of GDAL) bulk-loads spatial files into PostGIS tables.

bash

# Load buildings
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/processed/buildings_processed.gpkg \
  -nln buildings -overwrite -lco GEOMETRY_NAME=geom

# Load H3 grid
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/processed/h3_grid.gpkg \
  -nln h3_grid -overwrite -lco GEOMETRY_NAME=geom

# Load centrality nodes
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/processed/centrality_nodes.gpkg \
  -nln centrality_nodes -overwrite -lco GEOMETRY_NAME=geom

# Load admin boundary
ogr2ogr -f "PostgreSQL" \
  PG:"dbname=urban_semarang user=postgres" \
  data/raw/admin_boundary.geojson \
  -nln admin_boundary -overwrite -lco GEOMETRY_NAME=geom

2

Create spatial indexes

sql

-- Connect: psql -d urban_semarang

CREATE INDEX IF NOT EXISTS idx_buildings_geom ON buildings USING GIST(geom);
CREATE INDEX IF NOT EXISTS idx_h3_grid_geom ON h3_grid USING GIST(geom);
CREATE INDEX IF NOT EXISTS idx_centrality_geom ON centrality_nodes USING GIST(geom);

-- Verify tables loaded
SELECT tablename, pg_size_pretty(pg_total_relation_size(tablename::text))
FROM pg_tables
WHERE schemaname = 'public';

Expected

 tablename        | pg_size_pretty
------------------+----------------
 buildings        | 45 MB
 h3_grid          | 512 KB
 centrality_nodes | 8 MB
 admin_boundary   | 64 KB

3

Spatial queries: buildings near the coast

sql

-- Find buildings within 500m of a coastal point
SELECT count(*) AS coastal_buildings,
       round(avg(footprint_m2)::numeric, 1) AS avg_footprint
FROM buildings
WHERE ST_DWithin(
    geom::geography,
    ST_SetSRID(ST_MakePoint(110.42, -6.97), 4326)::geography,
    500  -- meters
);

4

Building density per H3 cell (SQL version)

sql

-- Join buildings to H3 hexagons and aggregate
CREATE MATERIALIZED VIEW h3_building_summary AS
SELECT
    h.h3_cell,
    h.geom,
    count(b.*) AS building_count,
    coalesce(sum(b.footprint_m2), 0) AS total_footprint,
    coalesce(avg(b.footprint_m2), 0)::numeric(10,1) AS avg_footprint
FROM h3_grid h
LEFT JOIN buildings b ON ST_Contains(h.geom, b.geom)
GROUP BY h.h3_cell, h.geom;

CREATE INDEX idx_h3_summary_geom ON h3_building_summary USING GIST(geom);

-- Check result
SELECT count(*) AS hexagons,
       sum(building_count) AS total_buildings
FROM h3_building_summary;

5

Query from Python (for the dashboard)

Create notebooks/05_postgis.ipynb and test querying from Python.

python

import geopandas as gpd
from sqlalchemy import create_engine

engine = create_engine("postgresql://postgres@localhost/urban_semarang")

# Load H3 building summary as GeoDataFrame
h3_summary = gpd.read_postgis(
    "SELECT * FROM h3_building_summary",
    engine, geom_col="geom"
)
print(f"Loaded {len(h3_summary)} hexagons from PostGIS")
print(h3_summary[["h3_cell", "building_count", "total_footprint"]].head())

Artifacts produced → used by Module 7

PostGIS database urban_semarang with tables: buildings, h3_grid, centrality_nodes, admin_boundary • Materialized view: h3_building_summary

PostGIS Recipes

Find all features within a polygon

sql

SELECT * FROM buildings WHERE ST_Within(geom, (SELECT geom FROM zoning WHERE zone_code = 'K-1'));

Buffer + count (coastal exposure)

sql

SELECT count(*) FROM buildings
WHERE ST_DWithin(geom::geography,
  (SELECT ST_Boundary(geom)::geography FROM admin_boundary LIMIT 1), 1000);

Refresh a materialized view

sql

REFRESH MATERIALIZED VIEW h3_building_summary;

Urban scaling: β exponent via SQL

sql

SELECT REGR_SLOPE(ln(built_area), ln(population)) AS beta
FROM city_stats
WHERE population > 100000;

PostGIS Function Reference

Function	Purpose
`ST_Contains(a, b)`	True if geometry a fully contains b
`ST_Within(a, b)`	True if a is fully within b
`ST_Intersects(a, b)`	True if a and b share any space
`ST_DWithin(a, b, d)`	True if a and b are within distance d
`ST_Buffer(geom, d)`	Create buffer zone of distance d around geometry
`ST_Area(geom)`	Area in CRS units (use geography for m²)
`ST_Distance(a, b)`	Distance between geometries
`ST_Transform(geom, srid)`	Reproject to a different CRS
`ST_SetSRID(geom, srid)`	Assign CRS without transforming coordinates
`ST_MakePoint(lon, lat)`	Create point from coordinates

Module 5 Check-In

Question 1 of 2

2M building records spanning 20 years. "Which kelurahan had the most green-to-built conversion?" GeoPandas or PostGIS?

Question 2 of 2

What does ST_Contains(z.geom, b.geom) check?

06

Eyes in the Sky

Google Earth Engine processes petabytes of satellite imagery in the cloud — land-use classification, change detection, and coastal risk modeling for Semarang.

M4 Network→ M5 PostGIS→ M6 GEE→ M7 Dashboard

A DVR for the Planet

GEE gives you access to decades of Landsat and Sentinel imagery, processed on Google's servers. The course uses it to answer two questions: "How has land use changed?" and "Which areas face the highest flood risk?"

1

Filter Image Collection

Select Landsat 8 scenes from 2023, over your city, with <20% cloud cover.

2

Compute Spectral Indices

NDVI (vegetation), NDBI (built-up), MNDWI (water) — band-math that reveals what's on the ground.

3

Classify Land Use

Train a Random Forest on labeled samples, classify every pixel.

4

Detect Change

Compare 2015 vs 2023 classifications pixel-by-pixel.

Risk Layer Mixer

Toggle risk dimensions to see how the composite score changes.

Hazard (elevation, slope, coast distance)w = 0.40

Exposure (population, built area, assets)w = 0.35

Vulnerability (green cover, impervious surface)w = 0.25

Composite Risk100%

0.73

Simulated score for a low-elevation coastal kelurahan

Prerequisites

Module 0: earthengine authenticate completed
Module 2: data/raw/admin_boundary.geojson for city extent

Satellite Analysis & Risk Modeling

1

Initialize GEE and define study area

Create notebooks/06_earth_engine.ipynb.

python

import json, ee
import geemap
from pathlib import Path

ee.Initialize()
CITY = json.loads(Path("../data/city_config.json").read_text())

# Define study area from bounding box
bbox = CITY["bbox"]
aoi = ee.Geometry.Rectangle([bbox[0], bbox[1], bbox[2], bbox[3]])

print(f"Study area: {CITY['name']}")
print(f"Bounding box: {bbox}")

2

Create cloud-free composites for two years

python

def get_composite(year, aoi):
    """Cloud-masked median composite from Landsat 8/9."""
    collection = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
        .merge(ee.ImageCollection("LANDSAT/LC09/C02/T1_L2"))
        .filterBounds(aoi)
        .filterDate(f"{year}-01-01", f"{year}-12-31")
        .filter(ee.Filter.lt("CLOUD_COVER", 20)))

    def mask_clouds(img):
        qa = img.select("QA_PIXEL")
        mask = qa.bitwiseAnd(1 << 3).eq(0).And(qa.bitwiseAnd(1 << 4).eq(0))
        return img.updateMask(mask)

    return collection.map(mask_clouds).median().clip(aoi)

composite_2015 = get_composite(2015, aoi)
composite_2023 = get_composite(2023, aoi)
print("Composites ready")

3

Compute spectral indices

python

def add_indices(img):
    """Add NDVI, NDBI, MNDWI to a Landsat image."""
    ndvi = img.normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI")
    ndbi = img.normalizedDifference(["SR_B6", "SR_B5"]).rename("NDBI")
    mndwi = img.normalizedDifference(["SR_B3", "SR_B6"]).rename("MNDWI")
    return img.addBands([ndvi, ndbi, mndwi])

img_2015 = add_indices(composite_2015)
img_2023 = add_indices(composite_2023)

# Visualize NDVI
Map = geemap.Map(center=[CITY["lat"], CITY["lon"]], zoom=11)
Map.addLayer(img_2023.select("NDVI"), {"min": -0.2, "max": 0.8, "palette": ["red", "yellow", "green"]}, "NDVI 2023")
Map

4

Classify land use with Random Forest

python

# Training points (replace with your own labeled samples)
# Classes: 0=water, 1=vegetation, 2=built-up, 3=bare_soil
bands = ["SR_B2","SR_B3","SR_B4","SR_B5","SR_B6","SR_B7","NDVI","NDBI","MNDWI"]

# Use a GEE training dataset or create your own FeatureCollection
# For reproducibility, use ESA WorldCover as pseudo-labels
worldcover = ee.Image("ESA/WorldCover/v200").clip(aoi)

# Sample training points from WorldCover
training = img_2023.select(bands).addBands(worldcover.rename("label")).sample(
    region=aoi, scale=30, numPixels=5000, seed=42
)

# Train Random Forest
classifier = ee.Classifier.smileRandomForest(50).train(
    features=training, classProperty="label", inputProperties=bands
)

# Classify both years
classified_2015 = img_2015.select(bands).classify(classifier)
classified_2023 = img_2023.select(bands).classify(classifier)
print("Classification complete")

5

Build the risk composite

python

# Elevation from SRTM
dem = ee.Image("USGS/SRTMGL1_003").clip(aoi)
elevation = dem.select("elevation")
slope = ee.Terrain.slope(dem)

# Normalize to 0-1 (higher value = more hazardous)
h_elev = elevation.multiply(-1).add(50).divide(50).clamp(0, 1)  # low elevation = high risk
h_slope = slope.multiply(-1).add(30).divide(30).clamp(0, 1)

# Built-up density as exposure proxy
ndbi_norm = img_2023.select("NDBI").add(1).divide(2).clamp(0, 1)

# Green cover as vulnerability proxy (less green = more vulnerable)
ndvi_inv = img_2023.select("NDVI").multiply(-1).add(1).divide(2).clamp(0, 1)

# Composite
risk = (h_elev.multiply(0.40)
    .add(ndbi_norm.multiply(0.35))
    .add(ndvi_inv.multiply(0.25)))

# Visualize
Map = geemap.Map(center=[CITY["lat"], CITY["lon"]], zoom=11)
Map.addLayer(risk, {"min": 0, "max": 1, "palette": ["green", "yellow", "red"]}, "Risk")
Map

6

Export results for the dashboard

python

# Export risk raster to Google Drive (then download to output/)
task = ee.batch.Export.image.toDrive(
    image=risk,
    description=f"risk_composite_{CITY['name'].lower()}",
    folder="urban_analytics",
    region=aoi,
    scale=30,
    maxPixels=1e9
)
task.start()
print(f"Export started. Check Google Drive > urban_analytics/")

# Also export the change detection
change = classified_2023.subtract(classified_2015).rename("change")
task2 = ee.batch.Export.image.toDrive(
    image=change, description=f"land_change_{CITY['name'].lower()}",
    folder="urban_analytics", region=aoi, scale=30, maxPixels=1e9
)
task2.start()
print("Change detection export started")

After export completes

Download the GeoTIFFs from Google Drive to output/risk_composite.tif and output/land_change.tif. These feed into the dashboard in Module 7.

Artifacts produced → used by Module 7

output/risk_composite.tif • output/land_change.tif (via Google Drive)

GEE Recipes

NDVI time series for a point

python

point = ee.Geometry.Point([CITY["lon"], CITY["lat"]])
ts = (ee.ImageCollection("LANDSAT/LC08/C02/T1_L2")
    .filterBounds(point).select("SR_B5", "SR_B4")
    .map(lambda img: img.normalizedDifference(["SR_B5", "SR_B4"]).rename("NDVI")
         .set("date", img.date().format())))

Bathtub flood model

python

flood_level = 2  # meters above sea level
flooded = dem.select("elevation").lte(flood_level)
Map.addLayer(flooded.selfMask(), {"palette": ["blue"]}, f"Flood {flood_level}m")

Export to local file (via geemap)

python

geemap.ee_export_image(risk, filename="../output/risk.tif", scale=30, region=aoi)

GEE Reference

Index	Formula	Measures
NDVI	(NIR − Red) / (NIR + Red)	Vegetation density/health
NDBI	(SWIR1 − NIR) / (SWIR1 + NIR)	Built-up/impervious surface
MNDWI	(Green − SWIR1) / (Green + SWIR1)	Water bodies

Collection	ID	Resolution
Landsat 8/9 SR	`LANDSAT/LC08/C02/T1_L2`	30m, 16-day
Sentinel-2 SR	`COPERNICUS/S2_SR_HARMONIZED`	10m, 5-day
SRTM DEM	`USGS/SRTMGL1_003`	30m, static
ESA WorldCover	`ESA/WorldCover/v200`	10m, 2021

Module 6 Check-In

Question 1 of 2

How do you map vegetation loss between 2015 and 2023?

Question 2 of 2

What does NDVI measure?

07

The Living Dashboard

Streamlit, pydeck, and HERE Traffic — assembling everything into an interactive digital twin of Semarang that decision-makers can explore.

M3 GeoPandas→ M5 PostGIS→ M6 GEE→ M7 Dashboard

The City's Shadow

A digital twin is a living, data-driven mirror of a physical system. The dashboard brings together everything from the previous 6 modules into one explorable interface.

🖥️

Streamlit

Python scripts become web apps. No HTML/CSS needed — st.map(), st.slider().

🗺️

pydeck / deck.gl

GPU-accelerated 3D maps. H3HexagonLayer, ArcLayer for OD flows.

🚗

HERE Traffic API

Real-time traffic speed, incidents, routing. Freemium (250k calls/mo).

🏍️

Ojol Flows

Ride-hailing OD flows as arc layers — demand corridors and transit gaps.

Dashboard Architecture

1

Data Layer: PostGIS + GEE exports

Buildings, zoning, risk rasters, census data in PostgreSQL.

psycopg2

2

Live Feed: HERE Traffic API

Real-time traffic flow for major corridors, assigned to H3 cells.

requests

3

H3 Aggregation

All metrics unified on a single hex grid. One ID joins everything.

h3-py

4

Visualization: pydeck 3D Layers

H3HexagonLayer for density, ArcLayer for OD flows.

pydeck

5

Interaction: Streamlit Widgets

Sliders, date pickers, zone selectors. Live map updates.

streamlit

Prerequisites

Module 5: PostGIS database urban_semarang with all tables loaded
Module 6: output/risk_composite.tif downloaded from GEE
Module 3: data/processed/h3_grid.gpkg

Build the Dashboard

1

Create the Streamlit app file

Create scripts/dashboard.py.

python

# scripts/dashboard.py
import json
import streamlit as st
import pandas as pd
import geopandas as gpd
import pydeck as pdk
import h3
from sqlalchemy import create_engine
from pathlib import Path

# ─── Config ───
CITY = json.loads(Path("data/city_config.json").read_text())
DB_URL = f"postgresql://postgres@localhost/urban_{CITY['name'].lower()}"
engine = create_engine(DB_URL)

st.set_page_config(page_title=f"{CITY['name']} Urban Dashboard", layout="wide")
st.title(f"🏙️ {CITY['name']} Urban Analytics Dashboard")

2

Load H3 data from PostGIS

python

# ─── Data Loading (cached) ───
@st.cache_data
def load_h3_data():
    df = gpd.read_postgis(
        "SELECT * FROM h3_building_summary",
        engine, geom_col="geom"
    )
    # Convert H3 geometry to lat/lon for pydeck
    df["lat"] = df.geometry.centroid.y
    df["lon"] = df.geometry.centroid.x
    return df

h3_data = load_h3_data()
st.sidebar.metric("Total Hexagons", len(h3_data))
st.sidebar.metric("Total Buildings", f"{h3_data['building_count'].sum():,}")

3

Add the H3 hexagon map

python

# ─── H3 Building Density Map ───
st.subheader("Building Density by H3 Cell")

metric = st.selectbox("Color by", ["building_count", "total_footprint", "avg_footprint"])

layer = pdk.Layer(
    "H3HexagonLayer",
    data=h3_data,
    get_hexagon="h3_cell",
    get_fill_color=f"[255, (1 - {metric} / {h3_data[metric].quantile(0.95)}) * 255, 0, 160]",
    get_elevation=metric,
    elevation_scale=2,
    extruded=True,
    pickable=True,
)

view = pdk.ViewState(
    latitude=CITY["lat"], longitude=CITY["lon"],
    zoom=11, pitch=45
)

st.pydeck_chart(pdk.Deck(layers=[layer], initial_view_state=view,
    tooltip={"text": f"H3: {{h3_cell}}\\n{metric}: {{{metric}}}"}))

4

Add HERE Traffic layer (optional)

python

# ─── HERE Traffic (optional — needs API key) ───
import requests

HERE_KEY = st.sidebar.text_input("HERE API Key", type="password")

if HERE_KEY:
    bbox_str = f"{CITY['bbox'][1]},{CITY['bbox'][0]},{CITY['bbox'][3]},{CITY['bbox'][2]}"
    url = f"https://data.traffic.hereapi.com/v7/flow?in=bbox:{bbox_str}&apiKey={HERE_KEY}"

    @st.cache_data(ttl=300)
    def fetch_traffic():
        resp = requests.get(url)
        if resp.status_code == 200:
            results = resp.json().get("results", [])
            rows = []
            for r in results:
                loc = r.get("location", {})
                flow = r.get("currentFlow", {})
                if "shape" in loc and "speed" in flow:
                    coords = loc["shape"]["links"][0]["points"]
                    mid = coords[len(coords)//2]
                    rows.append({
                        "lat": mid["lat"], "lon": mid["lng"],
                        "speed": flow["speed"],
                        "jam_factor": flow.get("jamFactor", 0),
                    })
            return pd.DataFrame(rows)
        return pd.DataFrame()

    traffic = fetch_traffic()
    if not traffic.empty:
        st.subheader(f"Live Traffic — {len(traffic)} segments")
        traffic["h3_cell"] = [h3.latlng_to_cell(r.lat, r.lon, 9) for _, r in traffic.iterrows()]
        st.dataframe(traffic.head(10))

5

Run the dashboard

bash

cd urban_analytics
streamlit run scripts/dashboard.py

Expected

  Local URL: http://localhost:8501
  Network URL: http://192.168.x.x:8501

Final output

Running Streamlit dashboard at localhost:8501 integrating PostGIS, H3, GEE risk maps, and (optionally) live HERE Traffic data for Semarang.

Dashboard Recipes

pydeck ArcLayer for OD flows

python

arc_layer = pdk.Layer("ArcLayer", data=od_df,
    get_source_position=["origin_lon", "origin_lat"],
    get_target_position=["dest_lon", "dest_lat"],
    get_source_color=[0, 128, 200],
    get_target_color=[200, 0, 80],
    get_width="trip_count", width_scale=0.5)

Streamlit sidebar filters

python

min_buildings = st.sidebar.slider("Min buildings per hex", 0, 500, 10)
filtered = h3_data[h3_data["building_count"] >= min_buildings]

Cache PostGIS queries

python

@st.cache_data(ttl=600)  # cache for 10 minutes
def query_buildings(zone_code):
    return gpd.read_postgis(f"SELECT * FROM buildings WHERE zone = '{zone_code}'", engine)

Streamlit & pydeck Reference

Component	Code
Text	`st.title() / st.header() / st.write()`
DataFrame	`st.dataframe(df)`
Chart	`st.bar_chart(df) / st.line_chart(df)`
Map	`st.pydeck_chart(deck)`
Slider	`st.slider("Label", min, max, default)`
Select	`st.selectbox("Label", options)`
Sidebar	`st.sidebar.slider(...)`
Caching	`@st.cache_data(ttl=N)`
H3 Layer	`pdk.Layer("H3HexagonLayer", ...)`
Arc Layer	`pdk.Layer("ArcLayer", ...)`
Scatter Layer	`pdk.Layer("ScatterplotLayer", ...)`

HERE Traffic API

Endpoint	URL Pattern	Returns
Flow	`/v7/flow?in=bbox:S,W,N,E`	Speed, jamFactor per road segment
Incidents	`/v7/incidents?in=bbox:S,W,N,E`	Accidents, road closures, construction

Final Check-In

Question 1 of 2

Which H3 hexagons have both high flood risk AND high ojol demand? Which layers do you combine?

Question 2 of 2

Why is H3 the common spatial index across all dashboard layers?

🏆

Course Complete

You've built an end-to-end urban analytics pipeline — from raw open data to a living dashboard — reproducible for any Indonesian city. Change the city in city_config.json and re-run every notebook. The same code works everywhere.