[methodology]

TECHNICAL
DEEP DIVE

How we transform fragmented global conflict data into actionable intelligence through autonomous collection, normalization, and aggregation.

The Challenge

DATA SILOS

Global conflict data is often fragmented across disparate sources—academic studies, NGO field reports, and news aggregators. This fragmentation creates blind spots.

Without a unified view, humanitarian response is reactive rather than proactive. We bridge these gaps by normalizing and visualizing multi-source data in a single accessible interface.

Architecture

TECH STACK

A modern, scalable architecture designed for real-time data ingestion, processing, and high-performance visualization.

Layer | Technology / Tools
Data ETL (Data Pipeline) | Prefect + Python
Web Crawling | Python
Data Persistence | PostgreSQL + PostGIS (for relational, geospatial, and time-series data, plus flexible document structures)
API Layer / Backend | Supabase
Frontend / Visualization | React (Next.js)
Geodata / Maps | Leaflet / GeoJSON / Three.js / D3.js
Container / Orchestration | Docker, Kubernetes

Standard 01

COLLECT

Before analysis begins, we must gather intelligence. We autonomously scrape, fetch, and ingest raw reports from verified global conflict databases such as UCDP, ACLED, and AWSD, along with country reference data from the REST Countries API.
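
As a rough illustration of the ingestion step, the sketch below pages through a source until it returns no more records. It is not the production collector: `collect_all`, `fake_fetch_page`, and the page-number convention are hypothetical names and assumptions, and the stand-in fetcher replaces what would be a real HTTP call to one of the sources.

```python
from typing import Callable, Iterator

Page = list[dict]


def collect_all(fetch_page: Callable[[int], Page], max_pages: int = 1000) -> Iterator[dict]:
    """Yield raw records page by page until the source returns an empty page."""
    for page_number in range(1, max_pages + 1):
        records = fetch_page(page_number)
        if not records:
            break  # an empty page is assumed to mean "no more data"
        yield from records


def fake_fetch_page(page_number: int) -> Page:
    """Hypothetical stand-in for an HTTP request to a paginated source."""
    pages = {
        1: [{"id": "a"}, {"id": "b"}],
        2: [{"id": "c"}],
    }
    return pages.get(page_number, [])


raw_events = list(collect_all(fake_fetch_page))
```

Passing the fetcher in as a function keeps the paging loop independent of any one source, so each database can supply its own request logic. The `max_pages` cap is a safety net against a source that never returns an empty page.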

Collection Visualization
Standard 02

NORMALIZE

Raw data varies wildly—timestamps, coordinates, and event types are inconsistent across sources. We ingest disparate formats and map them to a unified schema through a rigorous Silver Layer ETL process:

  • 01. Dimension Table Creation: Created dim_country based on the REST Countries API reference data, establishing the canonical country_name_common and country_name_official keys.
  • 02. Fact Table Standardization: Transformed raw event logs into normalized fact tables (fact_acled_events, fact_ucdp_gedevents, fact_ngo_incidents) and converted raw coordinates into PostGIS GEOGRAPHY types.
  • 03. Geospatial Linking: Joined all event fact tables with dim_country using the country_name_common key to enforce referential integrity and spatial consistency.
  • 04. Enrichment: Integrated auxiliary datasets including fact_acled_conflict_index for volatility metrics and fact_ucdp_sources for citation metadata.
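
The mapping to a unified schema can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Silver Layer code: the source labels, the raw field names in `FIELD_MAPS`, and the output column names other than country_name_common are hypothetical and are not taken from the real ACLED, UCDP, or NGO exports. Coordinates are emitted as EWKT text with SRID 4326, a form PostGIS can cast to GEOGRAPHY.

```python
from datetime import date, datetime, timezone

# Hypothetical per-source field names; real exports will differ.
FIELD_MAPS = {
    "source_a": {"id": "event_id", "date": "event_date", "lat": "latitude",
                 "lon": "longitude", "country": "country"},
    "source_b": {"id": "id", "date": "timestamp", "lat": "lat",
                 "lon": "lng", "country": "country_name"},
}


def parse_event_date(value) -> str:
    """Return an ISO 8601 date from Unix epoch seconds or an ISO-like string."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc).date().isoformat()
    return date.fromisoformat(str(value)[:10]).isoformat()


def to_ewkt_point(latitude: float, longitude: float) -> str:
    """Build an EWKT point (longitude first) after a basic range check."""
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        raise ValueError(f"coordinates out of range: {latitude}, {longitude}")
    return f"SRID=4326;POINT({longitude} {latitude})"


def normalize_event(raw: dict, source: str) -> dict:
    """Map one raw record onto the unified (illustrative) schema."""
    fields = FIELD_MAPS[source]
    return {
        "source": source,
        "source_event_id": str(raw[fields["id"]]),
        "event_date": parse_event_date(raw[fields["date"]]),
        "country_name_common": raw[fields["country"]].strip(),
        "geog": to_ewkt_point(float(raw[fields["lat"]]), float(raw[fields["lon"]])),
    }


event_a = normalize_event(
    {"event_id": "A1", "event_date": "2024-03-15", "latitude": "50.45",
     "longitude": "30.5", "country": " Ukraine "},
    "source_a",
)
event_b = normalize_event(
    {"id": 7, "timestamp": 1710460800, "lat": 50.45, "lng": 30.5,
     "country_name": "Ukraine"},
    "source_b",
)
```

Both sample records describe the same place and day in different raw formats and come out with identical event_date, country_name_common, and geog values, which is what makes the later join against dim_country possible.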
Standard 03

AGGREGATE

Structured & Geospatial Data

PostgreSQL + PostGIS: Stores strict relational data, conflict statistics, and time series. PostGIS powers all geospatial logic, from boundaries to point clustering.

Aggregated Data Marts

Gold Layer (Materialized Views): Pre-calculated statistical views (e.g., country_year, global_month) stored as materialized views. Optimized for instant dashboard performance.
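
To make the idea of a pre-calculated view concrete, the sketch below computes a country_year style rollup in memory. In the platform itself this work is done by materialized views in PostgreSQL. The function name `aggregate_country_year`, the `fatalities` field, and the output column names are assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_country_year(events: list[dict]) -> list[dict]:
    """Roll events up to one row per (country, year), sorted by key."""
    totals = defaultdict(lambda: {"event_count": 0, "fatalities": 0})
    for event in events:
        year = int(event["event_date"][:4])  # event_date is assumed ISO 8601
        key = (event["country_name_common"], year)
        totals[key]["event_count"] += 1
        totals[key]["fatalities"] += event.get("fatalities", 0)
    return [
        {"country_name_common": country, "year": year, **values}
        for (country, year), values in sorted(totals.items())
    ]


sample_events = [
    {"country_name_common": "Ukraine", "event_date": "2023-11-02", "fatalities": 3},
    {"country_name_common": "Ukraine", "event_date": "2024-01-10", "fatalities": 1},
    {"country_name_common": "Ukraine", "event_date": "2024-03-15"},
    {"country_name_common": "Sudan", "event_date": "2024-02-01", "fatalities": 5},
]
country_year = aggregate_country_year(sample_events)
```

Computing these rows once and storing them means the dashboard reads a handful of summary rows instead of scanning every event on each request.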

Real-time API Layer

Supabase: Bridges the database and frontend. Generates consolidated, performant views that feed the interactive dashboard in real time.

Standard 04

VISUALIZE

Data is useless if it's not actionable. We transform millions of data points into interactive heatmaps, temporal timelines, and cluster analyses for immediate insight.

Events are matched to the correct country on the map using a robust lookup hierarchy: standard country codes come first (the ISO 3166-1 codes cca2 and ccn3, then the IOC code cioc), and country_name_common is used only as a last resort.
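
The lookup hierarchy can be sketched as a chain of dictionary lookups tried in priority order. This is an illustrative sketch, not the actual matching code: `build_indexes`, `resolve_country`, and the flat record layout are hypothetical, and the two sample rows stand in for dim_country.

```python
# Priority order: codes first, common name only as the last resort.
LOOKUP_ORDER = ("cca2", "ccn3", "cioc", "country_name_common")


def build_indexes(countries: list[dict]) -> dict:
    """Build one case-insensitive lookup table per identifier type."""
    indexes = {key: {} for key in LOOKUP_ORDER}
    for country in countries:
        for key in LOOKUP_ORDER:
            value = country.get(key)
            if value:
                indexes[key][str(value).strip().casefold()] = country
    return indexes


def resolve_country(event: dict, indexes: dict):
    """Return the first country matched in priority order, or None."""
    for key in LOOKUP_ORDER:
        value = event.get(key)
        if value:
            match = indexes[key].get(str(value).strip().casefold())
            if match is not None:
                return match
    return None


dim_country = [
    {"cca2": "DE", "ccn3": "276", "cioc": "GER", "country_name_common": "Germany"},
    {"cca2": "UA", "ccn3": "804", "cioc": "UKR", "country_name_common": "Ukraine"},
]
indexes = build_indexes(dim_country)

by_code = resolve_country({"cca2": "ua", "country_name_common": "Germany"}, indexes)
by_name = resolve_country({"country_name_common": " germany "}, indexes)
unmatched = resolve_country({"country_name_common": "Atlantis"}, indexes)
```

In the first call the code and the name disagree, and the code wins because it is checked first. Names are tried last because spellings vary between sources far more than codes do.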