§ 1 · Overview

The 90-Day Lead Time Problem

Existing humanitarian early warning systems, including FEWS NET and IPC cadres, provide effective lead times of 30–45 days before a food crisis reaches emergency thresholds. Pre-positioning food aid, mobilising logistics, and securing emergency funding through multilateral mechanisms together require a minimum of 60–90 days. On these figures, warnings arrive roughly 15 to 60 days too late for a full response to be staged.

CERES is designed to close this gap. It produces falsifiable, probabilistic 90-day forecasts of acute food insecurity, expressed as P(IPC Phase 3+), the probability that a monitored region will reach crisis-level hunger within 90 days, with explicit 90% sensitivity intervals.

Operational Scope

CERES predictions are research outputs intended to augment, not replace, field-based IPC assessments and expert humanitarian judgement. All forecasts carry explicit uncertainty quantification and should be interpreted alongside ground-truth verification.

The system ingests six core data streams covering rainfall (CHIRPS), vegetation (MODIS NDVI), conflict (UCDP GED), acute food insecurity phase (IPC), food access (WFP VAM), and market prices (FAO GIEWS), with FEWS NET outlooks used as a supplementary cross-check. These are synthesised by the Hypothesis Generation Engine (HGE) into ranked driver hypotheses, which feed a calibrated logistic model producing probabilistic risk scores at national level across 43 high-risk countries.

§ 2 · Pipeline Architecture

Six-Stage Processing Pipeline

Each pipeline run proceeds through six sequential stages. Runs execute weekly, every Monday at 06:00 UTC. The run identifier (e.g. CERES-20260302-060000) is recorded with every prediction, enabling complete reproducibility and audit.
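The schedule and identifier format described above can be sketched in a few lines. This is an illustration only: the function names `next_run_time` and `run_id` are hypothetical, and the identifier pattern is inferred from the single example given (CERES-20260302-060000), not from the CERES codebase.

```python
from datetime import datetime, timedelta, timezone


def next_run_time(now: datetime) -> datetime:
    """Return the first Monday 06:00 UTC at or after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=6, minute=0, second=0, microsecond=0)
    # Monday is weekday 0; step forward to the Monday of this or next week.
    candidate += timedelta(days=(7 - candidate.weekday()) % 7)
    if candidate < now:
        # It is already Monday but past 06:00, so use next week's slot.
        candidate += timedelta(days=7)
    return candidate


def run_id(run_time: datetime) -> str:
    """Format a run timestamp as CERES-YYYYMMDD-HHMMSS in UTC."""
    return run_time.astimezone(timezone.utc).strftime("CERES-%Y%m%d-%H%M%S")


scheduled = next_run_time(datetime(2026, 2, 27, 12, 0, tzinfo=timezone.utc))
print(run_id(scheduled))  # CERES-20260302-060000
```

Because the identifier is derived purely from the UTC run time, two runs can only collide if they start in the same second.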

Signal Ingestion

Six data adapters ingest satellite, conflict, market, and displacement streams. Each adapter normalises its source to a shared 0.25° spatial grid (~28km cells) and ISO-week temporal cadence. Raw data is cached with source provenance and retrieval timestamp.
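A minimal sketch of the normalisation step follows, assuming each observation arrives as a latitude/longitude pair with a calendar date. The index origin (rows counted north from 90°S, columns east from 180°W) and the names `grid_cell` and `iso_week_key` are assumptions made for illustration, not documented CERES conventions.

```python
import math
from datetime import date

CELL_DEG = 0.25  # grid resolution stated in the text


def grid_cell(lat: float, lon: float) -> tuple[int, int]:
    """Return the (row, col) index of the 0.25-degree cell containing a point.

    Rows count north from 90 S and columns east from 180 W; that origin is
    an assumption of this sketch.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    row = math.floor((lat + 90.0) / CELL_DEG)
    col = math.floor((lon + 180.0) / CELL_DEG)
    return row, col


def iso_week_key(d: date) -> str:
    """Return an ISO-8601 week label such as '2026-W10'."""
    iso_year, iso_week, _ = d.isocalendar()
    return f"{iso_year}-W{iso_week:02d}"


print(grid_cell(15.5, 32.56))          # (422, 850)
print(iso_week_key(date(2026, 3, 2)))  # 2026-W10
```

Keying every adapter's output by the same (cell, ISO week) pair is what allows records from different sources to be joined without source-specific logic downstream.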

Stress Scoring

Per-Admin1 composite stress scores are computed as a weighted sum across six sub-scores: drought stress, vegetation anomaly, conflict intensity, food access, IPC phase, and market price deviation. Weights are set as expert-informed priors, initialised against 87 IPC transition records (2011–2023), and will be updated via maximum likelihood estimation (MLE) as the prospective verification archive accumulates.
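The weighted sum can be illustrated as below. The weights shown are hypothetical placeholders chosen to sum to 1; the actual CERES priors are not stated in this document. The sketch also assumes each sub-score has already been normalised to the range 0 to 1.

```python
# Hypothetical weights for illustration only; they are NOT the CERES priors.
ILLUSTRATIVE_WEIGHTS = {
    "drought": 0.20,
    "vegetation": 0.15,
    "conflict": 0.25,
    "food_access": 0.15,
    "ipc_phase": 0.15,
    "market_price": 0.10,
}


def composite_stress(sub_scores: dict, weights: dict = ILLUSTRATIVE_WEIGHTS) -> float:
    """Weighted sum of the six sub-scores, each assumed to lie in [0, 1]."""
    missing = set(weights) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in weights:
        if not 0.0 <= sub_scores[name] <= 1.0:
            raise ValueError(f"sub-score {name!r} is outside [0, 1]")
    return sum(weights[name] * sub_scores[name] for name in weights)


example = {
    "drought": 0.8,
    "vegetation": 0.6,
    "conflict": 0.9,
    "food_access": 0.5,
    "ipc_phase": 0.5,
    "market_price": 0.4,
}
css = composite_stress(example)
```

With weights that sum to 1 and sub-scores in [0, 1], the composite score stays in [0, 1], which keeps it comparable across regions.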

HGE: Hypothesis Generation Engine

Elevated signals are clustered into ranked driver hypotheses. Each hypothesis identifies a primary causal mechanism (e.g. conflict-driven market failure), supporting evidence records, and a confidence weight. Up to three hypotheses are generated per region per run.
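The cap of three hypotheses per region per run amounts to a ranking followed by a cut-off. The sketch below assumes candidates are ranked by their confidence weight alone; the `Hypothesis` class and its fields are hypothetical, and the actual HGE ranking criteria may differ.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    mechanism: str     # primary causal mechanism, e.g. "conflict-driven market failure"
    confidence: float  # confidence weight; higher means better supported


def top_hypotheses(candidates: list, limit: int = 3) -> list:
    """Return at most `limit` hypotheses, highest confidence first."""
    return sorted(candidates, key=lambda h: h.confidence, reverse=True)[:limit]


candidates = [
    Hypothesis("drought-driven crop failure", 0.41),
    Hypothesis("conflict-driven market failure", 0.72),
    Hypothesis("import price shock", 0.18),
    Hypothesis("displacement-driven access loss", 0.55),
]
ranked = top_hypotheses(candidates)
```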

Probabilistic Forecast

A logistic regression model converts composite stress scores into P(IPC Phase 3+) at a 90-day horizon. Sensitivity intervals at the 90% level are generated by input perturbation, that is, by resampling the input signals. Both the point estimate and the full interval are reported for every prediction.

Tier Classification

Predictions are assigned to one of three alert tiers based on probability thresholds. Tier assignment triggers downstream alerting and determines the urgency framing in intelligence reports.

Grading & Calibration

At T+90 days, each prediction is graded against the published IPC outcome for that region-month. Brier scores, sensitivity-interval (SI) coverage, and precision/recall metrics are updated continuously. Calibration failures trigger model review.
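For reference, the Brier score used in grading is the mean squared difference between the forecast probability and the binary outcome (1 if IPC Phase 3+ was observed, 0 otherwise). A minimal sketch, with a hypothetical function name:

```python
def brier_score(forecasts: list, outcomes: list) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# A forecaster that always says 0.5 scores 0.25 regardless of outcomes.
print(brier_score([0.5, 0.5], [1, 0]))  # 0.25
```

Lower is better: a perfect forecaster scores 0, and an uninformative constant forecast of 0.5 scores 0.25.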

§ 3 · Hypothesis Generation Engine

The HGE: From Signals to Hypotheses

The Hypothesis Generation Engine (HGE) is the core intelligence layer that distinguishes CERES from threshold-based early warning systems. Rather than flagging when a single indicator crosses a threshold, HGE synthesises multi-source signal convergence into causal hypotheses: ranked, evidenced explanations of why risk is elevated.

Signal Convergence Detection

HGE monitors for simultaneous stress elevation across independent data streams. When two or more signals from different domains (e.g. CHIRPS rainfall deficit + UCDP GED conflict events + WFP VAM food access deterioration) converge on the same Admin1 region in the same time window, this constitutes a convergence event: a materially stronger signal than any single indicator in isolation.
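The convergence rule can be sketched as a grouping operation. The elevation threshold of 0.6, the record layout, and the function name are assumptions for illustration; the document does not state how HGE decides that a signal is elevated.

```python
from collections import defaultdict


def convergence_events(signals: list, threshold: float = 0.6, min_domains: int = 2) -> dict:
    """Find (region, week) pairs where several distinct domains are elevated.

    Each signal is a dict with keys "region", "week", "domain" and "stress".
    The threshold is a hypothetical value, not a CERES parameter.
    """
    elevated = defaultdict(set)
    for s in signals:
        if s["stress"] >= threshold:
            elevated[(s["region"], s["week"])].add(s["domain"])
    # Two elevated signals from the same domain do not count as convergence.
    return {key: sorted(domains) for key, domains in elevated.items()
            if len(domains) >= min_domains}


signals = [
    {"region": "R1", "week": "2026-W10", "domain": "rainfall", "stress": 0.8},
    {"region": "R1", "week": "2026-W10", "domain": "conflict", "stress": 0.7},
    {"region": "R1", "week": "2026-W10", "domain": "food_access", "stress": 0.3},
    {"region": "R2", "week": "2026-W10", "domain": "rainfall", "stress": 0.9},
]
events = convergence_events(signals)
```

In this example only R1 qualifies: R2 has a single elevated domain, so it would be an isolated signal, not a convergence event.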

Hypothesis Taxonomy

Each convergence event is classified into one of four primary causal archetypes:

Archetype	Primary Signals	Typical Regions
Conflict-driven	UCDP GED, IPC	Sudan, Somalia, Yemen, South Sudan
Climate-driven	CHIRPS, MODIS NDVI, FAO GIEWS	Sahel, Horn (off-conflict seasons)
Economic/market	WFP VAM, FAO GIEWS prices	Urban centres, import-dependent regions
Multi-causal	All six core inputs	Active conflict zones with drought overlay

Evidence Records

Every hypothesis is grounded in structured evidence records: individual signal observations that either support or contradict the hypothesis. Each record specifies source, variable name, observed value, baseline threshold, deviation direction, and a binary support/contradict verdict.
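One way to represent such a record is an immutable data class carrying the six fields listed above. The field names, the `make_record` helper, and the example values are hypothetical; the sketch only mirrors the structure the text describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    source: str      # e.g. "CHIRPS"
    variable: str    # variable name within the source
    observed: float  # observed value
    baseline: float  # baseline threshold
    direction: str   # deviation direction: "above" or "below" the baseline
    supports: bool   # binary verdict: True = supports, False = contradicts


def make_record(source: str, variable: str, observed: float,
                baseline: float, stress_direction: str) -> EvidenceRecord:
    """Build a record; it supports the hypothesis when the deviation
    points in the direction that indicates stress for this variable."""
    direction = "above" if observed > baseline else "below"
    return EvidenceRecord(source, variable, observed, baseline,
                          direction, supports=(direction == stress_direction))


# Illustrative values: a rainfall index below its threshold indicates stress.
rain = make_record("CHIRPS", "rainfall_spi", -1.8, -1.0, stress_direction="below")
```

Freezing the record keeps the evidence chain tamper-evident within a run: a verdict cannot be changed after the record is created.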

Design Principle

HGE never produces a prediction without an auditable evidence chain. Every probability estimate has a traceable hypothesis. Every hypothesis has traceable evidence records. This is a deliberate design constraint: it is what makes CERES predictions defensible to institutional reviewers.

§ 4 · Probabilistic Model

Logistic Model & Sensitivity Intervals

CERES uses a calibrated logistic regression model to convert composite stress scores into IPC Phase 3+ exceedance probabilities at a 90-day horizon. The choice of logistic regression is deliberate: it is well-understood, natively probabilistic, and produces outputs that are straightforwardly interpretable by non-technical reviewers.

The model coefficients were fit on a smaller Horn-of-Africa and Arabian-Peninsula set, initialised against 87 IPC transition records across 31 countries (2011–2023), and are deployed across 43 high-risk countries; forward performance on the full deployment set is established prospectively, not assumed from the initialisation fit.

Core Model

P(IPC 3+ | X, t+90) = σ(β₀ + β₁·CSS + β₂·conflict + β₃·NDVI_anomaly + β₄·rainfall_SPI + β₅·IPC_current)

where σ is the logistic function, CSS is the composite stress score, and the coefficients β are expert-informed priors initialised from 87 country-season IPC transition records (initialisation AUC = 0.84 on a held-out split, not a settled forward-accuracy claim). Full coefficient values are published in the arXiv paper and open-source repository.
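The formula above translates directly into code. In the sketch below the feature names follow the equation, but every coefficient value is a placeholder: the published coefficients are in the arXiv paper and repository and are not reproduced here.

```python
import math

FEATURES = ("CSS", "conflict", "NDVI_anomaly", "rainfall_SPI", "IPC_current")


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def p_ipc3_plus(x: dict, beta0: float, betas: dict) -> float:
    """P(IPC 3+ | X, t+90) = sigmoid(beta0 + sum of beta_i * x_i)."""
    z = beta0 + sum(betas[name] * x[name] for name in FEATURES)
    return sigmoid(z)


# Placeholder coefficients for illustration; NOT the CERES values.
beta0 = -3.0
betas = {"CSS": 4.0, "conflict": 0.5, "NDVI_anomaly": -0.3,
         "rainfall_SPI": -0.4, "IPC_current": 0.6}
x = {"CSS": 0.6, "conflict": 1.0, "NDVI_anomaly": -1.0,
     "rainfall_SPI": -1.5, "IPC_current": 2.0}
p = p_ipc3_plus(x, beta0, betas)
```

Because the linear predictor passes through the logistic function, the output is always strictly between 0 and 1 for finite inputs, which is why no separate clipping step is needed.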

Sensitivity Intervals

Point estimates alone are insufficient for humanitarian decision-making. CERES generates 90% sensitivity intervals via input-perturbation resampling: input signals are resampled within their stated uncertainty bounds across n=2,000 replications, and the resulting distribution of predictions forms the interval. This captures data variability and model sensitivity to input uncertainty.

Sensitivity Interval Construction

SI₉₀ = [P̂₅, P̂₉₅] where P̂ₖ is the k-th percentile of the input-perturbation distribution
n_replications = 2,000 · Empirical SI coverage target: ≥88% (populating as graded outcomes accumulate from Aug–Oct 2026)
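The interval construction can be sketched as follows. Two assumptions are made that the document does not specify: inputs are perturbed with independent Gaussian noise whose standard deviation stands in for the "stated uncertainty bounds", and percentiles are taken by nearest rank. `toy_predict` is a stand-in model with made-up coefficients, not the CERES model.

```python
import math
import random


def toy_predict(inputs: dict) -> float:
    """Stand-in model with made-up coefficients (not the CERES model)."""
    z = -4.0 + 5.0 * inputs["css"]
    return 1.0 / (1.0 + math.exp(-z))


def sensitivity_interval(predict, inputs: dict, sigmas: dict,
                         n: int = 2000, seed: int = 0) -> tuple:
    """Return (P5, P95) of predictions under Gaussian input perturbation."""
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    draws = []
    for _ in range(n):
        perturbed = {k: rng.gauss(v, sigmas.get(k, 0.0)) for k, v in inputs.items()}
        draws.append(predict(perturbed))
    draws.sort()
    # Nearest-rank percentiles on the sorted replications.
    return draws[int(0.05 * (n - 1))], draws[int(0.95 * (n - 1))]


point = toy_predict({"css": 0.6})
lo, hi = sensitivity_interval(toy_predict, {"css": 0.6}, {"css": 0.05})
```

Note that an interval built this way reflects sensitivity to input uncertainty only; it does not account for uncertainty in the coefficients themselves.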

Coefficient Update, 11 March 2026

Following live deployment, the initial coefficients (arXiv v1) were found to produce logit saturation across the monitored-country distribution: all 43 countries returned P(IPC 3+) > 0.99, eliminating discriminative utility. The intercept was adjusted from −2.10 to −4.50, composite_stress β from 5.80 to 3.20, convergence_score β from 2.20 to 1.40, and n_independent β from 0.40 to 0.20. The recalibrated model produces meaningful separation across the full CSS range (P = 0.036 at CSS = 0.15 through P = 0.994 at CSS = 0.85). Updated coefficients will be reflected in arXiv v2, which is planned to accompany the first prospective validation results in Q3 2026.

§ 5 · Tier Classification

Alert Tier Definitions

Predictions are assigned to one of three alert tiers based on the point estimate of P(IPC Phase 3+). Tier thresholds are calibrated to IPC phase transition probabilities estimated from the validation dataset.

Tier I · Critical

≥ 70%

IPC Phase 3+ (Crisis or above) probable within 90 days. Immediate humanitarian pre-positioning recommended.

Tier II · Warning

45–69%

IPC Phase 3 (Crisis) likely within 90 days. Enhanced monitoring and contingency planning indicated.

Tier III · Watch

< 45%

Elevated stress signals. Situational monitoring and early preparedness recommended.
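The tier thresholds above map onto a simple lookup. The sketch assumes the Tier II band is the half-open interval from 0.45 up to, but not including, 0.70, so that a value such as 69.5% falls in Tier II; the function name is hypothetical.

```python
def alert_tier(p: float) -> str:
    """Map a point estimate of P(IPC Phase 3+) to an alert tier."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p >= 0.70:
        return "Tier I · Critical"
    if p >= 0.45:
        return "Tier II · Warning"
    return "Tier III · Watch"


print(alert_tier(0.82))  # Tier I · Critical
print(alert_tier(0.45))  # Tier II · Warning
print(alert_tier(0.10))  # Tier III · Watch
```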

Important

Tier I classification does not constitute a famine declaration. Only the IPC Global Platform, through its established cadre process and field verification, has the mandate to declare famine (IPC Phase 5). CERES Tier I indicates only that the estimated probability of reaching Phase 3 or above within 90 days is at least 70%.

§ 6 · Validation & Calibration

Model Performance

CERES is initialised against 87 IPC transition records spanning 31 countries between 2011 and 2023. Four prospective performance targets are set, and all are currently pending. T+90 grading windows open June 2026; because observed IPC outcomes publish on a 2–4 month lag, the first graded values are expected Aug–Oct 2026.

Brier Score

Pending

Target <0.10. T+90 grading windows open June 2026; first graded values expected Aug–Oct 2026.

SI Coverage (90%)

Pending

Target >88%. Empirical coverage of the 90% sensitivity interval, populating as graded outcomes accumulate from Aug–Oct 2026.

Tier-I Precision

Pending

Target >80%, populating once 30 Tier-I alerts have been graded (from Aug–Oct 2026).

Tier-I Recall

Pending

Target >85%, populating once 10 IPC Phase 4+ events have been graded (from Aug–Oct 2026).
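Precision and recall for the Tier-I metrics can be computed from two aligned lists of graded predictions, as sketched below. The function is generic: which predictions count as "alerted" and which outcomes count as an "event" (for example Phase 3+ or Phase 4+) is left to the caller, since the exact event definitions used for grading are not fully specified here.

```python
def precision_recall(alerted: list, occurred: list) -> tuple:
    """Return (precision, recall); a metric is None if its denominator is zero."""
    if len(alerted) != len(occurred):
        raise ValueError("inputs must be the same length")
    true_pos = sum(1 for a, o in zip(alerted, occurred) if a and o)
    n_alerted = sum(1 for a in alerted if a)
    n_occurred = sum(1 for o in occurred if o)
    precision = true_pos / n_alerted if n_alerted else None
    recall = true_pos / n_occurred if n_occurred else None
    return precision, recall


# Four graded predictions: two alerts, two events, one overlap.
print(precision_recall([True, True, False, False],
                       [True, False, True, False]))  # (0.5, 0.5)
```

Returning None when there are no alerts or no events yet matches the "Pending" state shown above: the metric is undefined, not zero.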

§ 7 · Limitations

Known Limitations & Constraints

Data Latency

Several input streams (IPC, FAO GIEWS) are updated bi-annually or monthly. Between updates, predictions rely on interpolated or lagged data, which may not capture rapidly deteriorating situations driven by sudden shocks (conflict escalation, flash flooding).

Admin1 Resolution

Predictions are generated at Admin1 (provincial) level. Intra-provincial heterogeneity, particularly in large regions like Oromia (Ethiopia) or Jonglei (South Sudan), may be significant. Admin1 classifications mask sub-national variation that field assessments would capture.

Model Transferability

The logistic model is fit on a small Horn-of-Africa and Arabian-Peninsula set and initialised against 87 IPC transition records across 31 countries; it is deployed across 43 high-risk countries. Performance in geographically or structurally distinct contexts (South Asia, Central America) is established prospectively and should not be assumed.

Conflict Dynamics

Conflict data is sourced from UCDP GED (Uppsala Conflict Data Program, Uppsala University), which captures georeferenced conflict events with a reporting lag of approximately 3 days for candidate data. In active conflict zones, the most acute areas may be the least reported, so CERES may systematically under-estimate risk in media-dark conflict environments.

Transparency Commitment

This limitations section is intentionally complete. CERES is an open system. Reviewers, funders, and operational partners are encouraged to scrutinise these constraints and communicate additional concerns to the Northflow research team.

§ 8 · Citation

How to Cite CERES

If you reference CERES predictions or methodology in published work, please use the following citation format. The paper is published on arXiv and freely available under CC BY 4.0.

Preferred Citation

Pedersen, Tom Danny S. (2026). CERES: A Probabilistic Early Warning System for Acute Food Insecurity. arXiv:2603.09425 [stat.AP]. https://arxiv.org/abs/2603.09425
