ADIA Lab x Crunch: The Structural Break Challenge Returns

Most predictive systems rest on a quiet assumption: that the world they model tomorrow will behave like the world they modelled yesterday. Data is just the trace of that world. When the data-generating process actually changes, the model's assumption breaks, and every forecast breaks with it. That moment of change is a structural break. Detecting such breaks in time-series data in real time is a critical task across scientific and engineering domains.

This Structural Break Challenge turns that problem into a five-month scientific race on CrunchDAO. Given a time series, did the underlying behaviour shift at some point, or is the variation just noise? The answer matters wherever regime shifts carry consequences, from failing equipment to shifting market regimes to climate signals breaking from historical patterns.

The Challenge has already run through two phases. It launched in May 2025 as a time-limited competition, where participants received each time series in full and decided whether a break had occurred at a known boundary. Late that year, the problem was extended for an extra quarter into a continuous public benchmark, with top models from the original competition locked in as reference baselines and an originality filter to prevent solution cloning.

The 2026 edition keeps the same scientific question and changes the mechanics fundamentally. Models are scored on AUC. 50% is random guessing. 100% is perfect detection. Everything in between is signal.

What changes in 2026

In the previous editions, participants received the entire time series upfront and decided, after analyzing everything, whether a break had occurred at a known boundary. In the 2026 edition, participants receive a historical reference segment at the start, but the part of the series where a break may occur arrives one observation at a time.

After each new observation, participants output a score between 0 and 1 representing their cumulative confidence that a break has already occurred. Zero means no break detected. One means a break has definitely happened. There is no way to look ahead. The break, if it occurs, can happen anywhere in the online segment. Half the series contain a break. Half do not.
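
The protocol above can be sketched as a small scoring loop. This is only an illustration under stated assumptions: the names `run_online` and `make_range_detector` are hypothetical and are not part of the competition's API, and the toy detector (which raises its score each time an observation falls outside the historical range) is a placeholder, not a recommended method.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a detector is built from the historical segment
# and returns a function that consumes one observation and returns a score.
StepFn = Callable[[float], float]


def run_online(
    history: Sequence[float],
    online: Sequence[float],
    make_detector: Callable[[Sequence[float]], StepFn],
) -> List[float]:
    """Feed the online segment one observation at a time, one score per step."""
    step = make_detector(history)
    scores = []
    for x in online:  # strictly in order, so the detector cannot look ahead
        raw = step(x)
        scores.append(min(1.0, max(0.0, raw)))  # keep the output in [0, 1]
    return scores


def make_range_detector(history: Sequence[float]) -> StepFn:
    """Toy detector: each out-of-range observation halves the remaining doubt."""
    low, high = min(history), max(history)
    outside = 0

    def step(x: float) -> float:
        nonlocal outside
        if x < low or x > high:
            outside += 1
        # Never decreases, so it behaves like a cumulative confidence.
        return 1.0 - 0.5 ** outside

    return step
```

For example, `run_online([-1.0, 0.0, 1.0], [0.0, 0.5, 2.0, 3.0, 0.0], make_range_detector)` returns `[0.0, 0.0, 0.5, 0.75, 0.75]`. Keeping the score non-decreasing is a design choice in this sketch that matches the "break has already occurred" reading; the text does not state that it is required.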

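|                | 2025 Edition                | 2026 Edition                              |
|----------------|-----------------------------|-------------------------------------------|
| Data delivery  | Both segments given at once | Online segment arrives one step at a time |
| Break location | Always at a known boundary  | Unknown, anywhere in the online segment   |
| Output         | One score per series        | One score per time step                   |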

Why the format matters

Structural breaks in the real world do not announce themselves at a known boundary. They occur in the middle of ongoing data, and the cost of missing them is paid on every subsequent observation that the model treats as normal. A climate model that does not detect a regime shift keeps forecasting with outdated assumptions. A sensor system that does not flag a change keeps reporting health on failing equipment. A risk model that does not register a market regime shift keeps sizing positions against a distribution that no longer exists.

The 2026 edition reframes the problem from retrospective classification to sequential detection. Participants have to commit a score at every step with only the data they have seen so far.

The setup

Each time series has two parts:

  • A historical segment of 1,000 to 5,000 observations, given in full at the start. This represents the behaviour of the series before any potential break.
  • An online segment of 10 to 1,000 observations, revealed one at a time. A break may or may not occur here.

The training set provides full supervision: the exact break position for each series is included, so participants can train end-to-end on labeled (series, time step) pairs.
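
As a rough illustration of how a known break position could be turned into per-step training targets, the sketch below labels every online step at or after the break as 1. The function name `step_labels`, the 0-based indexing, and the use of `None` for "no break" are assumptions made for this example; the actual data format may encode the break position differently.

```python
from typing import List, Optional


def step_labels(online_length: int, break_index: Optional[int]) -> List[int]:
    """Return one 0/1 label per online step.

    Assumption: break_index is the 0-based position of the first post-break
    observation in the online segment, or None if the series has no break.
    """
    if break_index is None:
        return [0] * online_length
    if not 0 <= break_index < online_length:
        raise ValueError("break_index must fall inside the online segment")
    # Steps before the break are 0; the break step and all later steps are 1.
    return [0] * break_index + [1] * (online_length - break_index)
```

With these conventions, `step_labels(5, 2)` gives `[0, 0, 1, 1, 1]` and `step_labels(3, None)` gives `[0, 0, 0]`.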

Dataset

The competition uses univariate time series exhibiting many different kinds of structural breaks: changes in mean, variance, distribution shape, correlation structure, and more. All series are pre-processed into a common z-scored format. The dataset contains more than 10,000 series in each of the three splits: public training, public test, and private test.

The problem applies across domains where regime change matters: climatology, industrial monitoring, healthcare, and finance.

Evaluation

Scoring uses Time-Stratified AUC (TS-AUC). At each step, models are graded on how well they separate series that have already broken from those that haven’t, then those grades are averaged into a single score. 0.5 is a coin flip. 1.0 is perfect.
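
One possible reading of this description is sketched below: compute an ordinary AUC at each time step over the series that are still running, then take the plain mean of those per-step values. The official metric may differ in details that the text does not specify, such as how steps are weighted or how steps containing only one class are treated; here such steps are simply skipped, which is an assumption. The function names are hypothetical.

```python
from typing import List, Optional, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """Pairwise AUC: chance that a broken series outscores an unbroken one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = 0.0
    for p in pos:
        for n in neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0  # ties count half
    return wins / (len(pos) * len(neg))


def time_stratified_auc(
    all_labels: Sequence[Sequence[int]],
    all_scores: Sequence[Sequence[float]],
) -> float:
    """Average the per-step AUC over all steps where it is defined."""
    for y, s in zip(all_labels, all_scores):
        if len(y) != len(s):
            raise ValueError("each series needs one label and one score per step")
    max_len = max(len(s) for s in all_scores)
    per_step: List[float] = []
    for t in range(max_len):
        # Only series whose online segment reaches step t take part.
        ys = [y[t] for y in all_labels if len(y) > t]
        ss = [s[t] for s in all_scores if len(s) > t]
        value = auc(ys, ss)
        if value is not None:
            per_step.append(value)
    if not per_step:
        raise ValueError("no step contains both broken and unbroken series")
    return sum(per_step) / len(per_step)
```

The pairwise loop is quadratic in the number of series per step and is written for clarity; a rank-based AUC would be the usual choice at the scale of the real dataset.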

Methodology

The online format opens the door to a range of approaches: statistical tests comparing the historical segment to accumulated online observations, change-point detection algorithms designed for streaming data, incremental feature extraction feeding a trained classifier, Bayesian models tracking the probability of a change sequentially, deep learning models trained on labeled (series, time step) pairs, and time series foundation models used as feature extractors or fine-tuned on the training data.
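
To make one of these families concrete, here is a minimal sketch of a two-sided CUSUM detector, a classical streaming change-point method for shifts in the mean. It is an example of the general idea rather than a suggested solution: it would not react to pure variance or shape changes, and the `slack` and `scale` parameters, as well as the exponential mapping from the CUSUM statistic to a score in [0, 1), are arbitrary choices made for this illustration.

```python
import math
from typing import List, Sequence


def cusum_scores(
    history: Sequence[float],
    online: Sequence[float],
    slack: float = 0.5,   # hypothetical: drift allowed before evidence accumulates
    scale: float = 5.0,   # hypothetical: how fast the statistic saturates the score
) -> List[float]:
    """Two-sided CUSUM on observations standardized by the historical segment."""
    n = len(history)
    mu = sum(history) / n
    var = sum((x - mu) ** 2 for x in history) / (n - 1)
    sigma = math.sqrt(var) or 1.0  # fall back to 1.0 for a constant history

    up = down = peak = 0.0
    scores = []
    for x in online:
        z = (x - mu) / sigma
        up = max(0.0, up + z - slack)      # evidence for an upward mean shift
        down = max(0.0, down - z - slack)  # evidence for a downward mean shift
        peak = max(peak, up, down)         # running maximum: score never drops
        scores.append(1.0 - math.exp(-peak / scale))
    return scores
```

Each step costs constant time and memory, because only three running values are carried forward from one observation to the next.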

One practical note: with 10,000+ series and up to 1,000 steps each, solutions that recompute from scratch at every step may run into time budget constraints. Incremental approaches have an efficiency advantage.
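
As a rough sense of scale, 10,000 series at the maximum of 1,000 steps each would mean on the order of 10 million scoring calls per split. The sketch below shows one standard way to keep each call cheap: Welford's algorithm updates the mean and variance of the online observations in constant time per step instead of rescanning the whole segment. The class name `RunningStats` is hypothetical, and any time limits are not specified in the text.

```python
class RunningStats:
    """Welford's online algorithm for mean and sample variance."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        """Fold in one observation in O(1) time and memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance (n - 1 denominator); 0.0 until two points are seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0
```

A detector can call `update` once per incoming observation and compare `mean` and `variance` with the same statistics computed once on the historical segment.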

In partnership with ADIA Lab

The ADIA Lab Online Structural Break Challenge 2026 is the next edition of the research programme run jointly by ADIA Lab, the Abu Dhabi-based independent research institution focused on data science and AI, and Crunch Lab.