
Bayesian Rips Active Learning (BRAL)

Topology-aware active learning for rare lineage discovery.


Overview

Bayesian Rips Active Learning (BRAL) is a framework that integrates topological data analysis with active learning strategies to discover rare lineages in high-dimensional biological datasets.

Motivation

Traditional active learning methods rely on geometric or density-based heuristics, which struggle when the data lie on a manifold. BRAL instead uses the persistent homology of a Vietoris-Rips complex to identify topologically significant regions where rare lineages are likely to reside.

Method

Rips Complex Construction

We construct a Vietoris-Rips complex from the current labeled set and compute its persistent homology to identify topological features (connected components, loops, and voids, i.e. homology in dimensions 0, 1, and 2) that persist across multiple scales.
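
As a rough illustration of the simplest case, the sketch below computes only the dimension-0 persistence pairs (connected components) of a Vietoris-Rips filtration, using a union-find pass over the sorted pairwise distances. It is not BRAL's implementation: the function name `h0_persistence` and the toy points are assumptions made for this example, and an edge is assumed to enter the complex at a scale equal to its length. Loops and voids require a full persistent homology computation, which a dedicated library would normally handle.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return (birth, death) pairs for the connected components (H0) of the
    Vietoris-Rips filtration on `points`. Hypothetical helper, H0 only."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Assumed convention: an edge enters the complex at a scale equal to its length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    pairs = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            pairs.append((0.0, dist))  # two components merge, so one dies here
    if n:
        pairs.append((0.0, math.inf))  # the last surviving component never dies
    return pairs


# Toy data: two tight pairs of points, far apart from each other.
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
pairs = h0_persistence(points)
print(pairs)  # [(0.0, 1.0), (0.0, 1.0), (0.0, 5.0), (0.0, inf)]
```

The component that dies at scale 5.0 outlives the two that die at 1.0, which is the kind of long-lived feature that persistence is meant to surface: the two clusters stay separate over a wide range of scales.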

Bayesian Acquisition Function

The acquisition function combines:

Active Learning Loop

At each iteration:

  1. Construct the Vietoris-Rips complex on the current labeled set
  2. Extract persistent homology features
  3. Score unlabeled points using the topology-aware acquisition function
  4. Query the oracle for labels on the top-k candidates
  5. Update the model and repeat
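
The steps above can be sketched as a generic loop with pluggable pieces. Everything named here is an assumption made for illustration: `active_learning_loop`, `topo_fn`, `score_fn`, and `oracle` are hypothetical, and the stand-in scorer `farthest_first` (distance to the nearest labeled point) is a simple placeholder, not BRAL's topology-aware Bayesian acquisition function. Likewise `no_topology` merely marks where the persistent homology computation would plug in.

```python
import math


def active_learning_loop(pool, labeled, oracle, topo_fn, score_fn, k=1, rounds=2):
    """Generic pool-based loop. `labeled` maps pool index -> label and is
    updated in place. All callables are hypothetical plug-in points."""
    for _ in range(rounds):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        labeled_points = [pool[i] for i in labeled]
        # Steps 1-2: topological summary of the labeled data.
        features = topo_fn(labeled_points)
        # Step 3: score every unlabeled point.
        scores = {i: score_fn(pool[i], labeled_points, features) for i in unlabeled}
        # Step 4: query the oracle for the top-k candidates.
        for i in sorted(unlabeled, key=scores.get, reverse=True)[:k]:
            labeled[i] = oracle(i)
        # Step 5: a model refit would go here; this sketch carries no model.
    return labeled


def no_topology(labeled_points):
    # Placeholder: a real version would return persistent homology features.
    return None


def farthest_first(x, labeled_points, features):
    # Placeholder score (NOT BRAL's acquisition): distance to nearest labeled point.
    return min(math.dist(x, p) for p in labeled_points)


# Toy pool: a common cluster near the origin and a rare cluster near (5, 5).
pool = [(0, 0), (0.1, 0), (0.2, 0), (5, 5), (5.1, 5)]
truth = [0, 0, 0, 1, 1]
result = active_learning_loop(
    pool, {0: 0}, oracle=lambda i: truth[i],
    topo_fn=no_topology, score_fn=farthest_first,
)
print(result)
```

Keeping the topology and scoring steps as arguments means the loop itself does not change when the placeholder functions are replaced by a persistent homology computation and a topology-aware score.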

Results

BRAL demonstrates improved discovery rates for rare lineages compared to standard active learning baselines, particularly in settings where the rare class occupies a topologically distinct region of the data manifold.