Lichen Research — Ottawa, Canada

We find the numbers
that don't exist yet.

Before your AI system can improve, you need to know exactly where it fails — and why. We build the empirical benchmarks and ablation studies that answer those questions.

What we've measured

These numbers came out of building and red-teaming a deployed AI agent over 120 days. They aren't claims — they're benchmark results, reproducible and documented.

Finding 01

Memory quality beats model size

In our deployed agent, retrieval quality accounted for more variance in output quality than the underlying model's capability. The same model at the same task scored 45% with standard vector retrieval and 78% with neuroplastic recall — without changing a single model weight.

+33pp

accuracy gain from retrieval alone, same model

Finding 02

Hebbian pathways resist retrieval bias

Standard memory systems retrieve by similarity. Under adversarial conditions they retrieve confidently wrong answers. Memories linked by Hebbian co-activation pathways develop lateral inhibition — semantically similar but contextually wrong memories suppress each other at recall time.
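The idea can be sketched as a toy model. Everything below is illustrative — the class name, constants, and scoring rule are our assumptions for this sketch, not the deployed system's code:

```python
from collections import defaultdict

class HebbianStore:
    """Toy model: co-activation strengthens pathways; recall applies
    soft winner-take-all inhibition among retrieved candidates."""

    def __init__(self, learn_rate=0.1, inhibition=0.5):
        self.assoc = defaultdict(float)  # (mem_a, mem_b) -> pathway weight
        self.learn_rate = learn_rate
        self.inhibition = inhibition

    def reinforce(self, recalled_together):
        # Hebbian rule: memories that co-occur in a useful recall
        # strengthen their pairwise connections.
        for a in recalled_together:
            for b in recalled_together:
                if a != b:
                    self.assoc[(a, b)] += self.learn_rate

    def recall(self, context, candidates):
        # candidates: {memory_id: similarity score} from a first-pass
        # retriever. Activation = similarity + pathway support from context.
        act = {m: sim + sum(self.assoc[(c, m)] for c in context)
               for m, sim in candidates.items()}
        # Soft lateral inhibition: each candidate is suppressed in
        # proportion to its competitors' total activation, so a
        # high-similarity memory with no pathway support loses out.
        total = sum(act.values())
        inhibited = {m: a - self.inhibition * (total - a)
                     for m, a in act.items()}
        return max(inhibited, key=inhibited.get)
```

In this toy, a semantically closer but contextually wrong candidate can lose at recall time to a memory whose pathways were strengthened by past useful recalls — similarity alone no longer decides the winner.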

100%

adversarial recall accuracy on LoCoMo (47/47 questions)

Finding 03

Long-conversation memory is unsolved

Most AI memory benchmarks test single-session recall. The LoCoMo benchmark tests recall across 10 conversations and 1,986 questions spanning multiple categories. The field's best public systems plateau around 86–92%. The hardest category — multi-hop reasoning across memory — still sits below 80% for every public system.

77.6%

our baseline on LoCoMo full ring (1,986 questions, local 27B model)

Finding 04

Retrieval has a ceiling the model has to cross

When we hold our retrieval pipeline identical and swap the language model, per-category accuracy shifts in a predictable pattern. Frontier-tier models close most of the remaining gap on categories a local model misses. Retrieval decides what the model gets to see — the model decides what it does with it. The decomposition is ongoing; full results will ship with our NeurIPS 2026 submission.
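The pattern behind such a decomposition (fix one component, swap the other, compare per-category scores) is simple to sketch. The harness below is hypothetical scaffolding, not our evaluation code:

```python
def per_category_accuracy(answer_fn, dataset):
    """Score one configuration of the pipeline per question category.

    dataset:   list of (question, category, gold_answer) tuples.
    answer_fn: the full pipeline under test, question -> answer.
               Swap the model inside answer_fn while holding retrieval
               fixed to isolate the model's contribution per category.
    """
    totals, correct = {}, {}
    for question, category, gold in dataset:
        totals[category] = totals.get(category, 0) + 1
        if answer_fn(question) == gold:
            correct[category] = correct.get(category, 0) + 1
    return {c: correct.get(c, 0) / n for c, n in totals.items()}
```

Running this with two `answer_fn` variants that differ only in the language model yields the per-category deltas the finding describes.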

Published work

Our first paper documents the neuroplastic memory system behind the findings above. A second paper extending the decomposition in Finding 04 is in preparation for NeurIPS 2026.

CCN 2026 — Extended Abstracts — New York City, August 2026

Neuroplastic Memory Resists Retrieval Bias: Hebbian Pathways in a Deployed AI Agent

Kai Avery · Lichen Research

We describe an AI agent memory system grounded in Hebbian learning and spreading activation. Memories that co-occur in useful recalls strengthen their connections. Memories unused over time decay. The result is a retrieval system that learns from its own history — without retraining or fine-tuning.
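As a rough illustration of the two mechanisms the abstract names — spreading activation along weighted pathways, and decay of unused connections — here is a minimal sketch. Function names, the graph shape, and all constants are our assumptions, not the paper's API:

```python
def spread_activation(graph, seeds, decay=0.5, hops=2):
    """graph: {node: {neighbor: edge_weight}}; seeds: initially active memories.
    Activation fans out along weighted edges, attenuating at each hop."""
    activation = {s: 1.0 for s in seeds}
    frontier = dict(activation)
    for _ in range(hops):
        next_frontier = {}
        for node, act in frontier.items():
            for nbr, weight in graph.get(node, {}).items():
                spread = act * weight * decay
                if spread > activation.get(nbr, 0.0):
                    activation[nbr] = spread
                    next_frontier[nbr] = spread
        frontier = next_frontier
    return activation

def decay_edges(graph, half_life_days, elapsed_days):
    """Connections unused between recalls weaken exponentially."""
    factor = 0.5 ** (elapsed_days / half_life_days)
    return {n: {m: w * factor for m, w in nbrs.items()}
            for n, nbrs in graph.items()}
```

Reinforcement (strengthening edges on useful co-recall) and decay pull in opposite directions, which is how the system can learn from its own history without any retraining.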

Preprint available on request to kai@lichenresearch.ai.

How we think about this

Most AI memory work is engineering work: how to store, index, and retrieve faster. We approach it as a measurement problem first.

What we build and offer

moss

Open-source library: sanitized Hebbian memory, spreading activation, RRF retrieval, TReMu temporal disambiguation. Apache 2.0.

github.com/Lichen-Research-Inc/moss →
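Of the components listed, reciprocal rank fusion (RRF) has a standard published form: each retriever contributes 1/(k + rank) per document, and the sums are merged. A generic sketch of that formula, not moss's actual implementation:

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists of doc ids (best first) via
    score(d) = sum over lists of 1 / (k + rank_in_list(d)).
    k=60 is the conventional default from the original RRF paper."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF uses only ranks, it can merge a vector-similarity list with a keyword or pathway-based list without normalizing their incompatible raw scores.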

Hypha

The collaborative memory platform. An AI partner that grows hyphal pathways through your conversations — structurally coupled, long-horizon, symbiotic.

Private Preview

Memory System Evaluation

Benchmark your AI agent's memory across long-context, adversarial, and multi-hop categories. You'll know exactly where it fails and why — not just a score.

Ablation Study Design

Controlled experiments to isolate which component of your system is responsible for a given behaviour. Built for teams preparing NeurIPS or ICML submissions.

Make an inquiry

kai@lichenresearch.ai

Thirty minutes. No pitch — an honest assessment of fit.

Engagement scope is determined in the consultation. We price by complexity and outcome, not templates.

Lichen Research is led by Kai Avery — two decades of pattern-recognition work in high-consequence operational environments, now turned toward AI memory research. Details on request.

Receive findings and paper updates. No pitch, no cadence — only when there's something real to share.