Your AI logs show who used it. They don't show what it remembered.
We spent a weekend running Ogham through someone else’s benchmark. Here’s what happened.
Last week we reported that Ogham hits 99.5% Recall@10 on LongMemEval – the right memory chunk lands in the top 10 results for nearly every question. Good number. We were pleased with ourselves.
Then we ran the same 500 questions through the AMB benchmark harness, built by the Vectorize team (the people behind Hindsight), where a strict LLM judge scores the final answer – not just whether we found the right chunk.
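For readers who want the retrieval metric pinned down, here is a minimal sketch of Recall@k as described above: a question counts as a hit if any correct chunk appears in the top k results. The function names, chunk IDs, and toy data are all hypothetical, not taken from Ogham, LongMemEval, or AMB, and the benchmark's own definition may differ in detail.

```python
from typing import Sequence

# Hypothetical helper: True if any gold chunk appears in the top-k retrieved IDs.
def hit_at_k(ranked_ids: Sequence[str], gold_ids: set[str], k: int = 10) -> bool:
    return any(chunk_id in gold_ids for chunk_id in ranked_ids[:k])

# Fraction of questions with at least one gold chunk in the top k.
def recall_at_k(runs: Sequence[tuple[Sequence[str], set[str]]], k: int = 10) -> float:
    hits = sum(hit_at_k(ranked, gold, k) for ranked, gold in runs)
    return hits / len(runs)

# Toy data (made up): each entry is (retrieved chunk IDs in rank order, gold chunk IDs).
runs = [
    (["c7", "c2", "c9"], {"c2"}),  # gold chunk at rank 2
    (["c1", "c4", "c5"], {"c8"}),  # gold chunk never retrieved
    (["c3", "c6", "c0"], {"c0"}),  # gold chunk at rank 3
    (["c5", "c8", "c1"], {"c5"}),  # gold chunk at rank 1
]

score_k3 = recall_at_k(runs, k=3)
score_k1 = recall_at_k(runs, k=1)
print(f"Recall@3 = {score_k3:.2f}, Recall@1 = {score_k1:.2f}")
```

Note what this metric never looks at: the answer the model finally gives. A question can be a hit here and still be marked wrong by a judge that scores the answer itself.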