Editorial
Insights
Technical essays by the researchers. Not tips, not trends — methodological reflections and regulatory readings.
Why agentic systems need provenance more than humans do
A human researcher with weak provenance discipline produces poor research. An agentic LLM system with weak provenance discipline produces confident hallucinations at scale.
Building eval harnesses that survive contact with production
A benchmark score is what a model achieved on a frozen dataset. An eval harness is what you trust to decide whether a new model can ship. The two are not the same thing.
RAG isn’t search — and treating it like search is why most implementations fail
Retrieval-augmented generation looks like search-with-an-extra-step. It is not. It is generation-with-a-retrieval-substrate. The architectural implications are different at every layer.
Why provenance is a delivery practice, not a deliverable
Provenance is not a section at the end of a report. It is a practice that shapes how research is conducted from day one.
The LLM-augmented research workflow we use internally
Most consultancies hide their internal AI use behind a curtain of human authorship. We think the curtain is doing more harm than good — to clients, to the discourse, and to our own work.
Reading NIS2 as an engineering specification
Most compliance guides treat NIS2 as a checklist. We read it as an engineering spec — and the design implications are different.
The cost of arm's-length offshoring in applied research, and the alternative
The failure mode of offshoring is not cost — it is the translation layer between the bench and the client.