The problem was never the architecture, the RAG, or the technology. It was the lack of project context.
A company I work with had already made the buy call before I got involved. They had paid well for an AI tool that generated training content — a competent RAG system, the kind every vendor is shipping now. For a while it earned its price. The material it produced for broad, general courses was genuinely good, and buying rather than building had been the right instinct.
Then the work got specialized. We moved from generic learning into detailed pharmaceutical training, where being roughly right is the same as being wrong. The answers started coming back too generalist. The retrieval pulled something every time, fluently and confidently, but rarely the information that actually mattered for this material. The tool that had looked sharp on easy content looked thin the moment the stakes went up.
The reflex in that moment is to blame the machinery. Swap the vendor, tune the retrieval, reach for a bigger model. I have watched teams burn a quarter doing exactly that. But the architecture was fine. The RAG was fine. The model was fine. What the system was missing was not intelligence. It was context. It had the content documents and nothing else, and it did not know what this particular training was actually for.
That knowledge existed — it was just scattered. It lived in the contract intake, the kickoff, the stakeholder meetings, the SME material: which behaviours the training was meant to change, what the stakeholders genuinely expected, where the real people being trained were struggling. So we did not rip the tool out and buy a different one, and we did not try to out-engineer the retrieval. We built the missing layer — a graph database that links a project’s information end to end, so that when we generate the material the system is standing on everything that defines what good looks like here.
That reframed the whole problem. The problem was never the architecture, the RAG, or the technology. It was the lack of project context, and missing context produces poor deliverables no matter how strong the model behind them is. Before, the tool had documents and made more documents; they read fine until someone who knew the domain looked closely. After, it was grounded in the project’s own intent, and the choices it made started matching what the work was really asking for.
The economics changed shape too, almost as a side effect. The tool we had bought ran about $40,000 a year for a single seat. What replaced it is a data platform the company owns outright and the whole staff can draw on — the only standing cost now is the cluster it runs on. We had been renting one narrow window into our own knowledge; building the context layer turned it into something the business keeps.
If I were starting that engagement again, I would not open by evaluating AI tools at all. I would open by asking where the project’s context lives, and whether it lives anywhere a system could reach. Build-versus-buy is rarely model-against-model. You can buy the generation; you almost always have to build the context around it. Most of the time, “the AI isn’t good enough” turns out to mean “the AI cannot see what we already know.”