The Hallucination Fix

Last week's newsletter (Stop Microwaving Your AI) was about you and how you can become more skilled with AI, no matter your experience. Worth a read if you haven't seen it.

This week, we're shifting focus to your company. How do you level up your business using AI? The AI your team uses on their desktops and phones isn't the AI your company will run on. Your corporate AI needs to know your data and follow your rules. Retrieval-Augmented Generation (RAG) is the tech that makes that possible. It's often pitched as the fix for AI's hallucination problem, so we're going to take a closer look at what RAG is, how it works, where it breaks, and what the new large-context models mean for your architecture.

Large language models are prediction engines. They generate the most probable next word based on patterns in their training data. And probability isn't accuracy. In some benchmarks, even foundation models hallucinate on one in three responses, and global business losses from AI errors reached $67.4 billion in 2024. Search tools can pull in web results, yet web results don't tie the model's output to the information that actually runs your business.

RAG changes that. Instead of responding from memory or an internet search, a RAG system retrieves relevant documents from trusted business sources and gives them to the model before it responds. Research suggests well-implemented RAG dramatically cuts hallucinations.

Key Insight: RAG is a real solution for grounding AI in your business knowledge. Yet you can't point it at a data lake and hope. Success requires clean data, iterative review cycles, and humans in the loop.
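For the technically curious, here is a minimal sketch of the retrieve-then-answer step described above. Everything in it is an illustrative assumption: the three sample documents, the function names, and the word-count similarity score, which stands in for the embedding model and vector store a real system would likely use. It stops at building the grounded prompt and does not call any model.

```python
import math
import re
from collections import Counter

# Hypothetical "trusted business sources" for illustration only.
DOCS = {
    "refund-policy": "Refunds are issued within 14 days of purchase with a receipt.",
    "shipping-policy": "Standard shipping takes 5 business days within the US.",
    "pto-policy": "Employees accrue 1.5 days of paid time off per month.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, docs, k=2):
    """Return the k (doc_id, text) pairs most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        docs.items(),
        key=lambda item: cosine(q, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, docs, k=2):
    """Put the retrieved sources in front of the question before the model sees it."""
    sources = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(question, docs, k))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )
```

Calling `build_prompt("How many days do refunds take?", DOCS, k=1)` produces a prompt that carries the refund policy text and the question together. The instruction to answer only from the sources is the grounding step; the quality of what gets retrieved is why clean data matters so much.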
Retrieval-Augmented Generation is a pipeline: user asks, RAG retrieves, model answers. Agentic RAG is a power-up. The agent decides the best way to answer the question, which source to retrieve from, and when to go back for more. It also decides when to use MCP (Model Context Protocol) or an API, run code, or even check its own output. Agentic RAG is another example of AI orchestration; it can handle multistep tasks that used to require human hand-offs between systems.

The Practical Angle: RAG puts your organization's knowledge inside AI. Agentic RAG takes it a step further. Pick one workflow where the integration between systems is the bottleneck, scope the agent's tools and permissions narrowly, and build trust before you expand.

Read the Breakdown →

Perplexity just shipped Computer for Taxes, and it's a working example of RAG in production. Rather than ask a frontier model to remember tax law (which changes constantly), they packaged IRS materials and current regulations as loadable modules. In testing, Perplexity reported that the system caught a 67% understatement of deductions in an attorney-prepared return. This pattern (frontier LLM + tools + domain-specific retrieval) is one to watch.

Strategic Takeaway: This is an excellent example of the RAG playbook. The same pattern works inside your business. Identify the regulated, fast-changing, or expertise-heavy work where generic AI keeps falling short, and build the module around your sources.

See Perplexity's Announcement →

Quick Hits

Foundations: RAG in a Nutshell
A well-written primer on what RAG is, where it works, and where it fails, pitched at a general audience.

What is retrieval-augmented generation?
You know what the words mean, but how do they apply in an AI workflow? If you want a strong baseline, start here.

Video: RAG Versus Copy + Paste
Why do I need RAG? Can't I just copy and paste the relevant PDFs into my chat? This video compares RAG and long-context models (where you can fit entire document sets in a single prompt). It offers three reasons each approach wins and a framework for when to use which.

Deep Dive: What Production RAG Actually Takes
An excellent checklist of what a real RAG system has to get right: chunking strategy, embedding model, vector store, evaluation framework, refresh pipeline, and re-ranker. The product pitch may not be relevant, but use it as a gut-check for your team's RAG plan.

→ Downloadable RAG Playbook
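To make the first item on that checklist concrete, here is a minimal sketch of one common chunking strategy: fixed-size word windows with overlap. The function name and the default sizes are illustrative assumptions and are not taken from the linked playbook. Production systems often split on sentences, headings, or tokens instead.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap exists so that a sentence straddling a boundary still appears whole in at least one chunk. Chunk size is a trade-off your team will likely need to tune: smaller chunks retrieve more precisely, while larger chunks carry more surrounding context.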
Industry Developments

Anthropic Releases Cloud-Hosted Agents
Managed Agents is a new set of APIs from Anthropic that handles agent infrastructure (sandboxing, sessions, permissions, tracing). Addressing permissions and identity management natively lets businesses ship production agents more quickly and effectively.

AI Designs Rust-Proof Steel
Researchers used AI to design a new corrosion-resistant, high-strength steel alloy in a fraction of the typical R&D time. The same underlying techniques that are compressing software workflows are also reshaping materials science.