Agent-based setups are starting to edge out old-school RAG. As LLMs snag multi-million-token context windows and better task chops, the need for chunking, embeddings, and reranking starts to fade. Claude Code, for example, skips all that - with direct file access and smart navigation instead. Retrieval isn't dead, but it's morphing into something far more agentic.
Bigger picture: Bigger windows and sharper attention mean LLMs can now process whole documents and run tasks directly - no more stitching together fragments just to get work done.