Original publication: Embarking on a Personal Journey into Agentic AI (LinkedIn)
Embarking on a Personal Journey into Agentic AI
Timothy Matthis
AI-Focused Executive | Exited Martech Founder | Former Consulting Managing Partner | Executive MBA
November 11, 2024
What Building an Agent Framework in LangChain Has Taught Me
For the past few months, I've been on a personal journey, deep in the world of Agentic AI, exploring frameworks, testing concepts, using code GPTs, and just generally becoming an AI nerd. I've been captivated by the idea of developing genuinely interactive, multi-functional agents. I'm not just talking about bots—I'm envisioning a virtual team capable of driving real industry impact, where a single person could leverage AI to achieve unicorn-level disruption.
I started with a simple use case: I wanted to prove that a few agents could repeatedly outperform an experienced human in a specific domain. But as I explored, the scope expanded—revealing an entire landscape of possibilities that a robust, adaptable AI agent framework could bring to life. I've certainly gone down a few rabbit holes, but the potential is exhilarating, and still tantalizingly just out of reach.
In these upcoming posts, I'll break down my learnings: the types of AI agents, examples I've built, and insights I've gained in creating what I call an "AI-Everything Orchestration Model."
So, What Exactly is an AI Agent?
Here's a six-level breakdown to help simplify it (note: these levels are my own, just an attempt to make the concepts easier to grasp):
Level 0 - Context-Enhanced LLM
In the journey to build AI agents, Level 0, the Context-Enhanced LLM, is where most of us start. It's the simplest form: an LLM like GPT with added, task-specific prompts to make it useful within a particular context. Here's the trick—by setting customized context through prompts or frameworks (like COSTAR), we create a model that's good at a single, focused purpose.
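To make this concrete, here is a minimal sketch of what a Level 0 agent can look like. The persona, model name, and LangChain-style calls are my own illustrative assumptions, and exact APIs vary by library version; the point is simply that a fixed system prompt scopes a general model to one job.

```python
# Minimal sketch of a Level 0 "context-enhanced LLM": a fixed system prompt
# narrows a general-purpose model to a single, focused purpose.
# Assumes the LangChain OpenAI bindings; persona and model name are illustrative.
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.3)

SYSTEM_CONTEXT = (
    "You are a returns assistant for an online shoe retailer. "
    "Answer only questions about orders, returns, and refunds; "
    "politely decline anything else."
)

def level0_agent(user_message: str) -> str:
    # Every call re-sends the same context: the model itself keeps no state.
    response = llm.invoke([
        SystemMessage(content=SYSTEM_CONTEXT),
        HumanMessage(content=user_message),
    ])
    return response.content

print(level0_agent("How long do I have to return a pair of boots?"))
```

Notice that nothing here remembers the previous call, which is exactly where the frustration described next comes from.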
In my own exploration, this level reveals the first major frustration, one that continues through more complex builds: memory degradation. As interactions get longer, these models seem to forget prior responses, becoming repetitive or losing the "human" feel. There is something called the context window, but it has major limitations and functions nothing like human memory. The net effect is that the user experience declines as the conversation or interaction with an agent progresses.
I foresee AI reaching new heights when models can handle intentionality and continuity more like the human mind does. LLMs are stateless, which basically means they do not learn from one conversation to the next. The mind is ever-evolving, and for the promise of AGI to be delivered, this aspect of memory is going to be a competitive battleground in the near future.
But for now, the Level 0 agent is a foundational, engaging experience that serves as the AI starter kit—albeit one with a clear ceiling on complexity.
Level 1 - LLM + Basic Retrieval-Augmented Generation (RAG)
When you're ready to take your agents a step further, Level 1 introduces Retrieval-Augmented Generation (RAG). Here, we see a meaningful step up from Level 0: the model's static training knowledge meets dynamic, retrievable knowledge. Now the LLM doesn't just rely on what it was originally trained on—it can access up-to-date external information, enriching its responses.
In Level 1, agents tap into external knowledge sources (typically indexed as vector "embeddings") that can be updated or customized, making them ideal for industries that need real-time, factual responses. For instance, customer service agents at this level pull answers from current databases or FAQs rather than static prompts. And it's not just more information but more accuracy—the model can be set to cite or prioritize certain data sources, adding a layer of trust.
If you've used a chatbot with surprisingly relevant, real-time answers, you've seen Level 1 in action. While exciting, this level still has limits, especially when it comes to extended, complex conversations. But for basic informational queries, RAG provides a valuable upgrade that makes Level 1 agents far more adaptable than those at Level 0.
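A rough sketch of the Level 1 pattern, assuming LangChain's FAISS and OpenAI integrations (the FAQ content and model name are made up for illustration): embed a small knowledge base, retrieve the closest entries for a question, and hand them to the LLM as context.

```python
# Hedged sketch of Level 1 RAG: embed a tiny FAQ, retrieve the most relevant
# entries for a question, and pass them to the LLM as grounding context.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

faq = [
    "Returns are accepted within 30 days of delivery with the original receipt.",
    "Standard shipping takes 3-5 business days inside the EU.",
    "Refunds are issued to the original payment method within 7 days.",
]
vectorstore = FAISS.from_texts(faq, OpenAIEmbeddings())

def answer_with_rag(question: str) -> str:
    # Retrieve the closest FAQ entries instead of relying on training data alone.
    docs = vectorstore.similarity_search(question, k=2)
    context = "\n".join(d.page_content for d in docs)
    prompt = (
        f"Answer using only this context:\n{context}\n\n"
        f"Question: {question}"
    )
    return ChatOpenAI(model="gpt-4o-mini").invoke(prompt).content

print(answer_with_rag("How long do refunds take?"))
```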
Level 2 - Advanced RAG with Richer Context
When basic RAG doesn't quite cut it, Level 2: Advanced RAG with Richer Context comes into play. Here, we move from simple memory to richer, more nuanced interactions. This level builds on RAG by retaining more in-depth context, especially for prolonged, specialized conversations.
Take, for instance, a customer service agent that can recall a user's prior interactions, preferences, and even previous inquiries. Advanced RAG helps the model "remember" key details, creating a continuity that's ideal for customer experience but demands complex data management. Standard "off-the-shelf" embeddings won't work here—you need a custom database structure that stores historical data in a way the AI can easily retrieve.
Microsoft's Turing-NLG is a notable example, layering context to maintain relevance in long conversations. As a result, Level 2 agents move closer to human-like memory, providing a more seamless experience that feels less like a series of discrete exchanges and more like an actual conversation. Database architects have a big role here—designing efficient, scalable solutions for context management is what enables these richer, longer-lasting interactions.
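One way to sketch the idea, using my own simplified schema rather than a production design: store each conversation turn with per-user metadata in a vector store, then pull back only the most relevant prior turns for the current query. It assumes LangChain's FAISS integration; the records and filtering are illustrative only.

```python
# Illustrative sketch of Level 2 "advanced RAG": prior turns are stored with
# per-user metadata so the agent can recall earlier interactions on demand.
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document

embeddings = OpenAIEmbeddings()
history_store = None  # built lazily on the first stored turn

def remember_turn(user_id: str, text: str) -> None:
    global history_store
    doc = Document(page_content=text, metadata={"user_id": user_id})
    if history_store is None:
        history_store = FAISS.from_documents([doc], embeddings)
    else:
        history_store.add_documents([doc])

def recall(user_id: str, query: str, k: int = 3) -> list[str]:
    # Over-fetch, then filter by user in Python; a real system would push the
    # metadata filter down into the database so it scales.
    hits = history_store.similarity_search(query, k=k * 5)
    return [d.page_content for d in hits if d.metadata["user_id"] == user_id][:k]

remember_turn("alice", "Customer prefers email over phone for follow-ups.")
remember_turn("alice", "Previous ticket: damaged parcel, refunded in full.")
print(recall("alice", "How should we contact this customer?"))
```

The design work the paragraph above hints at lives in that retrieval layer: deciding what gets stored, how it's indexed, and how much of it is allowed back into the prompt.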
Level 3 - RAG + Context + Functional Tool
The shift from Level 2 to Level 3 is profound, adding a layer of functional capability to the mix. Now we're equipping our RAG and context-aware agent with a specific tool that lets it perform actions, moving beyond language-only interactions to actual functionality. Think of it as moving from "think" to "do."
For example, imagine an agent that can not only provide data but also calculate complex figures or retrieve specific, real-time information like stock prices. This is LangChain's "agent with tools"—essentially a digital assistant with one area of expertise, powered by specialized libraries or APIs. The integration is complex, though; the API landscape is notoriously fragmented, with varying standards and protocols.
As a result, functional agents require sophisticated integration and often custom-built solutions to work smoothly across APIs. This level hints at a future where agents could routinely perform complex, multi-tool tasks in real-time. The promise of "agentic hardware" like Rabbit R1 will only be fully realized when agents can seamlessly navigate and integrate multiple tools.
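Here is a hedged sketch of that pattern, roughly in the shape of LangChain's tool-calling agents. Imports and signatures shift between versions (this assumes the ~0.2-era APIs), and the stock-price tool is a stand-in for a real market-data integration.

```python
# Rough sketch of a Level 3 agent: RAG-style context plus one functional tool.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor

@tool
def get_stock_price(ticker: str) -> str:
    """Return the latest price for a stock ticker."""
    # Placeholder: a real agent would call a market-data API here.
    return f"{ticker}: 123.45 USD (placeholder value)"

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a financial assistant. Use tools when you need live data."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),  # where tool calls and results accumulate
])

agent = create_tool_calling_agent(llm, [get_stock_price], prompt)
executor = AgentExecutor(agent=agent, tools=[get_stock_price])
print(executor.invoke({"input": "What is ACME trading at right now?"})["output"])
```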
Level 4 - Orchestration Frameworks
If Level 3 introduced tools, Level 4, the Orchestration Framework, brings in coordination. This is where the magic of Agentic AI really starts—agents here don't just use tools; they manage multiple tools and tasks dynamically. Picture an agent acting as a manager, orchestrating a whole team of specialized employees to complete a multi-step workflow or fulfill a complex goal.
LangChain and CrewAI are great examples of orchestration frameworks that handle these complex workflows. For instance, a travel agent at this level could book flights, compare hotel options, and even reserve a car rental in one seamless interaction. The analogy of a project manager is fitting—this level of AI can handle not just individual queries but complex, goal-oriented tasks.
Imagine a marketplace of specialized agents, each capable of performing different parts of a larger task, available on a per-use basis. This level of agentic AI hints at a new era of productivity where AI is a genuine partner, managing multiple steps to deliver a finished outcome without constant user guidance. The future of computing may very well rely on this level of orchestration. It may sit under the hood, but I suspect that much of what enables the next generation of our experience with computers will be exactly this kind of orchestration.
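As a sketch of the idea, here is what the travel example might look like in a CrewAI-style Agent/Task/Crew setup. The roles, task text, and exact arguments are my own illustration and may differ across versions of the library.

```python
# Sketch of Level 4 orchestration: two specialized agents coordinated over a
# multi-step goal, loosely following CrewAI's Agent/Task/Crew pattern.
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Flight researcher",
    goal="Find the best-value flights for the requested dates",
    backstory="A meticulous analyst of airline schedules and fares.",
)
planner = Agent(
    role="Trip planner",
    goal="Turn flight options into a complete, bookable itinerary",
    backstory="An experienced travel agent who sweats the details.",
)

find_flights = Task(
    description="List three flight options from Sydney to Tokyo in early June.",
    expected_output="Three options with airline, times, and approximate price.",
    agent=researcher,
)
build_itinerary = Task(
    description="Pick one option and draft a day-by-day itinerary with a hotel suggestion.",
    expected_output="A short itinerary covering flights, hotel, and transfers.",
    agent=planner,
)

crew = Crew(agents=[researcher, planner], tasks=[find_flights, build_itinerary])
print(crew.kickoff())
```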
Level 5 - Fine-Tuning with Specialized Data
The next layer of complexity in Agentic AI is Level 5: Fine-Tuning with Specialized Data. At this level, the agent isn't just adaptable—it's purpose-built for a specific field, fine-tuned with proprietary data that gives it a competitive edge. Fine-tuning modifies the model's underlying weights to respond in ways that are highly relevant to its intended domain, creating a level of expertise unattainable with general-purpose models.
Consider OpenAI's fine-tuning service or BioGPT, which specializes in biomedical data. At Level 5, the AI can pivot into specialized domains like finance, healthcare, or logistics, thanks to access to unique, data-rich sub-models. And with the integration of sub-models, we can route specific types of queries to mini-models optimized for particular tasks—think advanced math, legal reasoning, or language translation. This allows agents to operate with a higher degree of precision and accuracy.
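For a sense of the mechanics, here is a hedged sketch of kicking off a fine-tuning job with the OpenAI Python SDK (v1.x). The file name, example record, and base model are placeholders; real training data needs many domain-specific examples in this JSONL chat format.

```python
# Hedged sketch of starting a Level 5 fine-tune via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()

# Each JSONL line holds one training conversation, e.g.:
# {"messages": [{"role": "system", "content": "You are a logistics pricing expert."},
#               {"role": "user", "content": "Quote a pallet from Hamburg to Lyon."},
#               {"role": "assistant", "content": "..."}]}
training_file = client.files.create(
    file=open("logistics_examples.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # assumed fine-tunable base; check current docs
)
print(job.id, job.status)
```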
The future impact? Companies that master fine-tuning will develop agents with an unmatched depth of knowledge in their fields, creating proprietary solutions that are difficult, if not impossible, to replicate. This level of agentic AI turns agents into specialists, giving businesses a true competitive advantage and a means to develop proprietary AI applications that become central to their industry.