It feels like one of the buzzwords this year has been AI agents. With the term being passed around so frequently, one has to ask if it's always used accurately. Where we used to use chatbots, do we now use an "agent"?
What's the difference between a true AI agent and a chatbot, though? It's far more than just marketing; it's about capabilities. Technically speaking, anything you can automate could be labelled an "agent", but that misses the point. A true agent represents a shift towards proactive problem-solving; it moves on from just answering questions to actually getting things done. Obviously, there's a lot of hype around this, but once that's stripped away, there's some genuinely exciting technology with much potential.
How are AI agents actually helpful?
There are many differences between a chatbot and an AI agent. To make them concrete, let's step into the kitchen. We can think of a chatbot as a recipe: it has an ingredients list and step-by-step instructions. The steps are followed one by one, and the end result is something (hopefully) edible.
The AI agent version of this would be more like a personal chef. It can decide what to cook for you based on what you like most and your nutritional requirements for that particular day. It tweaks the spice level to your personal taste, sources the ingredients, and, if an ingredient is missing, amends the recipe on the fly before cooking the food.
To summarise this concretely: chatbots follow scripts and steps; if they get stuck, they may provide unhelpful answers. AI agents, on the other hand, use large language models, which means they can better understand human language and engage in more complex interactions. Where chatbots can respond to questions, AI agents can solve problems. One of my favourite summaries is by Talkdesk: "Chatbots talk, AI agents think."
How does an AI agent work?
At their core, AI agents follow a simple but powerful pattern: observe, plan, and do. This cycle repeats continuously, allowing agents to adapt and improve their approach in real time.
Observe: The agent gathers information from multiple sources, such as emails, databases, and APIs, to gauge what needs to be done.
Plan: Using reasoning frameworks, the agent breaks down complex problems into manageable steps and decides on the best course of action.
Do: Agents execute their plan by taking actions, which could include updating databases or calling external APIs. After each action, they observe the results and adjust their approach if needed.
This workflow makes agents particularly powerful for complex business processes in areas like marketing research, customer service, and operational tasks requiring multiple steps and decision points.
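The observe, plan, and do cycle can be sketched in a few lines of Python. This is a minimal illustration rather than a real implementation: the TicketQueue class, the action names, and the keyword rule inside plan are all hypothetical stand-ins, and a real agent would call a language model where plan uses a simple rule.

```python
from dataclasses import dataclass, field


@dataclass
class TicketQueue:
    """Hypothetical environment: a queue of support tickets the agent works through."""
    open_tickets: list[str]
    resolved: list[str] = field(default_factory=list)


def observe(queue: TicketQueue) -> list[str]:
    # Observe: gather the current state (here, just the open tickets).
    return list(queue.open_tickets)


def plan(observation: list[str]) -> list[tuple[str, str]]:
    # Plan: break the work into steps. A real agent would ask an LLM here;
    # this stand-in uses a keyword rule so the example stays runnable.
    steps = []
    for ticket in observation:
        action = "send_reset_link" if "password" in ticket.lower() else "escalate"
        steps.append((action, ticket))
    return steps


def do(queue: TicketQueue, step: tuple[str, str]) -> None:
    # Do: carry out one step, which changes the environment.
    action, ticket = step
    queue.open_tickets.remove(ticket)
    queue.resolved.append(f"{action}: {ticket}")


def run_agent(queue: TicketQueue, max_cycles: int = 10) -> int:
    """Repeat observe -> plan -> do until no work is left or the cycle budget runs out."""
    cycles = 0
    while cycles < max_cycles:
        observation = observe(queue)
        if not observation:
            break
        steps = plan(observation)
        do(queue, steps[0])  # act on one step, then re-observe before the next
        cycles += 1
    return cycles


queue = TicketQueue(open_tickets=["Forgot my password", "Invoice total looks wrong"])
cycles = run_agent(queue)
print(cycles, queue.resolved)
```

Acting on only the first planned step and then observing again is a deliberate choice in this sketch: it lets the agent adjust if the previous action changed the situation. The max_cycles cap is there so a confused agent cannot loop forever.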
The components of an AI agent
An AI agent's architecture combines several key components: an interface that requires no coding skills, goals that guide actions, and an agent core that coordinates everything. Memory retains context from past interactions while large language models handle reasoning and planning, converting user intent into actionable steps. Tools enable integration with external systems, and learning mechanisms improve performance over time. Autonomy allows proactive operation without constant human oversight, while master agents can orchestrate multiple specialised agents. These components work together to transform static AI models into dynamic systems that understand, plan, act, and evolve.
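As a rough picture of how some of these components fit together, here is a small Python sketch. Every name in it (Agent, keyword_planner, the two tools) is invented for illustration, and the planner is a keyword rule standing in for a large language model. It only covers goal, memory, tools, and the agent core; learning and multi-agent orchestration are left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    goal: str                                 # the goal that guides actions
    planner: Callable[[str, list[str]], str]  # stand-in for the LLM: picks a tool name
    tools: dict[str, Callable[[str], str]]    # integrations with external systems
    memory: list[str] = field(default_factory=list)  # context from past interactions

    def handle(self, request: str) -> str:
        # Agent core: coordinate planner, tools, and memory for one request.
        tool_name = self.planner(request, self.memory)
        result = self.tools[tool_name](request)
        self.memory.append(f"{request} -> {tool_name}")
        return result


def keyword_planner(request: str, memory: list[str]) -> str:
    # Hypothetical planner; a real agent would call a language model here.
    return "lookup_order" if "order" in request.lower() else "search_docs"


agent = Agent(
    goal="Resolve customer questions without human hand-off",
    planner=keyword_planner,
    tools={
        "lookup_order": lambda req: "Order status: shipped",
        "search_docs": lambda req: "See the help article on account settings",
    },
)

reply = agent.handle("Where is my order?")
print(reply)
print(agent.memory)
```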
The three types of AI agents
There are three main types: conversational, transactional, and vertical.
Conversational Agents interact directly with customers. By accessing your documentation and maintaining context across conversations, they solve problems instead of forwarding tickets to humans. This frees up your support team to focus on more complex issues rather than answering "How do I reset my password?" for the 800th time. These agents currently work well for standard queries but struggle with unusual requests and with showing empathy to frustrated customers. IBM's AskIAM is one example.
Transactional Agents are typically used for specific processes, like automatically processing invoices. They're less often customer-facing but handle complex workflows that require tribal knowledge and multiple tools. They can integrate with external systems and APIs, making them particularly powerful for automating end-to-end business processes.
Vertical Agents excel in specialised areas but are often of little use outside their domain. They're intensively trained on specific fields like finance or law, and can respond using industry-specific terminology. This removes technical barriers, so that, for example, marketing teams can analyse user data directly without needing engineering expertise. Salient is an example.
The models behind the AI agent
Foundational Models (GPT-4, Claude, LLaMA): These models can handle varied and unpredictable requests reasonably well and are decent generalists. It's a bit like hiring a smart employee who can solve problems—they figure things out as they go. Although they're reliable and flexible, they're not specialists at any one task.
Instruction-Tuned Models (such as those behind ChatGPT): These models are built on top of foundational models and are trained to follow directions more precisely. They're good for repeatable tasks where you need predictable outputs and consistency. It's like hiring an employee to follow procedures.
Domain-Specific Models (Med-PaLM, Code LLaMA): These models are trained intensively on specialised datasets so they can excel in specific domains. They're very accurate within their area of expertise, but can fall down outside of it. Consider these when accuracy is essential.
The reality of cost
Don't underestimate the cost required to power an AI agent. Every time someone interacts with the agent, computational resources are consumed, which translates into real expense. The bill grows with every conversation, every document lookup, and every decision the agent makes.
If you're running a customer service agent handling 800 conversations daily, this will generate significant compute costs. If the agent needs to access your entire knowledge base to answer what are, in reality, typical questions, costs will escalate further.
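To see how quickly this adds up, here is a back-of-the-envelope sketch. The prices and token counts are made-up placeholders, not any provider's real rates, so swap in figures from your own provider's pricing page. It assumes billing per input and output token, which is how most hosted models are priced.

```python
def daily_cost(
    conversations: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimated daily spend, given average token counts per conversation."""
    per_conversation = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    return conversations * per_conversation


# Placeholder prices: $1 per million input tokens, $4 per million output tokens.
# Lean setup: the agent only sends a short, relevant excerpt with each question.
lean = daily_cost(800, input_tokens=2_000, output_tokens=500,
                  input_price_per_million=1.00, output_price_per_million=4.00)

# Heavy setup: the agent stuffs a large slice of the knowledge base into each request.
heavy = daily_cost(800, input_tokens=50_000, output_tokens=500,
                   input_price_per_million=1.00, output_price_per_million=4.00)

print(f"lean:  ${lean:.2f} per day")
print(f"heavy: ${heavy:.2f} per day")
```

With these placeholder numbers, the lean setup comes to $3.20 a day and the heavy one to $41.60, which is thirteen times more for the same 800 conversations. The absolute figures are invented, but the pattern holds: how much context you send per request usually matters more than how many requests you handle.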
It's often suggested that you periodically start new conversations when interacting with an AI provider. This isn't because they dislike long chats; it's because maintaining conversation history is expensive. They're trying to help you avoid an unexpectedly large bill.
Start with cost monitoring and choose models appropriate for your actual needs. A smaller model such as GPT-4o mini can often handle basic tasks as effectively as more expensive alternatives.
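Why does history get expensive? In a typical chat setup, every new request resends the earlier messages so the model has context. The sketch below counts the input tokens that this adds up to. It is a simplification: it assumes every message is the same size and that nothing is cached, trimmed, or summarised, which real providers and frameworks may well do.

```python
def cumulative_input_tokens(turns: int, tokens_per_message: int) -> int:
    """Total input tokens sent over a conversation when every request
    resends all earlier messages plus the new user message."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_message  # the new user message joins the history
        total += history               # the whole history is sent as input
        history += tokens_per_message  # the assistant's reply joins the history
    return total


one_long_chat = cumulative_input_tokens(turns=40, tokens_per_message=200)
four_short_chats = 4 * cumulative_input_tokens(turns=10, tokens_per_message=200)

print(one_long_chat, four_short_chats)
```

Under these assumptions, one 40-turn conversation sends 320,000 input tokens, while four separate 10-turn conversations send 80,000 in total. The same number of messages costs four times as much in a single thread, because the total grows with the square of the number of turns.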
Managing risks of AI agents
There's an elephant in the room: hallucinations. AI agents can confidently provide incorrect information, and when your customer experience and company reputation are at stake, you need to set boundaries. Consider implementing confidence thresholds, fact-checking mechanisms, and clear escalation paths to humans for complex or sensitive queries.
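Here is one way those guardrails might look in code. Treat it as a sketch under loud assumptions: the confidence score is a hypothetical number between 0 and 1 (how you obtain a trustworthy one is a hard problem in its own right), the list of sensitive topics is invented, and checking for cited sources is only a crude stand-in for real fact-checking.

```python
from dataclasses import dataclass


@dataclass
class DraftAnswer:
    text: str
    confidence: float   # hypothetical score in [0, 1]
    sources: list[str]  # documents the answer is grounded in


# Assumed examples of topics a business might always route to a person.
SENSITIVE_TOPICS = ("refund", "legal", "cancel")


def route(question: str, draft: DraftAnswer, threshold: float = 0.8) -> str:
    """Decide whether to send the agent's draft or hand over to a human."""
    if any(topic in question.lower() for topic in SENSITIVE_TOPICS):
        return "escalate_to_human"  # sensitive query: always escalate
    if not draft.sources:
        return "escalate_to_human"  # nothing to check the answer against
    if draft.confidence < threshold:
        return "escalate_to_human"  # below the confidence threshold
    return "send_answer"


grounded = DraftAnswer("Use the 'Forgot password' link.", 0.93, ["help/reset-password"])
unsure = DraftAnswer("I think the limit is 5 users.", 0.55, ["help/plans"])

print(route("How do I reset my password?", grounded))
print(route("How many users can I add?", unsure))
print(route("I want a refund", grounded))
```

The order of the checks is a design choice: sensitive topics are escalated even when the agent is confident, since confidence is exactly what a hallucinating model tends to have plenty of.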
The future with AI agents
Multi-Agent Systems: Teams of specialised agents will work together instead of one generalist trying to handle everything. This is better than having a 'super' agent that does nothing well.
Industry Specialisation: Following the same logic, agents will be trained specifically for industries like healthcare or finance, as opposed to general-purpose tooling.
Improved Reasoning: Agents will make better decisions and ultimately handle more complex business logic and edge cases.
Implementation
The end goal isn't to create the most intelligent agent imaginable, but to find ways to solve problems for businesses efficiently. Start with a focused task that currently eats up your employees' time and that you can build on.
Before scaling, test thoroughly with everyday scenarios. You'll also want to use established providers rather than building something from scratch. Building from scratch can sink your resources and divert you from the problem you were initially trying to solve.
AI agents aren't magic, but they're a real step forward. They have limitations but can excel at certain tasks. Understanding both their capabilities and limits can help you solve real problems. The hype will stick around, but so will the genuinely useful solutions.