
From Agents to Skills

Alex Miller·April 6, 2026·coding, agents, ai-generated

How We Stopped Overthinking AI Architecture at Quotient


The State of Agents in Early 2024: A Mess, Honestly

Back in early 2024, if you wanted an LLM to actually do something — not just chat, but call a function, interact with an API, update a record — you were in for a ride. OpenAI had only launched function calling in June 2023, and that was exclusively for GPT-3.5 and GPT-4 via their API. If you wanted tool use from an open-source model, your options were grim — I remember spending hours getting OpenHermes to spit out XML I could reliably parse as a tool call. It would work for a while, then fall apart the moment I loaded more than about ten tools into context. LangChain had been around since October 2022 and shipped its own abstractions for tool calling, but under the hood it was all the same duct tape — custom parsers for tool execution, bespoke RAG pipelines, prompt templates held together with good intentions. Tools were precious, precious context back then.

As a developer, these were early, head-scratching, hair-tearing-out, exciting, grueling days. Teams everywhere seemed to be wrestling with how to adapt LLMs to their specific use cases and product requirements, inventing new terminology and primitives to describe the problems they were trying to solve. The landscape moved fast and mercilessly — the 2023–2024 explosion produced over 40 frameworks in active production use, and the churn was relentless. A lot of that diversity came from lots of smart people all over the world solving discrete problems within their own use cases — but the core problem for developers was that building agentic systems was multi-dimensionally difficult:

  • Model performance was comparatively poor. What we consider baseline capabilities today was cutting-edge research back then. Models were less well-trained and far less reliable at structured output.

  • Optimization was a full-time job. Getting appreciable performance meant investing heavily in prompt writing and rewriting (we hadn't started talking about "context engineering" yet), or fine-tuning — which, frankly, most teams don't really do until they have significantly more time and resources than a startup typically affords.

  • Each optimization only covered a single task. All that effort would yield a system tuned for one workflow — what the industry eventually started calling "agents." You'd need multiple agents working in concert to build anything of appreciable substance for an end user beyond a generic chat interface.

I say this not to complain, but to convey that something as seemingly trivial as a single agent handling a handful of tasks was a non-trivially difficult project, one that blended engineering know-how, communication skills, and lots and lots of persistence.

As developers, we were left with a fundamental problem: in order to do anything interesting, we needed to dynamically switch up system prompts in response to user queries. But we didn't believe the models were good enough to figure out how or when to do that by themselves. And for a while, we were right to be skeptical.

The Multi-Agent Era at Quotient

When I joined Quotient in 2025, the industry was much closer to solving this problem, and there were a few commonplace solutions people tended to reach for:

  • Explicit agent calls — put the onus on the user to direct queries to the right agent. No routing, just vibes and user education.

  • Semantic routing — pre-embed messages or use keyword search to determine which agent should respond. Clever, but brittle.

  • Agent-delegated routing — agents decide when they need to hand off to another agent, calling a specific tool for delegation.

(Side note: there was also the whole “agents as DAGs” paradigm floating around — LangGraph launched in January 2024 and popularized graph-based agent orchestration with stateful, cyclical workflows. Let’s be real — that’s a workflow, not an agent. It’s not what people meant when they talked about agents.)
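To make the middle approach concrete, here is a minimal keyword-based router. Everything in it is illustrative — the agent names and keyword lists are made up for the sketch, not any framework's (or Quotient's) actual routing table:

```python
# Keyword-based routing: score each agent by how many of its keywords
# appear in the query, fall back to "help" on no hits. The agent names
# and keyword lists below are illustrative, not a real routing table.
AGENT_KEYWORDS = {
    "blog": ["blog", "post", "article"],
    "email": ["email", "newsletter", "campaign"],
    "design": ["image", "cover", "logo", "design"],
    "help": [],  # fallback agent, matches nothing directly
}

def route(query: str) -> str:
    """Return the agent whose keywords best match the query."""
    words = query.lower().split()
    scores = {
        agent: sum(word in words for word in keywords)
        for agent, keywords in AGENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "help"

print(route("Write me a blog post"))    # blog
print(route("Update my DNS settings"))  # no keyword hits -> help
```

The brittleness shows up immediately: ties between agents resolve arbitrarily, synonyms miss entirely, and a vaguely phrased ask drops straight through to the fallback.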

If you used Quotient back then, you'll be familiar with the last approach. As a team, we divided up and maintained a suite of agents designed to anthropomorphize an entire marketing team. Blog Agent. Email Agent. Design Agent. Audience Agent. Help Agent. The whole gang. The idea was that, as a user, you were supposed to understand that these were almost like distinct employees, with their own context, memories, and strengths, and use them accordingly. The goal was for users to direct their concerns to the appropriate agent; agents could then access and delegate tasks to each other through a single tool that was deeply integrated into our homespun framework. As developers, we could maintain lengthy, highly optimized context for each given task without worrying about impeding other workflows or bloating prompt tokens. Users would (hopefully) see the benefit in improved accuracy, increased capability, and reduced cost compared to a single, monolithic agent.

And it worked pretty well! Agents were great at targeting each other. The system could handle a straightforward user query that touched multiple parts of the platform. Under the hood, the routing code was a pain to maintain and sometimes agents would hallucinate their identities, but on the whole it was a solid system for the time.

"Do I Talk to the Blog Agent or the Design Agent?"

The problem was that users had no clue how to use the platform.

"I need a blog with a cool cover photo — do I talk to the Blog Agent or the Design Agent?"

"I'm trying to increase my email engagement — is that Audience or Email?"

"I just want to update my DNS settings, how do I do that?"

Users struggled with targeting the right agent for a seemingly simple ask. If you got the entry point wrong, or phrased your request too generically, you'd likely get routed to our Help Agent — which could help you navigate user-facing docs, but had basically zero context about the inner workings of the rest of the platform. The nitty-gritty details that would have actually made it effective were locked away in the other agents' system prompts.

Even if you got the entry point correct, you might see a solid chunk of the conversation devoted to agent hand-offs, as different agents passed control of the thread back and forth to carry out an end-to-end task. I built the damn routing system, and I would sometimes be dazzled by the paths agents would take. Impressed by the result? Yes, but dazzled nonetheless.

We'd built a technically impressive system that made perfect sense to engineers and almost no sense to the people who actually needed to use it. That's a problem.

The Skills Rewrite: Letting the Model Do What It's Good At

Earlier this year, I sat down to essentially rewrite our agentic offerings through the introduction of skills. I suspected that something like skills would be helpful for our users, having benefited from their introduction into Claude Code. Additionally, they had at this point become something of a mainstay of the industry (even if they had only existed as an open standard for less than two months).

The core idea: let agents dynamically pick and choose their system message context via a tool call in response to user queries. That crazy thing we didn't think LLMs could handle properly when we first started building systems of substance? Turns out, they can handle it just fine now.

Instead of having a Blog Agent, an Email Agent, an Audience Agent, a Design Agent, and so on — we'd have a single agent with a series of skills that:

  • More or less implemented the same workflows as their legacy agent counterparts

  • Could be activated via a tool call when the agent determined it needed a given capability

  • Injected relevant context into a dedicated section of the agent's system message upon activation

  • Activated the appropriate tools for whatever workflow the skill was designed to cover
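The four properties above can be sketched in a few dozen lines. This is a toy model of the mechanism, not Quotient's actual implementation — every name here (`Skill`, `Agent`, `activate_skill`) is an assumption for illustration:

```python
# A minimal sketch of skill activation: the model calls a tool to load a
# capability, which injects context into the system message and enables
# that skill's tools. All names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    context: str        # injected into the system message on activation
    tools: list[str]    # tool names this skill switches on

@dataclass
class Agent:
    base_prompt: str
    catalog: dict[str, Skill]
    active: list[Skill] = field(default_factory=list)

    def activate_skill(self, name: str) -> None:
        """The tool the model calls when it needs a capability."""
        skill = self.catalog[name]
        if skill not in self.active:
            self.active.append(skill)

    def system_prompt(self) -> str:
        """Base prompt plus a dedicated section per activated skill."""
        parts = [self.base_prompt]
        parts += [f"## Skill: {s.name}\n{s.context}" for s in self.active]
        return "\n\n".join(parts)

    def available_tools(self) -> list[str]:
        return ["activate_skill"] + [t for s in self.active for t in s.tools]

blog = Skill("blog", "You write on-brand blog posts.", ["create_post", "publish_post"])
agent = Agent("You are a marketing assistant.", {"blog": blog})
agent.activate_skill("blog")
```

The important design point is that the base prompt stays lean: until a skill is activated, its context costs zero tokens, which is exactly the property we used to get from separate agents.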

The rewrite itself wasn't even that involved. Once I landed on the correct way to record and manage skill activations, the translation of agents-into-skills-unified-under-a-single-agent was surprisingly mechanical. I wrote a skill for Claude Code describing the formula and let it chug away. The patterns were consistent enough that most of the migration was straightforward transformation rather than creative problem-solving.

What Skills Actually Look Like

A skill in Quotient is essentially a self-contained capability module. When activated, it provides:

  1. Context — domain-specific instructions, guidelines, and schema knowledge injected into the system prompt

  2. Tools — the specific function calls relevant to that capability (e.g., blog creation tools, email editing tools, audience segmentation tools)

  3. Workflow knowledge — how to orchestrate multi-step operations within that domain

The agent starts each conversation with access to a lightweight set of base capabilities and a catalog of available skills. When a user asks something that requires a specific capability, the agent activates the appropriate skill — no routing logic, no hand-off, no user decision required. The user says "write me a blog post with a cover image" and the agent just… activates the blog skill and the design skill, and gets to work.
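Concretely, the activation step can ride on ordinary function calling. Here is a sketch of what the activation tool's schema might look like in OpenAI-style function-calling format — the tool name and skill list are assumptions for illustration, not Quotient's actual definitions:

```python
# Illustrative function-calling schema for skill activation. Accepting an
# array lets the model pull in several skills at once (e.g. blog + design).
ACTIVATE_SKILL_TOOL = {
    "type": "function",
    "function": {
        "name": "activate_skill",
        "description": (
            "Inject a skill's context into the system message "
            "and enable its tools for this conversation."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "skills": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "enum": ["blog", "email", "design", "audience", "help"],
                    },
                    "description": "One or more skills to activate.",
                },
            },
            "required": ["skills"],
        },
    },
}
```
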

This is the part that would have been unthinkable in 2024. Trusting a model to read a catalog of ~15 skills, accurately assess which ones it needs, activate them, absorb the new context, and proceed with the task — all in a single conversation? That would have been reckless back then. Today it's reliable enough that our users don't even think about it. Which is exactly the point.

The Results: Composable, Simple, Extensible

The end result didn't fundamentally change what our agent framework was capable of. The same workflows exist. The same tools are available. But the experience of using the platform changed dramatically, even for our CEO, who was already well-versed in wrangling multiple agents to achieve his marketing goals.

Three things stand out:

Composable. The agent can activate multiple skills simultaneously. Need to write a blog post, generate a cover image, and schedule it as part of a campaign? That's three skills working together in one conversation. No hand-offs, no context switching, no lost state between agents.

Simple to grasp. No more complex routing or pathfinding for the user to puzzle over. The user just sees work getting done. And frankly, that's all they care about. The infrastructure underneath is invisible, which is how it should be.

Extensible. This is the one I'm most excited about. Users can write their own skills to craft custom workflows on top of what we provide. The skill abstraction creates a clean extension point that doesn't require understanding the full agent framework — just define your context, your tools, and your workflow, and the agent handles the rest.
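To make that extension point concrete, here is a sketch of what a user-defined skill could contain. The field names and workflow are made up for illustration and are not Quotient's actual skill schema:

```python
# Hypothetical shape of a user-authored skill: context, tools, and a
# workflow outline. Field names are assumptions, not a real schema.
custom_skill = {
    "name": "weekly-digest",
    "context": (
        "You assemble a weekly digest email from the three most recent "
        "blog posts. Keep the tone casual and the subject line under 60 chars."
    ),
    "tools": ["list_recent_posts", "create_email_draft"],
    "workflow": [
        "fetch the three most recent posts",
        "summarize each in two sentences",
        "draft the digest email and queue it for review",
    ],
}
```

Nothing here requires understanding the agent loop itself — the user declares what they want the agent to know and which tools it may use, and the framework handles activation.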

What's Next

Skills are live on Quotient now and I'm genuinely stoked about where we go from here. The skills rewrite got us up to speed with where the industry has landed on agentic architecture — but it also opens up a clear roadmap for what comes next.

The things I'm most excited about:

  • External MCP support — Anthropic open-sourced the Model Context Protocol in November 2024, and it’s since become the de facto standard for connecting agents to tools and data. MCP replaced the old N×M integration problem — where every agent needed custom connectors for every data source — with a single universal protocol. We’re building out support for letting users connect external MCP servers to Quotient so the agent can interface with any tool in their stack, not just ours.

  • Exposing Quotient as an MCP server — the flip side of consuming MCP is being an MCP server. With thousands of community-built MCP servers already in the wild and SDKs available for every major language, MCP has become the connectivity layer that agent ecosystems build on. We want to bring our entire marketing stack — blog authoring, email campaigns, audience segmentation, social scheduling — to agents everywhere, not just the ones running inside our platform.

  • Agent-to-agent (A2A) support — Google launched the Agent2Agent protocol in April 2025 with over 50 technology partners including Atlassian, Salesforce, SAP, and LangChain. Where MCP standardizes agent-to-tool communication, A2A standardizes agent-to-agent communication — letting agents discover each other, negotiate capabilities, and collaborate on tasks as peers rather than subordinates. This is not multi-agent routing as a band-aid for ad hoc context management (which is what our old system was often doing) — it's a first-class mechanism for interaction between genuinely distinct systems, at the level the pattern actually deserves. Now Quotient has a single well-defined entry point through which to integrate with the rest of the agentic world, and it's super exciting to think about how users might start leveraging that!

As an industry, we went from hand-rolling XML parsers for tool calls to an architecture where a single agent can dynamically assemble its own capabilities on the fly, in response to natural language, reliably. The rate of change in this space is astonishing and I can’t wait to see where the next rewrite takes us!

A Note on How This Post Was Made

This post was drafted using Quotient's own AI agent, working from a set of raw notes I wrote. The structure, prose, and voice were shaped by Opus 4.6 working within our Agent Framework, followed by targeted edits by yours truly!

Here are the original notes, unedited:

Back in the day (early 2024) you had to use a specialized model in order to get something as basic as tool calling. Additionally you might spend effort writing a harness around your model choice in order to implement things like tool execution and specialized rag.

while the new technology was shocking and cool, it required alot of handholding and novel problem solving (alot of which got thrown in the trash as dedicated teams came out with their own harnesses, agent frameworks, rag pipelines, tool calling capabilities).

all of that is to say, when it came to getting agents in order to actually do things, devs were in a poor position

  • model performance was comparably poor to what it is today, and less well-trained

  • in order to get appreciative performance, people usually put alot of effort into - prompt writing and prompt re-writing (we had not yet started talking about Context engineering ) - fine tuning (which, idk people don’t really DO, or at least don’t consider as a useful solution until you have a lot more time and resources to do it, usually you can just write more context and spend more tokens)

Even then, this process would only cover optimization for a single task or workflow (which the industry then started to decide calling agents). You might need multiple agents in order to get anything interesting to be done by all the tokens you got the models to spit out and build anything of appreciable substance for an end user, outside a generic chat application.

Generally we were left with a problem:

  - in order to do anything cool we need to dynamically switch up system prompts in reference to user queries 

  • But we didn’t believe the models were good enough to be able to figure out how or when to do that by themselves — for Christs sake it took hours of my own dev time just to get OpenHermes to spit out some XML I could parse as a tool call, and then it would often falter I loaded ~10 tools. Tools were precious precious context

  • In the long run we were wrong about this! That’s what this article is about. But at the time, it was all turning up “write a fuck ton of agents”

There were a few classes of solutions people tended to play with:

  • explicit agent calls — put the onus on the user to direct queries to agents (no routing)

  • Implement routing based on pre-embedding a message, or using key word search in order to determine which agent would respond to a query or question (sentimental routing)

  • Delegate routing to agents, ie agents decide when they need to delegate to another agent, and call a specific tool for that

NOTE: there was also the whole agents as DAGs paradigm going on, but dude, that’s a workflow, not an agent. This is not what people meant by agents. Now as models have gotten even better, I don’t see many people talking about this still.

If you had used Quotient in 2025, you will be familiar with the last approach. As a team we divide up and maintained a suite of agents which were supposed to antrhpmorphize a whole marketing team.

It honestly, worked very well. Agents were great at targeting each other. The routing code was kinda of a pain to maintain and very complicated, and sometimes they hallucinated their identities, but generally a stragithfroward user query handling multiple portions of the platform.

the problem was that users had no fucking clue how to use the platform. 

“I need a blog with a cool cover photo, do I talk to the Blog agent or the Design Agent?”

“I’m wondering how to increase my engagement over email, is that Audience or Email?”

“I just wanna fucking update my DNS settings how do I do that?”

Users struggled both with targeting specific agents and making sense of the web of hand offs and agents which might cascade from a seemingly straightforward ask. Moreover, if you got the entry point wrong, or phrased your ask to generically, you might get sent to our (kinda throwaway) help agent, which could help you read user facing docs, but kinda had fuck all context about the inner workings of the rest of the platform, ie the nitty gritty details which might actually help it be effective.”

early this year I sat down to essentially do a rewrite of our agentic offerings through the introduction of skills, basically letting agents dynamically pick and choose their system message context with a tool call in reference to user queries — right that crazy thing we didn’t think the LLMs could do properly when we first tried building systems of substance! instead of havbing a blog agent, and an email agent, and an audience agent, and a -- we would have a series of skills that

  • more or less implemented the same workflow as their legacy agent

  • could be activated in a tool call

  • dumped relevant context in a set portion of our agents system message

  • activated tools for whatever workflow the skill was meant to cover

the rewrite wasn't that actually that involved, since once i landed on the correct way to record the activation of skills, the translation of agents into skills unified under a single agent was as simple as .... writing a skill for claude code on the formula for doing this and letting it chug away!

the end result did not fundamentally change the capabilities of our agent framrwork, but it made the agents noticeably easier to use, even for our ceo who was otherwise well versed in wranglign multiple agetns in order to achieve his marketing goals.

additionally they were

  • composable -- our agent can activate and get access to multiple capabilities at once!

  • simple to grasp -- no more complex routing and path finding, the user just sees work getting done, and frankly thats all they care about

  • extensible -- users can write their own skills in order to craft their own worflows, in addition to what we provide them

Skills are now like on Quotient getquotient.ai and you can start using them in order to knock out your marketing wrokflows with ease. Im super stoked about getting our own framework further up to spec with the rest of the industry, with more externa MCP support (you integrating external mcp servers into quotient), implemetnign our own mcp servers to bring our marketing stack to agents everywhere), and exposing quotient over a2a for complex agent to agent use cases at the leve they desever (not as a solution for doing ad ghoc context management, but itnerfacting between distinct systems)