OpenAI's Swarm: A Deep Dive into Multi-Agent Orchestration for Everyone

Thursday, October 31, 2024 by sanchayt743

OpenAI's Swarm: A Deep Dive into Multi-Agent Orchestration for Everyone

OpenAI's Swarm is a groundbreaking framework that simplifies the orchestration of multi-agent systems. It introduces advanced concepts like agents, handoffs, routines, and function calling, providing a powerful tool for experimenting with the coordination of multiple AI agents. While Swarm is designed primarily for educational and experimental purposes, its features give valuable insights into the future of AI coordination and autonomous workflows. This tutorial dives into these key features, illustrating their unique aspects, use cases, and contributions to efficient multi-agent orchestration. If you want to learn more about Swarm, check out this blog where we cover in detail the potential and impact of this framework.

Here’s an example of creating a basic agent in Swarm:

from swarm import Swarm, Agent

client = Swarm()

agent = Agent(
    name="Basic Agent",
    instructions="You are an assistant that provides useful information."
)

messages = [{"role": "user", "content": "Hello, can you help me?"}]
response = client.run(agent=agent, messages=messages)

print(response.messages[-1]["content"])

In this example, the Basic Agent is designed to respond to a user's request. The instructions define the agent's role, and the agent autonomously handles the incoming message to provide a helpful response.

Here's a more advanced example where agents collaborate:

from swarm import Swarm, Agent

client = Swarm()

assistant_agent = Agent(
    name="Assistant Agent",
    instructions="You are a helpful assistant. If the user asks about the weather, transfer the query to the Weather Agent."
)

weather_agent = Agent(
    name="Weather Agent",
    instructions="You provide the current weather details for any city."
)

def transfer_to_weather_agent():
    """Transfers control to the Weather Agent."""
    return weather_agent

assistant_agent.functions.append(transfer_to_weather_agent)

# Example interaction
messages = [{"role": "user", "content": "What's the weather like in Paris?"}]
response = client.run(agent=assistant_agent, messages=messages)

print(response.messages[-1]["content"])

In this case, the Assistant Agent determines if the query is related to the weather and hands off the conversation to the Weather Agent, demonstrating a more collaborative approach between agents. In this example, the Basic Agent is designed to respond to a user's request. The instructions define the agent's role, and the agent autonomously handles the incoming message to provide a helpful response.

Handoffs

Handoffs are a core mechanism in Swarm that allows one agent to transfer the control of a task to another agent better suited for that part of the workflow. This functionality is similar to how in a customer service environment, a representative might pass your call to another department to address your specific needs. Handoffs in Swarm are seamless, enabling efficient task delegation among agents.

For example, you can use handoffs when a customer initially interacts with a general support agent but later requires specialized assistance. Instead of a single agent attempting to handle everything, the task is handed off to another agent that is more skilled in handling specific inquiries:

from swarm import Swarm, Agent

client = Swarm()

# Creating two agents
general_agent = Agent(
    name="General Agent",
    instructions="You are a general assistant. If the question is technical, transfer it to the Technical Agent."
)

technical_agent = Agent(
    name="Technical Agent",
    instructions="You are a technical expert. Answer all technical questions in detail."
)

def transfer_to_technical_agent():
    """Transfers control to the Technical Agent."""
    return technical_agent

general_agent.functions.append(transfer_to_technical_agent)

# Example interaction
messages = [{"role": "user", "content": "I need help with a technical issue."}]
response = client.run(agent=general_agent, messages=messages)

print(response.messages[-1]["content"])

In this scenario, the General Agent transfers the user's query to the Technical Agent if it determines that specialized assistance is required. This modular delegation of tasks ensures that each agent focuses on its strengths, leading to better efficiency and higher-quality interactions.

Here's another example that involves multiple specialized agents:

from swarm import Swarm, Agent

client = Swarm()

billing_agent = Agent(
    name="Billing Agent",
    instructions="You handle all billing-related inquiries."
)

support_agent = Agent(
    name="Support Agent",
    instructions="You handle technical support issues."
)

general_agent = Agent(
    name="General Agent",
    instructions="You are a general assistant. Transfer the conversation to Billing Agent or Support Agent based on the user's needs."
)

def transfer_to_billing_agent():
    """Transfers control to the Billing Agent."""
    return billing_agent

def transfer_to_support_agent():
    """Transfers control to the Support Agent."""
    return support_agent

general_agent.functions.extend([transfer_to_billing_agent, transfer_to_support_agent])

# Example interaction
messages = [{"role": "user", "content": "I have a billing question."}]
response = client.run(agent=general_agent, messages=messages)

print(response.messages[-1]["content"])

In this extended example, the General Agent decides whether to transfer the user's request to the Billing Agent or the Support Agent. This ensures that the user gets specialized help, enhancing the overall experience. In this scenario, the General Agent transfers the user's query to the Technical Agent if it determines that specialized assistance is required. This modular delegation of tasks ensures that each agent focuses on its strengths, leading to better efficiency and higher-quality interactions.

Routines

Routines in Swarm are structured sets of steps that agents follow to accomplish a task. They can be thought of as checklists or sequences of instructions that ensure the correct execution of a multi-step process. Routines are essential for handling complex workflows, where an agent needs to perform several actions in a specific order to complete a task.

For instance, a Sales Agent might follow a routine when trying to sell a product. The routine would guide the agent through gathering customer information, describing the product’s benefits, handling objections, and closing the sale. Here’s an example of how routines are used in Swarm:

from swarm import Swarm, Agent

client = Swarm()

sales_agent = Agent(
    name="Sales Agent",
    instructions="""Be enthusiastic about selling honey.
    1. Greet the customer and ask for their name.
    2. Find out if they have any specific concerns.
    3. Explain the benefits of honey.
    4. Offer a special discount.
    5. Close the sale and thank the customer.
    """
)

messages = [{"role": "user", "content": "I'm interested in buying some honey."}]
response = client.run(agent=sales_agent, messages=messages)

print(response.messages[-1]["content"])

In this example, the Sales Agent follows a routine to interact with the customer effectively. By providing a clear set of steps, routines help agents remain consistent and thorough, ensuring that no part of the interaction is missed or mishandled.

Routines also help with debugging and improving workflows. Since each action is explicitly defined, developers can easily identify which step in a routine might be causing issues and optimize it for better performance.

Here's an additional example using a customer support routine:

from swarm import Swarm, Agent

client = Swarm()

support_agent = Agent(
    name="Support Agent",
    instructions="""Follow these steps to assist the customer:
    1. Greet the customer politely.
    2. Ask about their issue.
    3. Provide troubleshooting steps based on the issue.
    4. If unresolved, escalate the issue to the Technical Agent.
    5. Thank the customer for their patience.
    """
)

def escalate_to_technical_agent():
    """Escalates the issue to the Technical Agent."""
    return technical_agent

support_agent.functions.append(escalate_to_technical_agent)

messages = [{"role": "user", "content": "I'm facing an issue with my account."}]
response = client.run(agent=support_agent, messages=messages)

print(response.messages[-1]["content"])

In this example, the Support Agent follows a structured routine to assist the customer. If the issue cannot be resolved, the routine includes a step to escalate the problem to the Technical Agent, ensuring that the customer gets the assistance they need. In this example, the Sales Agent follows a routine to interact with the customer effectively. By providing a clear set of steps, routines help agents remain consistent and thorough, ensuring that no part of the interaction is missed or mishandled.

Function Calling

Function Calling is another powerful feature of Swarm that allows agents to execute specific functions to enrich their interactions. By integrating functions, agents can perform dynamic tasks such as retrieving information, processing data, or interacting with external APIs.

Here’s an example of using function calling in Swarm:

from swarm import Swarm, Agent

client = Swarm()

def get_weather(location) -> str:
    return "{'temp':67, 'unit':'F'}"

agent = Agent(
    name="Weather Agent",
    instructions="You are a helpful agent.",
    functions=[get_weather],
)

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
response = client.run(agent=agent, messages=messages)
print(response.messages[-1]["content"])

In this example, the Weather Agent uses a custom function to provide the current weather. This allows for more complex and informative responses.

Simple Loop Interaction

Swarm also supports interactive loops, enabling agents to handle continuous user inputs. This is useful for real-time chat applications where agents respond to a series of questions or commands.

Here’s an example of a simple loop with no helper functions:

from swarm import Swarm, Agent

client = Swarm()

my_agent = Agent(
    name="Loop Agent",
    instructions="You are a helpful agent."
)

def pretty_print_messages(messages):
    for message in messages:
        if message["content"] is None:
            continue
        print(f"{message['sender']}: {message['content']}")

messages = []
agent = my_agent
while True:
    user_input = input("> ")
    if user_input.lower() == "/exit":
        print("Exiting the loop. Goodbye!")
        break
    messages.append({"role": "user", "content": user_input})

    response = client.run(agent=agent, messages=messages)
    messages = response.messages
    agent = response.agent
    pretty_print_messages(messages)

This loop allows users to continuously interact with the Loop Agent, simulating a real-world chatbot where users can ask multiple questions until they choose to exit.

Complex Agent with Multiple Functions

Agents in Swarm can also have multiple functions to handle different user requests. These functions enable agents to perform a variety of actions, enhancing their capabilities.

Here’s an example of a Weather Agent that can not only provide weather details but also send emails:

import json
from swarm import Swarm, Agent

def get_weather(location, time="now"):
    return json.dumps({"location": location, "temperature": "65", "time": time})

def send_email(recipient, subject, body):
    print("Sending email...")
    print(f"To: {recipient}")
    print(f"Subject: {subject}")
    print(f"Body: {body}")
    return "Sent!"

weather_agent = Agent(
    name="Weather Agent",
    instructions="You are a helpful agent.",
    functions=[get_weather, send_email],
)

messages = [{"role": "user", "content": "What's the weather in New York?"}]
response = client.run(agent=weather_agent, messages=messages)
print(response.messages[-1]["content"])

In this example, the Weather Agent can provide weather information and send emails. This versatility makes agents much more capable and adaptable to different scenarios.

Advanced Loop with Function Calling and Streaming

For more advanced interactions, Swarm supports running agents in a loop with streaming responses and context variables. This allows agents to continuously adapt to user input and respond dynamically.

Here’s an advanced demo loop using a Weather Agent:

from swarm import Swarm
import json

def process_and_print_streaming_response(response):
    content = ""
    last_sender = ""

    for chunk in response:
        if "sender" in chunk:
            last_sender = chunk["sender"]

        if "content" in chunk and chunk["content"] is not None:
            if not content and last_sender:
                print(f"[94m{last_sender}:[0m", end=" ", flush=True)
                last_sender = ""
            print(chunk["content"], end="", flush=True)
            content += chunk["content"]

        if "delim" in chunk and chunk["delim"] == "end" and content:
            print()  # End of response message
            content = ""

        if "response" in chunk:
            return chunk["response"]

def run_demo_loop(starting_agent, context_variables=None, stream=False, debug=False):
    client = Swarm()
    print("Starting Swarm CLI 🐝")

    messages = []
    agent = starting_agent

    while True:
        user_input = input("User Input: ")
        if user_input.lower() == "/exit":
            print("Exiting the loop. Goodbye!")
            break
        messages.append({"role": "user", "content": user_input})

        response = client.run(
            agent=agent,
            messages=messages,
            context_variables=context_variables or {},
            stream=stream,
            debug=debug,
        )

        if stream:
            response = process_and_print_streaming_response(response)
        else:
            pretty_print_messages(response.messages)

        messages.extend(response.messages)
        agent = response.agent

run_demo_loop(weather_agent, stream=True)

This code allows the Weather Agent to continuously engage with the user, using streaming for real-time responses. It can also adapt to user input with functions like get_weather and send_email.

Triage Agent Example

Swarm also allows for more sophisticated agents like the Triage Agent, which decides which specialized agent should handle a user’s request.

Here’s an example:

from swarm import Agent

def process_refund(item_id, reason="NOT SPECIFIED"):
    print(f"[mock] Refunding item {item_id} because {reason}...")
    return "Success!"

def apply_discount():
    print("[mock] Applying discount...")
    return "Applied discount of 11%"

triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent is best suited to handle the user's request, and transfer the conversation to that agent.",
)
sales_agent = Agent(
    name="Sales Agent",
    instructions="Be super enthusiastic about selling bees.",
)
refunds_agent = Agent(
    name="Refunds Agent",
    instructions="Help the user with a refund. If the reason is that it was too expensive, offer the user a refund code. If they insist, then process the refund.",
    functions=[process_refund, apply_discount],
)

def transfer_back_to_triage():
    return triage_agent

def transfer_to_sales():
    return sales_agent

def transfer_to_refunds():
    return refunds_agent

triage_agent.functions = [transfer_to_sales, transfer_to_refunds]
sales_agent.functions.append(transfer_back_to_triage)
refunds_agent.functions.append(transfer_back_to_triage)

# Running a demo loop with the Triage Agent
from swarm import Swarm

def run_demo_loop(starting_agent, debug=False):
    client = Swarm()
    print("Starting Swarm CLI 🐝")

    messages = []
    agent = starting_agent

    while True:
        user_input = input("User Input: ")
        if user_input.lower() == "/exit":
            print("Exiting the loop. Goodbye!")
            break
        messages.append({"role": "user", "content": user_input})

        response = client.run(
            agent=agent,
            messages=messages,
            debug=debug,
        )

        pretty_print_messages(response.messages)
        messages.extend(response.messages)
        agent = response.agent

run_demo_loop(triage_agent, debug=False)

The Triage Agent is responsible for routing the user’s request to either the Sales Agent or the Refunds Agent, based on their needs. This ensures that the correct agent handles each task, providing a tailored and efficient experience.

Summary of Key Features

Feature	Description
Agents	Modular units designed to handle specific tasks, capable of interacting autonomously.
Handoffs	Transfers control from one agent to another, ensuring tasks are handled by the right expert.
Routines	Step-by-step guides that agents follow to ensure consistency and accuracy in multi-step tasks.
Function Calling	Allows agents to perform specific functions like retrieving data or sending emails.
Loop Interaction	Agents can continuously interact with users, providing real-time assistance.

Client-Side and Stateless Nature

Swarm runs on the client-side and is stateless. This means that, similar to using the Chat Completions API, each interaction stands alone—there’s no long-term memory between tasks. This approach makes Swarm easy to work with and suitable for lightweight experimentation, where keeping track of an ongoing state isn’t crucial.

The stateless nature of Swarm has its advantages and limitations. On the positive side, it keeps interactions simple and lightweight, which is ideal for scenarios where you don’t need to maintain context over time. This makes Swarm particularly well-suited for quick experiments and educational purposes, where the focus is on understanding agent coordination rather than managing complex states. However, this also means that Swarm might not be the best choice for applications that require persistent memory or long-term context, as each interaction is independent.

Getting Started with Swarm

To get started, you need to install Swarm. It’s as simple as running:

pip install git+https://github.com/openai/swarm.git

Once installed, you can create basic agents and start orchestrating task hand-offs right away. Think of this as setting up your own small orchestra, and you get to be the conductor.

After installation, you can experiment with creating different agents and assigning them various tasks. Start with simple workflows—like having one agent greet a user and another agent provide specific information based on the user's request. As you become more comfortable with the framework, you can move on to more complex orchestrations involving multiple agents and intricate handoffs. This hands-on approach will help you grasp the fundamentals of multi-agent orchestration and understand how different agents can work together to solve complex problems.

Example Use Cases

Swarm shines when you need agents to collaborate in a coordinated way. For instance:

Customer Service Workflow: An initial agent could greet customers and then direct them to specialized agents—like sales, support, or refunds—based on their questions. The handoff feature ensures that the customer speaks to the right specialist every time. This modular approach not only improves efficiency but also enhances the customer experience by providing tailored responses from experts in each area.
Personal Shopper: Imagine creating a shopping assistant. One agent suggests items based on preferences, while another handles order placements. They work together to give a seamless shopping experience. This kind of multi-agent interaction can make online shopping more personalized and efficient, as different agents handle different parts of the process, ensuring that each step is optimized for the best user experience.

Another interesting use case could be educational tools. For example, you could create a system where one agent acts as a tutor, providing explanations on a specific topic, while another agent quizzes the user to assess their understanding. This type of interaction can create a dynamic and engaging learning environment, where the roles of teaching and assessment are clearly defined and handled by specialized agents.

Swarm vs. Other Multi-Agent Frameworks

Swarm has alternatives, like AutoGen, LangChain, and CrewAI. Here’s a quick rundown:

Framework	Key Features	Best Use Cases
Swarm	Lightweight, client-side, stateless	Educational purposes, experimenting with multi-agent interactions
AutoGen	Production-ready, multi-agent conversations	Complex real-world tasks, persistent memory, advanced workflows
LangChain	Stateful interactions, large language models	Iterative workflows, scenarios requiring continuity, conversational AI
CrewAI	Role-based agent design, task delegation	Collaborative environments, project management with specialized agent roles

AutoGen by Microsoft is more production-ready, focusing on multi-agent conversational systems and advanced code execution. It's like Swarm but with more sophisticated capabilities for real-world environments. AutoGen is better suited for applications that require persistent memory and more complex interactions, making it a strong candidate for production use cases where Swarm’s simplicity might be a limitation.
LangChain and LangGraph provide tools for building applications using large language models, allowing for more complex, stateful multi-agent interactions. This makes them better suited for iterative and cyclical workflows that need continuity—something Swarm's statelessness doesn’t cater to. LangChain is particularly useful when you need to maintain context over multiple interactions, making it ideal for scenarios like conversational AI, where understanding the history of the conversation is crucial.
CrewAI takes a role-based approach, allowing agents to delegate tasks autonomously within collaborative projects. CrewAI is perfect for structuring workflows involving teams of agents that must manage intricate, role-specific tasks, making it slightly different from Swarm’s modular but independent approach. CrewAI’s focus on role-based delegation makes it well-suited for complex project management and collaborative environments, where different agents need to take on clearly defined roles to achieve a common goal.

Best Practices and Challenges

Using Swarm is a great way to experiment, but you might encounter some challenges. Maintaining a consistent user experience during handoffs, handling errors when multiple agents are involved, and ensuring interactions stay smooth are some potential issues. To handle these challenges effectively, consider implementing logging and error-handling routines to track agent performance and identify issues early. This will help you maintain control over interactions and ensure that any problems are quickly addressed. Best practices like limiting the number of handoffs and ensuring agents have well-defined, focused roles can help mitigate these challenges.

To get the best results with Swarm, it’s important to design your agents with clear and concise roles. Avoid creating agents that try to do too much, as this can lead to confusion and errors. Instead, focus on breaking down tasks into smaller components and assigning each component to a specialized agent. Testing your agents extensively is also crucial—make sure that each handoff is seamless and that the agents can handle unexpected inputs gracefully.

One of the challenges you might face is ensuring that the user experience remains consistent, especially when multiple agents are involved. To address this, consider using routines to guide the flow of interactions and minimize unnecessary switches between agents. Providing fallback mechanisms for unhandled queries can also help maintain a smooth user experience, ensuring that users receive helpful responses even when things don’t go as planned.

Conclusion and Further Exploration

Swarm is an excellent starting point if you're interested in understanding how multi-agent systems work. It gives you a hands-on way to learn about concepts like agent coordination, handoffs, and routines. While it isn’t built for production environments, it opens up many possibilities for exploring the future of AI-driven workflows. Imagine the potential of having multiple specialized agents coordinating perfectly to help solve real-world problems—that’s what Swarm is here to teach us.

If you’re looking to expand your understanding beyond Swarm and explore more advanced multi-agent frameworks, here are some valuable resources:

Developing Intelligent Agents with CrewAI: For a deeper understanding of intelligent agent development using frameworks like CrewAI, check out this detailed guide.
Maximizing AI Potential: To explore the AI/ML API with AI agents and understand how to maximize their potential, take a look at this tutorial.
Leveraging Composio for Multi-Agent AI Applications: If you're interested in advanced multi-agent AI applications, this Composio tutorial provides valuable insights.