Prompt Inversion > Blog > Choosing the Right Agentic Framework I

March 17, 2025

Choosing the Right Agentic Framework I

We dissect six of the most popular agentic frameworks: LangChain’s LangGraph, Microsoft’s AutoGen, Pydantic’s PydanticAI, CrewAI, OpenAI’s Swarm, and Hugging Face’s Smolagents.

AI Agents are having their breakthrough moment. At their core, agents are systems that combine models, tools, and human input to solve complex tasks autonomously. From OpenAI’s Operator and Deep Research to Google’s AI co-scientist, the biggest AI players kicked off 2025 by betting big on agentic technology. This rise of agents-as-products represents one of the most practical and profitable use cases for AI.

A wave of new frameworks for building agentic workflows has emerged, each with its own strengths and trade-offs. With so many options, choosing the right one can be tricky. In this two-part blog post we dissect six of the most popular agentic frameworks—LangChain’s LangGraph, Microsoft’s AutoGen, Pydantic’s PydanticAI, CrewAI, OpenAI’s Swarm, and Hugging Face’s Smolagents—across five factors—message passing, state management, tool calling, quality of documentation, and ease of use. In this first part, we’ll walk through the setup of our agentic workflow and define some key terminology that will guide our framework comparison.

To ensure a fair comparison, we implemented the same multi-agent spam classification system across all six frameworks. The system integrates a fine-tuned BERT spam classifier, GPT-4 for independent reasoning, and a human feedback loop for retraining BERT. Here’s how our multi-agent workflow operates:

Multi-Agent Spam Classification Workflow


1. Input Agent: Captures user input that contains the message to be classified and passes it to the BERT Agent.

2. BERT Agent: Uses a fine-tuned BERT model to predict whether the input message is spam or not. Passes the input message and the prediction to the GPT Agent.

3. GPT Agent: Evaluates BERT’s prediction using GPT-4, providing agreement or disagreement with a brief explanation.

  1. If GPT agrees with BERT’s prediction, then the message, BERT’s prediction and GPT’s explanation are passed to the final Output Agent.
  2. If GPT disagrees, then the message, BERT’s prediction and GPT’s explanation are passed to the Human Feedback Agent, initiating a retraining loop.

4. Human Feedback Agent: Prompts the human-in-the-loop to provide the correct classification label for the message based on their judgment and passes it to the Retrain Agent.

5. Retrain Agent: Retrains BERT with the new classification label obtained from human feedback and passes the message back to the BERT Agent for a fresh prediction.

6. Output Agent: Presents the final prediction, GPT’s reasoning, and details on human feedback and retraining to the user.

This system represents a self-improving workflow where GPT-4’s reasoning and human feedback refine BERT’s predictions over time. It is a complex, multi-agent setup in which each agent handles a specific task while collaborating with the other agents through precise message passing, state management and tool calling.
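The six-agent loop above can be sketched in plain, framework-agnostic Python. Here `bert_classify`, `gpt_evaluate`, and `retrain_bert` are hypothetical stand-ins for the real model calls (the keyword matching is a toy substitute for actual classification), but the control flow mirrors the workflow: classify, review, and either output or retrain and re-classify.

```python
# Toy "model weights": retraining just records a per-message override.
TRAINING_OVERRIDES: dict[str, str] = {}

def bert_classify(message: str) -> str:
    """Hypothetical stand-in for the fine-tuned BERT classifier."""
    if message in TRAINING_OVERRIDES:               # reflects retraining
        return TRAINING_OVERRIDES[message]
    return "spam" if "win a prize" in message.lower() else "not spam"

def gpt_evaluate(message: str, prediction: str) -> tuple[bool, str]:
    """Hypothetical stand-in for the GPT-4 reviewer: (agrees, explanation)."""
    looks_spammy = "prize" in message.lower() or "free" in message.lower()
    expected = "spam" if looks_spammy else "not spam"
    if prediction == expected:
        return True, "I agree with BERT's prediction."
    return False, f"I disagree; this message looks like {expected}."

def retrain_bert(message: str, label: str) -> None:
    """Toy 'retraining': store the human-provided label."""
    TRAINING_OVERRIDES[message] = label

def run_workflow(message: str, get_label) -> dict:
    """Input Agent -> BERT -> GPT -> (Output | Human Feedback -> Retrain)."""
    while True:
        prediction = bert_classify(message)                       # BERT Agent
        agrees, explanation = gpt_evaluate(message, prediction)   # GPT Agent
        if agrees:                                                # Output Agent
            return {"message": message,
                    "prediction": prediction,
                    "gpt_explanation": explanation}
        label = get_label(message)          # Human Feedback Agent
        retrain_bert(message, label)        # Retrain Agent, then loop to BERT
```

In a real system the retraining step would fine-tune BERT on the corrected example; the loop structure, however, is exactly what each framework below has to express.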

Defining some key agentic terminology:

Message Passing: This refers to how agents communicate with each other and pass data between them. Frameworks like AutoGen, Swarm, and LangGraph all refer to this as handoffs. Consistent handoff logic is critical to ensure the right data gets to the right agent at the right time.

Example: In our spam classification workflow, the BERT Agent passes its prediction and the input message to the GPT Agent for analysis.
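That handoff can be sketched as one agent bundling everything the next agent needs into a single payload. The function names and dict keys here are illustrative, not from any particular framework:

```python
def bert_agent(user_message: str) -> dict:
    """Produce the handoff payload for the GPT Agent."""
    prediction = "spam"  # placeholder for the fine-tuned BERT call
    # The handoff carries both the original message and the prediction,
    # so the next agent never has to re-fetch earlier data.
    return {"message": user_message, "bert_prediction": prediction}

def gpt_agent(handoff: dict) -> dict:
    """Read the handoff and append GPT's analysis to it."""
    analysis = f"Reviewing prediction '{handoff['bert_prediction']}'..."
    return {**handoff, "gpt_analysis": analysis}

result = gpt_agent(bert_agent("Claim your free gift now!"))
```

Frameworks differ mainly in who routes this payload: in Swarm an agent returns the next agent directly, while in LangGraph edges in the graph decide where the payload flows.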

State Management: This is about how the system tracks and updates the data (state) being passed between agents. Different frameworks handle data differently—LangGraph calls it state, PydanticAI refers to it as dependencies, Swarm uses context variables and CrewAI calls it expected output. Effective state management ensures the workflow’s data remains consistent and accessible across agents.

Example: The Output Agent receives all the data from previous agents, including the input message, BERT’s prediction, GPT’s analysis, and conditionally, the human feedback and retraining data.

Tools: Tools are functions that agents can use to call external libraries, APIs, or custom code. They help separate out logic that doesn’t need to be AI-powered, making agent behavior more deterministic and reliable.

Example: A custom tool checks if GPT’s response disagrees with BERT’s prediction by searching for the word “disagree” in GPT’s explanation.
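That disagreement check is a good illustration of why tools matter: it is plain Python with no model call, so its behavior is fully deterministic. A minimal sketch:

```python
def check_disagreement(gpt_explanation: str) -> bool:
    """Return True if GPT's explanation signals disagreement with BERT."""
    return "disagree" in gpt_explanation.lower()

check_disagreement("I disagree: this message is clearly spam.")  # True
check_disagreement("I agree with BERT's prediction.")            # False
```

Registered as a tool, this function lets the workflow branch on GPT's verdict without asking the model to also decide the routing.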

In the next part of this blog post, we’ll dive into a detailed comparison of LangGraph, AutoGen, PydanticAI, CrewAI, Swarm, and Smolagents—evaluating their strengths and weaknesses to determine which framework is best suited for specific use cases.
