Agents

Smolagents is an experimental API that is subject to change at any time. Results returned by the agents can vary as the APIs or underlying models change.

To learn more about agents and tools, make sure to read the introductory guide. This page contains the API docs for the underlying classes.

Agents

Our agents inherit from [MultiStepAgent], which means they can act in multiple steps, each step consisting of one thought, then one tool call and execution. Read more in this conceptual guide.
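As a rough illustration of the multi-step idea, the sketch below runs a thought, tool call, and execution cycle until a final answer is produced. It is not the [MultiStepAgent] implementation: `TOOLS`, `scripted_policy`, and `run_steps` are hypothetical names, and a scripted function stands in for the LLM.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; names and functions are illustrative only.
TOOLS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}


def scripted_policy(step: int, observations: List[int]) -> Tuple[str, str, tuple]:
    """Stand-in for the LLM: returns (thought, tool_name, arguments)."""
    if step == 0:
        return ("First add 2 and 3.", "add", (2, 3))
    if step == 1:
        return ("Now multiply the sum by 4.", "multiply", (observations[-1], 4))
    return ("I have the result.", "final_answer", (observations[-1],))


def run_steps(max_steps: int = 5) -> int:
    """Run one thought, one tool call, and one execution per step."""
    observations: List[int] = []
    for step in range(max_steps):
        thought, tool_name, args = scripted_policy(step, observations)
        if tool_name == "final_answer":
            return args[0]
        # Execute the chosen tool and keep its result as an observation.
        observations.append(TOOLS[tool_name](*args))
    raise RuntimeError("max_steps reached without a final answer")


result = run_steps()
print(result)  # 20
```

The `max_steps` bound is a design choice in this sketch: it keeps a loop driven by model output from running indefinitely.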

We provide two types of agents, based on the main [MultiStepAgent] class.

  • [CodeAgent] is the default agent; it writes its tool calls as Python code.
  • [ToolCallingAgent] writes its tool calls in JSON.
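To make the difference between the two formats concrete, here is a minimal sketch of how the same tool could be invoked from a JSON action and from a code action. The `get_weather` tool, the JSON keys `name` and `arguments`, and the dispatch logic are assumptions made for illustration; they are not the formats or parsers the library actually uses.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool used only for this illustration."""
    return f"Sunny in {city}"


TOOLS = {"get_weather": get_weather}

# JSON-style action: a tool name plus a dictionary of arguments.
json_action = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
call = json.loads(json_action)
json_result = TOOLS[call["name"]](**call["arguments"])

# Code-style action: a Python snippet that calls tools directly and can
# chain several calls in a single step.
code_action = (
    'a = get_weather("Paris")\n'
    'b = get_weather("Oslo")\n'
    'result = a + " / " + b'
)
namespace = dict(TOOLS)  # the snippet sees the tools as plain functions
# Plain exec is NOT a sandbox; it is used here only to keep the sketch short.
exec(code_action, namespace)
code_result = namespace["result"]

print(json_result)
print(code_result)
```

In this sketch the JSON action expresses one call per step, while the code action combines two calls and post-processes their results in one step.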

Classes of agents

autodoc MultiStepAgent

autodoc CodeAgent

autodoc ToolCallingAgent

ManagedAgent

autodoc ManagedAgent

stream_to_gradio

autodoc stream_to_gradio

GradioUI

autodoc GradioUI

Models

You're free to create and use your own models to power your agent.

You can use any model callable for your agent, as long as:

  1. It follows the messages format (List[Dict[str, str]]) for its input messages, and it returns a str.
  2. It stops generating output before the sequences passed in the stop_sequences argument.
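The two requirements above can be sketched with an offline toy model. `truncate_at_stop` and `echo_model` are hypothetical helpers written for this example, not part of the library; a real model would usually pass the stop sequences to its generation backend instead of cutting the text afterwards.

```python
from typing import Dict, List, Optional


def truncate_at_stop(text: str, stop_sequences: Optional[List[str]]) -> str:
    """Cut text right before the earliest stop sequence, if one occurs."""
    cut = len(text)
    for stop in stop_sequences or []:
        if not stop:
            continue  # ignore empty strings, which would match at index 0
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


def echo_model(
    messages: List[Dict[str, str]],
    stop_sequences: Optional[List[str]] = None,
) -> str:
    """Toy model: takes a list of role/content dicts and returns a str."""
    last = messages[-1]["content"]
    raw = f"Thought: you said '{last}'. Observation: none yet."
    return truncate_at_stop(raw, stop_sequences)


reply = echo_model(
    [{"role": "user", "content": "Hi"}],
    stop_sequences=["Observation:"],
)
print(repr(reply))
```

Everything from the first stop sequence onward is dropped, so the returned string never contains the stop sequence itself.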

To define your LLM, you can write a custom_model function that accepts a list of messages and returns text. This callable also needs to accept a stop_sequences argument that indicates when to stop generating.

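from huggingface_hub import login, InferenceClient

# Authenticate with your Hugging Face token.
login("<YOUR_HUGGINGFACEHUB_API_TOKEN>")

model_id = "meta-llama/Llama-3.3-70B-Instruct"

client = InferenceClient(model=model_id)

def custom_model(messages, stop_sequences=None) -> str:
    # Avoid a mutable default argument; fall back to the "Task" stop sequence.
    if stop_sequences is None:
        stop_sequences = ["Task"]
    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
    # Return only the text of the first completion.
    answer = response.choices[0].message.content
    return answer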

Additionally, custom_model can take a grammar argument. If you specify a grammar when initializing the agent, it is passed along with each call to the model, enabling constrained generation that forces properly formatted agent outputs.

TransformersModel

For convenience, we have added a TransformersModel that implements the points above by building a local transformers pipeline for the model_id given at initialization.

from smolagents import TransformersModel

model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")

print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
>>> What a

autodoc TransformersModel

HfApiModel

HfApiModel is an engine that wraps an HF Inference API client to run the LLM.

from smolagents import HfApiModel

messages = [
  {"role": "user", "content": "Hello, how are you?"},
  {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
  {"role": "user", "content": "No need to help, take it easy."},
]

model = HfApiModel()
print(model(messages))
>>> Of course! If you change your mind, feel free to reach out. Take care!

autodoc HfApiModel