# Agents
Smolagents is an experimental API which is subject to change at any time. Results returned by the agents
can vary as the APIs or underlying models are prone to change.
To learn more about agents and tools make sure to read the [introductory guide](../index). This page
contains the API docs for the underlying classes.
## Agents
Our agents inherit from [`MultiStepAgent`], which means they can act in multiple steps, each step consisting of one thought, then one tool call and execution. Read more in [this conceptual guide](../conceptual_guides/react).
We provide two types of agents, based on the main [`Agent`] class.
- [`CodeAgent`] is the default agent, it writes its tool calls in Python code.
- [`ToolCallingAgent`] writes its tool calls in JSON.
Both require arguments `model` and list of tools `tools` at initialization.
### Classes of agents
[[autodoc]] MultiStepAgent
[[autodoc]] CodeAgent
[[autodoc]] ToolCallingAgent
### ManagedAgent
_This class is deprecated since 1.8.0: now you just need to pass name and description attributes to an agent to directly use it as previously done with a ManagedAgent._
### stream_to_gradio
[[autodoc]] stream_to_gradio
### GradioUI
> [!TIP]
> You must have `gradio` installed to use the UI. Please run `pip install smolagents[gradio]` if it's not the case.
[[autodoc]] GradioUI
## Models
You're free to create and use your own models to power your agent.
You could use any `model` callable for your agent, as long as:
1. It follows the [messages format](./chat_templating) (`List[Dict[str, str]]`) for its input `messages`, and it returns a `str`.
2. It stops generating outputs *before* the sequences passed in the argument `stop_sequences`
For defining your LLM, you can make a `custom_model` method which accepts a list of [messages](./chat_templating) and returns text. This callable also needs to accept a `stop_sequences` argument that indicates when to stop generating.
```python
from huggingface_hub import login, InferenceClient
login("")
model_id = "meta-llama/Llama-3.3-70B-Instruct"
client = InferenceClient(model=model_id)
def custom_model(messages, stop_sequences=["Task"]) -> str:
response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
answer = response.choices[0].message.content
return answer
```
Additionally, `custom_model` can also take a `grammar` argument. In the case where you specify a `grammar` upon agent initialization, this argument will be passed to the calls to model, with the `grammar` that you defined upon initialization, to allow [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) in order to force properly-formatted agent outputs.
### TransformersModel
For convenience, we have added a `TransformersModel` that implements the points above by building a local `transformers` pipeline for the model_id given at initialization.
```python
from smolagents import TransformersModel
model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
```
```text
>>> What a
```
> [!TIP]
> You must have `transformers` and `torch` installed on your machine. Please run `pip install smolagents[transformers]` if it's not the case.
[[autodoc]] TransformersModel
### HfApiModel
The `HfApiModel` wraps an [HF Inference API](https://huggingface.co/docs/api-inference/index) client for the execution of the LLM.
```python
from smolagents import HfApiModel
messages = [
{"role": "user", "content": "Hello, how are you?"},
{"role": "assistant", "content": "I'm doing great. How can I help you today?"},
{"role": "user", "content": "No need to help, take it easy."},
]
model = HfApiModel()
print(model(messages))
```
```text
>>> Of course! If you change your mind, feel free to reach out. Take care!
```
[[autodoc]] HfApiModel
### LiteLLMModel
The `LiteLLMModel` leverages [LiteLLM](https://www.litellm.ai/) to support 100+ LLMs from various providers.
You can pass kwargs upon model initialization that will then be used whenever using the model, for instance below we pass `temperature`.
```python
from smolagents import LiteLLMModel
messages = [
{"role": "user", "content": "Hello, how are you?"},
{"role": "assistant", "content": "I'm doing great. How can I help you today?"},
{"role": "user", "content": "No need to help, take it easy."},
]
model = LiteLLMModel("anthropic/claude-3-5-sonnet-latest", temperature=0.2, max_tokens=10)
print(model(messages))
```
[[autodoc]] LiteLLMModel
## Prompts
[[autodoc]] smolagents.agents.PromptTemplates
[[autodoc]] smolagents.agents.PlanningPromptTemplate
[[autodoc]] smolagents.agents.ManagedAgentPromptTemplate
[[autodoc]] smolagents.agents.FinalAnswerPromptTemplate