226 lines
10 KiB
Markdown
226 lines
10 KiB
Markdown
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
|
|
|
|
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
|
the License. You may obtain a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
|
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
|
specific language governing permissions and limitations under the License.
|
|
|
|
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
|
|
rendered properly in your Markdown viewer.
|
|
|
|
-->
|
|
# Tools
|
|
|
|
[[open-in-colab]]
|
|
|
|
Here, we're going to see advanced tool usage.
|
|
|
|
> [!TIP]
|
|
> If you're new to building agents, make sure to first read the [intro to agents](./intro_agents) and the [guided tour of smolagents](../guided_tour).
|
|
|
|
### Directly define a tool by subclassing Tool
|
|
|
|
Let's take again the tool example from the [quicktour](../quicktour), for which we had implemented a `@tool` decorator. The `tool` decorator is the standard format, but sometimes you need more: use several methods in a class for more clarity, or using additional class attributes.
|
|
|
|
In this case, you can build your tool following the fine-grained method: building a class that inherits from the [`Tool`] superclass.
|
|
|
|
The custom tool needs:
|
|
- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
|
|
- An attribute `description` is used to populate the agent's system prompt.
|
|
- An `inputs` attribute, which is a dictionary with keys `"type"` and `"description"`. It contains information that helps the Python interpreter make educated choices about the input.
|
|
- An `output_type` attribute, which specifies the output type.
|
|
- A `forward` method which contains the inference code to be executed.
|
|
|
|
The types for both `inputs` and `output_type` should be amongst [Pydantic formats](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema), they can be either of these: [`~AUTHORIZED_TYPES`].
|
|
|
|
Also, all imports should be put within the tool's forward function, else you will get an error.
|
|
|
|
```python
|
|
from smolagents import Tool
|
|
|
|
class HFModelDownloadsTool(Tool):
|
|
name = "model_download_counter"
|
|
description = """
|
|
This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.
|
|
It returns the name of the checkpoint."""
|
|
|
|
inputs = {
|
|
"task": {
|
|
"type": "string",
|
|
"description": "the task category (such as text-classification, depth-estimation, etc)",
|
|
}
|
|
}
|
|
output_type = "string"
|
|
|
|
def forward(self, task: str):
|
|
from huggingface_hub import list_models
|
|
|
|
model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
|
|
return model.id
|
|
tool = HFModelDownloadsTool()
|
|
```
|
|
|
|
Now the custom `HfModelDownloadsTool` class is ready.
|
|
|
|
### Share your tool to the Hub
|
|
|
|
You can also share your custom tool to the Hub by calling [`~Tool.push_to_hub`] on the tool. Make sure you've created a repository for it on the Hub and are using a token with read access.
|
|
|
|
```python
|
|
tool.push_to_hub("{your_username}/hf-model-downloads", token="<YOUR_HUGGINGFACEHUB_API_TOKEN>")
|
|
```
|
|
|
|
For the push to Hub to work, your tool will need to respect some rules:
|
|
- All method are self-contained, e.g. use variables that come either from their args,
|
|
- If you subclass the `__init__` method, you can give it no other argument than `self`. This is because arguments set during a specific tool instance's initialization are hard to track, which prevents from sharing them properly to the hub. And anyway, the idea of making a specific class is that you can already set class attributes for anything you need to hard-code (just set `your_variable=(...)` directly under the `class YourTool(Tool):` line). And of course you can still create a class attribute anywhere in your code by assigning stuff to `self.your_variable`.
|
|
|
|
Once your tool is pushed to Hub, you can load it with the [`~Tool.load_tool`] function and pass it to the `tools` parameter in your agent.
|
|
Since running tools means running custom code, you need to make sure you trust the repository, and pass `trust_remote_code=True`.
|
|
|
|
```python
|
|
from smolagents import load_tool, CodeAgent
|
|
|
|
model_download_tool = load_tool(
|
|
"{your_username}/hf-model-downloads",
|
|
trust_remote_code=True
|
|
)
|
|
```
|
|
|
|
### Import a Space as a tool
|
|
|
|
You can directly import a Space from the Hub as a tool using the [`Tool.from_space`] method!
|
|
|
|
You only need to provide the id of the Space on the Hub, its name, and a description that will help you agent understand what the tool does. Under the hood, this will use [`gradio-client`](https://pypi.org/project/gradio-client/) library to call the Space.
|
|
|
|
For instance, let's import the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) Space from the Hub and use it to generate an image.
|
|
|
|
```python
|
|
image_generation_tool = Tool.from_space(
|
|
"black-forest-labs/FLUX.1-schnell",
|
|
name="image_generator",
|
|
description="Generate an image from a prompt"
|
|
)
|
|
|
|
image_generation_tool("A sunny beach")
|
|
```
|
|
And voilà, here's your image! 🏖️
|
|
|
|
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp">
|
|
|
|
Then you can use this tool just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit` and generate an image of it.
|
|
|
|
```python
|
|
from smolagents import CodeAgent, HfApiEngine
|
|
|
|
llm_engine = HfApiEngine("Qwen/Qwen2.5-Coder-32B-Instruct")
|
|
agent = CodeAgent(tools=[image_generation_tool], llm_engine=llm_engine)
|
|
|
|
agent.run(
|
|
"Improve this prompt, then generate an image of it.", prompt='A rabbit wearing a space suit'
|
|
)
|
|
```
|
|
|
|
```text
|
|
=== Agent thoughts:
|
|
improved_prompt could be "A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background"
|
|
|
|
Now that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.
|
|
>>> Agent is executing the code below:
|
|
image = image_generator(prompt="A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background")
|
|
final_answer(image)
|
|
```
|
|
|
|
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp">
|
|
|
|
How cool is this? 🤩
|
|
|
|
### Use gradio-tools
|
|
|
|
[gradio-tools](https://github.com/freddyaboulton/gradio-tools) is a powerful library that allows using Hugging
|
|
Face Spaces as tools. It supports many existing Spaces as well as custom Spaces.
|
|
|
|
Transformers supports `gradio_tools` with the [`Tool.from_gradio`] method. For example, let's use the [`StableDiffusionPromptGeneratorTool`](https://github.com/freddyaboulton/gradio-tools/blob/main/gradio_tools/tools/prompt_generator.py) from `gradio-tools` toolkit for improving prompts to generate better images.
|
|
|
|
Import and instantiate the tool, then pass it to the `Tool.from_gradio` method:
|
|
|
|
```python
|
|
from gradio_tools import StableDiffusionPromptGeneratorTool
|
|
|
|
gradio_prompt_generator_tool = StableDiffusionPromptGeneratorTool()
|
|
prompt_generator_tool = Tool.from_gradio(gradio_prompt_generator_tool)
|
|
```
|
|
|
|
> [!WARNING]
|
|
> gradio-tools require *textual* inputs and outputs even when working with different modalities like image and audio objects. Image and audio inputs and outputs are currently incompatible.
|
|
|
|
### Use LangChain tools
|
|
|
|
We love Langchain and think it has a very compelling suite of tools.
|
|
To import a tool from LangChain, use the `from_langchain()` method.
|
|
|
|
Here is how you can use it to recreate the intro's search result using a LangChain web search tool.
|
|
This tool will need `pip install langchain google-search-results -q` to work properly.
|
|
```python
|
|
from langchain.agents import load_tools
|
|
|
|
search_tool = Tool.from_langchain(load_tools(["serpapi"])[0])
|
|
|
|
agent = CodeAgent(tools=[search_tool], llm_engine=llm_engine)
|
|
|
|
agent.run("How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?")
|
|
```
|
|
|
|
### Manage your agent's toolbox
|
|
|
|
You can manage an agent's toolbox by adding or replacing a tool.
|
|
|
|
Let's add the `model_download_tool` to an existing agent initialized with only the default toolbox.
|
|
|
|
```python
|
|
from smolagents import HfApiEngine
|
|
|
|
llm_engine = HfApiEngine("Qwen/Qwen2.5-Coder-32B-Instruct")
|
|
|
|
agent = CodeAgent(tools=[], llm_engine=llm_engine, add_base_tools=True)
|
|
agent.toolbox.add_tool(model_download_tool)
|
|
```
|
|
Now we can leverage the new tool:
|
|
|
|
```python
|
|
agent.run(
|
|
"Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub but reverse the letters?"
|
|
)
|
|
```
|
|
|
|
|
|
> [!TIP]
|
|
> Beware of not adding too many tools to an agent: this can overwhelm weaker LLM engines.
|
|
|
|
|
|
Use the `agent.toolbox.update_tool()` method to replace an existing tool in the agent's toolbox.
|
|
This is useful if your new tool is a one-to-one replacement of the existing tool because the agent already knows how to perform that specific task.
|
|
Just make sure the new tool follows the same API as the replaced tool or adapt the system prompt template to ensure all examples using the replaced tool are updated.
|
|
|
|
|
|
### Use a collection of tools
|
|
|
|
You can leverage tool collections by using the ToolCollection object, with the slug of the collection you want to use.
|
|
Then pass them as a list to initialize you agent, and start using them!
|
|
|
|
```py
|
|
from transformers import ToolCollection, CodeAgent
|
|
|
|
image_tool_collection = ToolCollection(
|
|
collection_slug="huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f",
|
|
token="<YOUR_HUGGINGFACEHUB_API_TOKEN>"
|
|
)
|
|
agent = CodeAgent(tools=[*image_tool_collection.tools], llm_engine=llm_engine, add_base_tools=True)
|
|
|
|
agent.run("Please draw me a picture of rivers and lakes.")
|
|
```
|
|
|
|
To speed up the start, tools are loaded only if called by the agent. |