<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# Tools

[[open-in-colab]]

Here, we're going to see advanced tool usage.

> [!TIP]
> If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).

- [Tools](#tools)
    - [What is a tool, and how to build one?](#what-is-a-tool-and-how-to-build-one)
    - [Share your tool to the Hub](#share-your-tool-to-the-hub)
    - [Import a Space as a tool](#import-a-space-as-a-tool)
    - [Use LangChain tools](#use-langchain-tools)
    - [Manage your agent's toolbox](#manage-your-agents-toolbox)
    - [Use a collection of tools](#use-a-collection-of-tools)

### What is a tool, and how to build one?

A tool is mostly a function that an LLM can use in an agentic system.

But to use it, the LLM needs to be given an API: a name, a tool description, input types and descriptions, and an output type.

So it cannot be only a function: at its core, a tool is a class that wraps a function with metadata that helps the LLM understand how to use it.

Here's how it looks:

```python
from smolagents import Tool

class HFModelDownloadsTool(Tool):
    name = "model_download_counter"
    description = """
    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.
    It returns the name of the checkpoint."""
    inputs = {
        "task": {
            "type": "string",
            "description": "the task category (such as text-classification, depth-estimation, etc)",
        }
    }
    output_type = "string"

    def forward(self, task: str):
        from huggingface_hub import list_models

        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id

model_downloads_tool = HFModelDownloadsTool()
```

The custom tool subclasses [`Tool`] to inherit useful methods. The child class also defines:
- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
- An attribute `description`, which is used to populate the agent's system prompt.
- An `inputs` attribute, which is a dictionary mapping each input name to a dict with keys `"type"` and `"description"`. It contains information that helps the Python interpreter make educated choices about the input.
- An `output_type` attribute, which specifies the output type. The types for both `inputs` and `output_type` should be [Pydantic formats](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema); they can be any of [`~AUTHORIZED_TYPES`].
- A `forward` method which contains the inference code to be executed.

And that's all it needs to be used in an agent!

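If you want a quick sanity check before handing the tool to an agent, you can also call it directly, like a plain function (a minimal sketch; the task name below is just an example, and the call queries the Hugging Face Hub):

```python
# Call the tool directly to check that it behaves as expected
print(model_downloads_tool("text-to-video"))
```
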
There's another way to build a tool. In the [guided_tour](../guided_tour), we implemented a tool using the `@tool` decorator. The [`tool`] decorator is the recommended way to define simple tools, but sometimes you need more than this: for example, using several methods in a class for more clarity, or using additional class attributes.

In this case, you can build your tool by subclassing [`Tool`] as described above.

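For comparison, here is a sketch of how the same download counter could be written with the `@tool` decorator, which builds the tool's name, description, and input schema from the function signature and docstring:

```python
from smolagents import tool

@tool
def model_download_counter(task: str) -> str:
    """
    Returns the most downloaded model of a given task on the Hugging Face Hub.

    Args:
        task: the task category (such as text-classification, depth-estimation, etc)
    """
    from huggingface_hub import list_models

    model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
    return model.id
```
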
### Share your tool to the Hub

You can share your custom tool to the Hub by calling [`~Tool.push_to_hub`] on the tool. Make sure you've created a repository for it on the Hub and are using a token with write access.

```python
model_downloads_tool.push_to_hub("{your_username}/hf-model-downloads", token="<YOUR_HUGGINGFACEHUB_API_TOKEN>")
```

For the push to Hub to work, your tool will need to respect some rules:
- All methods are self-contained, e.g. use only variables that come from their args.
- As per the above point, **all imports should be defined directly within the tool's functions**, else you will get an error when trying to call [`~Tool.save`] or [`~Tool.push_to_hub`] with your custom tool.
- If you override the `__init__` method, you can give it no other argument than `self`. This is because arguments set during a specific tool instance's initialization are hard to track, which prevents sharing them properly to the Hub. And anyway, the idea of making a specific class is that you can already set class attributes for anything you need to hard-code (just set `your_variable=(...)` directly under the `class YourTool(Tool):` line), as shown in the sketch below. And of course you can still create an instance attribute anywhere in your code by assigning values to `self.your_variable`.

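To make these rules concrete, here is a small sketch of a shareable tool (the tool itself is hypothetical, not part of smolagents): its imports live inside `forward`, and its hard-coded configuration sits in a class attribute rather than an `__init__` argument:

```python
from smolagents import Tool

class ModelLikesCounterTool(Tool):
    name = "model_likes_counter"
    description = "Returns the most liked model for a given task on the Hugging Face Hub."
    inputs = {
        "task": {
            "type": "string",
            "description": "the task category (such as text-classification, depth-estimation, etc)",
        }
    }
    output_type = "string"

    # Hard-coded configuration goes into a class attribute, not an __init__ argument
    sort_key = "likes"

    def forward(self, task: str):
        # Imports stay inside the method so the tool remains self-contained when shared
        from huggingface_hub import list_models

        model = next(iter(list_models(filter=task, sort=self.sort_key, direction=-1)))
        return model.id
```
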
Once your tool is pushed to the Hub, you can visualize it. [Here](https://huggingface.co/spaces/m-ric/hf-model-downloads) is the `model_downloads_tool` that I've pushed. It has a nice Gradio interface.

When diving into the tool files, you can find that all the tool's logic is under [tool.py](https://huggingface.co/spaces/m-ric/hf-model-downloads/blob/main/tool.py). That is where you can inspect a tool shared by someone else.

Then you can load the tool with [`load_tool`] or create it with [`~Tool.from_hub`] and pass it to the `tools` parameter in your agent.
Since running tools means running custom code, you need to make sure you trust the repository; that is why we require you to pass `trust_remote_code=True` to load a tool from the Hub.

```python
from smolagents import load_tool, CodeAgent

model_download_tool = load_tool(
    "{your_username}/hf-model-downloads",
    trust_remote_code=True
)
```

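From there, the loaded tool goes into the `tools` list of your agent like any other tool. A minimal sketch, assuming the same Inference-API-backed model used in the examples below:

```python
from smolagents import CodeAgent, HfApiModel

model = HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct")
agent = CodeAgent(tools=[model_download_tool], model=model)

agent.run("Which model has the most downloads for the 'text-to-video' task on the Hugging Face Hub?")
```
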
### Import a Space as a tool

You can directly import a Space from the Hub as a tool using the [`Tool.from_space`] method!

You only need to provide the id of the Space on the Hub, its name, and a description that will help your agent understand what the tool does. Under the hood, this will use the [`gradio-client`](https://pypi.org/project/gradio-client/) library to call the Space.

For instance, let's import the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) Space from the Hub and use it to generate an image.

```python
from smolagents import Tool

image_generation_tool = Tool.from_space(
    "black-forest-labs/FLUX.1-schnell",
    name="image_generator",
    description="Generate an image from a prompt"
)

image_generation_tool("A sunny beach")
```
And voilà, here's your image! 🏖️

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/sunny_beach.webp">

Then you can use this tool just like any other tool. For example, let's improve the prompt `a rabbit wearing a space suit` and generate an image of it. This example also shows how you can pass additional arguments to the agent.

```python
from smolagents import CodeAgent, HfApiModel

model = HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct")
agent = CodeAgent(tools=[image_generation_tool], model=model)

agent.run(
    "Improve this prompt, then generate an image of it.", additional_args={'user_prompt': 'A rabbit wearing a space suit'}
)
```

```text
=== Agent thoughts:
improved_prompt could be "A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background"

Now that I have improved the prompt, I can use the image generator tool to generate an image based on this prompt.
>>> Agent is executing the code below:
image = image_generator(prompt="A bright blue space suit wearing rabbit, on the surface of the moon, under a bright orange sunset, with the Earth visible in the background")
final_answer(image)
```

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit_spacesuit_flux.webp">

How cool is this? 🤩

### Use LangChain tools

We love LangChain and think it has a very compelling suite of tools.
To import a tool from LangChain, use the `from_langchain()` method.

Here is how you can use it to recreate the intro's search result using a LangChain web search tool.
This tool will need `pip install langchain google-search-results -q` to work properly.

```python
from langchain.agents import load_tools
from smolagents import CodeAgent, Tool

search_tool = Tool.from_langchain(load_tools(["serpapi"])[0])

agent = CodeAgent(tools=[search_tool], model=model)

agent.run("How many more blocks (also denoted as layers) are in BERT base encoder compared to the encoder from the architecture proposed in Attention is All You Need?")
```

### Manage your agent's toolbox

You can manage an agent's toolbox by adding or replacing a tool in the attribute `agent.tools`, since it is a standard dictionary.

Let's add the `model_download_tool` to an existing agent initialized with only the default toolbox.

```python
from smolagents import CodeAgent, HfApiModel

model = HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct")

agent = CodeAgent(tools=[], model=model, add_base_tools=True)
agent.tools[model_download_tool.name] = model_download_tool
```
Now we can leverage the new tool:

```python
agent.run(
    "Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub but reverse the letters?"
)
```

> [!TIP]
> Beware of adding too many tools to an agent: this can overwhelm weaker LLM engines.

### Use a collection of tools

You can leverage tool collections by using the `ToolCollection` object. It supports loading either a collection from the Hub or tools from an MCP server.

#### Tool Collection from a collection in the Hub

You can load a collection with the slug of the collection you want to use.
Then pass the collection's tools as a list to initialize your agent, and start using them!

```py
from smolagents import ToolCollection, CodeAgent

image_tool_collection = ToolCollection.from_hub(
    collection_slug="huggingface-tools/diffusion-tools-6630bb19a942c2306a2cdb6f",
    token="<YOUR_HUGGINGFACEHUB_API_TOKEN>"
)
agent = CodeAgent(tools=[*image_tool_collection.tools], model=model, add_base_tools=True)

agent.run("Please draw me a picture of rivers and lakes.")
```

To speed up the start, tools are loaded only if called by the agent.

#### Tool Collection from any MCP server

Leverage tools from the hundreds of MCP servers available on [glama.ai](https://glama.ai/mcp/servers) or [smithery.ai](https://smithery.ai/).

The tools from MCP servers can be loaded in a `ToolCollection` object as follows:

```py
import os

from smolagents import ToolCollection, CodeAgent
from mcp import StdioServerParameters

server_parameters = StdioServerParameters(
    command="uv",
    args=["--quiet", "pubmedmcp@0.1.3"],
    env={"UV_PYTHON": "3.12", **os.environ},
)

with ToolCollection.from_mcp(server_parameters) as tool_collection:
    # Reuse the model defined in the earlier examples
    agent = CodeAgent(tools=[*tool_collection.tools], model=model, add_base_tools=True)
    agent.run("Please find a remedy for hangover.")
```