smolagents/building_agents.md at 0a0402d090c87fb816b445c7f4be2900fdc9811f

The best agentic systems are the simplest: simplify the workflow as much as you can

Giving an LLM some agency in your workflow introduces some risk of errors.

Well-programmed agentic systems have good error logging and retry mechanisms anyway, so the LLM engine has a chance to self-correct its mistakes. But to reduce the risk of LLM error as much as possible, you should simplify your workflow!

Let's revisit the example from [intro_agents]: a bot that answers user queries for a surf trip company. Instead of letting the agent make two separate calls to a "travel distance API" and a "weather API" each time it is asked about a new surf spot, you could provide one unified tool, "return_spot_information": a function that calls both APIs at once and returns their concatenated outputs to the user.

This will reduce costs, latency, and error risk!
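As a minimal sketch of that grouping idea: the two API functions below are hypothetical stand-ins returning dummy values, not real services, and only the name `return_spot_information` comes from the text above. In smolagents you would additionally register the unified function as a tool.

```python
# Hypothetical stand-ins for the two external APIs (dummy data, not real services).
def get_travel_distance_km(spot: str) -> float:
    return {"Anchor Point": 18.5}.get(spot, 0.0)


def get_weather_summary(spot: str) -> str:
    return "sunny, 1.2 m waves"


def return_spot_information(spot: str) -> str:
    """Call both APIs once and return their concatenated outputs."""
    distance_km = get_travel_distance_km(spot)
    weather = get_weather_summary(spot)
    return f"{spot}: travel distance {distance_km} km; weather: {weather}."


spot_info = return_spot_information("Anchor Point")
print(spot_info)
```

With a single tool, the agent needs one call (and one step) per surf spot instead of two, so there are fewer places where it can pick the wrong tool or pass the wrong arguments.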

So our first actionable takeaway is: group tools whenever possible.

Improve the information flow to the LLM engine

Remember that your LLM engine is like an ~intelligent~ robot, trapped in a room, whose only communication with the outside world is notes passed under a door.

It won't know of anything that happened if you don't explicitly put that into its prompt.

A CodeAgent that uses variables cannot access any variable that was not saved into its state. For instance, check out this agent trace for an LLM that I asked to make me a car picture:

==================================================================================================== New task ====================================================================================================
Make me a cool car picture
──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────
Agent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
image_generator(prompt="A cool, futuristic sports car with LED headlights, aerodynamic design, and vibrant color, high-res, photorealistic")
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Last output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png
Step 1:

- Time taken: 16.35 seconds
- Input tokens: 1,383
- Output tokens: 77
──────────────────────────────────────────────────────────────────────────────────────────────────── New step ────────────────────────────────────────────────────────────────────────────────────────────────────
Agent is executing the code below: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
final_answer("/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png")
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Print outputs:

Last output from code snippet: ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png
Final answer:
/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png

The LLM never explicitly saved the image output into a variable, so it cannot access it again except by leveraging the path that was logged while saving the image. So properly logging the code execution made a big difference!
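The effect of saving (or not saving) a result into state can be sketched in a few lines. This is only an illustration under stated assumptions: `run_step` and the dummy `image_generator` are hypothetical, and a shared namespace passed to `exec` stands in for the agent's state; it is not how smolagents actually executes code.

```python
# Hypothetical stand-in for an image generation tool: returns a dummy file path.
def image_generator(prompt: str) -> str:
    return "/tmp/generated_image.png"


def run_step(code: str, state: dict) -> None:
    """Execute one step; only names assigned inside `code` persist in `state`."""
    exec(code, state)


state = {"image_generator": image_generator}

# Step 1a: the result is computed but never bound to a name, so it is lost.
run_step('image_generator(prompt="A cool car")', state)
lost = "image_path" not in state

# Step 1b: binding the result to a variable stores it in the shared state.
run_step('image_path = image_generator(prompt="A cool car")', state)

# Step 2: a later step can now reuse the variable directly.
run_step('answer = f"Image saved at {image_path}"', state)
print(state["answer"])
```

In the trace above, the agent was in the "step 1a" situation and could only recover because the path had been logged.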

Particular guidelines to follow:

Each tool should log (by simply using print statements inside the tool's forward method) everything that could be useful for the LLM engine.
- In particular, logging detail on tool execution errors would help a lot!

For instance, here's a tool that retrieves weather data based on location and date-time:

First, here's a poor version:

from datetime import datetime

from smolagents import tool
from my_weather_api import convert_location_to_coordinates, get_weather_report_at_coordinates
# Let's say "get_weather_report_at_coordinates" returns a list of [temperature in °C, risk of rain on a scale 0-1, wave height in m]

@tool
def get_weather_api(location: str, date_time: str) -> str:
    """
    Returns the weather report.

    Args:
        - location (`str`): the name of the place that you want the weather for.
        - date_time (`str`): the date and time for which you want the report.
    """
    lon, lat = convert_location_to_coordinates(location)
    # The expected format is only visible here in the code, never in the docstring
    date_time = datetime.strptime(date_time, "%m/%d/%y %H:%M:%S")
    return str(get_weather_report_at_coordinates((lon, lat), date_time))

Why is it bad?

- there's no indication of the format that should be used for date_time
- there's no detail on how location should be specified
- there's no logging mechanism that makes failure cases explicit, such as location not being in a proper format or date_time not being properly formatted
- the output format is hard to understand

If the tool call fails, the error trace logged in memory can help the LLM reverse engineer the tool to fix the errors. But why leave it with so much heavy lifting to do?

Here's a better way to build this tool:

from datetime import datetime

from smolagents import tool
from my_weather_api import convert_location_to_coordinates, get_weather_report_at_coordinates
# Let's say "get_weather_report_at_coordinates" returns a list of [temperature in °C, risk of rain on a scale 0-1, wave height in m]

@tool
def get_weather_api(location: str, date_time: str) -> str:
    """
    Returns the weather report.

    Args:
        - location (`str`): the name of the place that you want the weather for. Should be a place name, followed by possibly a city name, then a country, like "Anchor Point, Taghazout, Morocco".
        - date_time (`str`): the date and time for which you want the report, formatted as '%m/%d/%y %H:%M:%S'.
    """
    lon, lat = convert_location_to_coordinates(location)
    try:
        # Parse with the same format that the docstring advertises
        date_time = datetime.strptime(date_time, "%m/%d/%y %H:%M:%S")
    except Exception as e:
        raise ValueError("Conversion of `date_time` to datetime format failed, make sure to provide a string in format '%m/%d/%y %H:%M:%S'. Full trace:" + str(e))
    temperature_celsius, risk_of_rain, wave_height = get_weather_report_at_coordinates((lon, lat), date_time)
    return f"Weather report for {location}, {date_time}: Temperature will be {temperature_celsius}°C, risk of rain is {risk_of_rain*100:.0f}%, wave height is {wave_height}m."
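To see what the LLM engine would actually read on success and on failure, here is a self-contained sketch of the improved tool. It assumes dummy stubs in place of `my_weather_api` (the returned values are made up), adds a hypothetical `DATE_FORMAT` constant, and leaves out the `@tool` decorator so that it runs without smolagents installed.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%y %H:%M:%S"


# Hypothetical stubs standing in for `my_weather_api`; all values are dummy data.
def convert_location_to_coordinates(location: str):
    return (3.3, -42.0)


def get_weather_report_at_coordinates(coordinates, date_time):
    # [temperature in °C, risk of rain on a scale 0-1, wave height in m]
    return [28.0, 0.35, 0.85]


def get_weather_api(location: str, date_time: str) -> str:
    # Print statements end up in the execution logs that the LLM can read.
    print(f"Fetching weather for {location!r} at {date_time!r}")
    lon, lat = convert_location_to_coordinates(location)
    try:
        parsed = datetime.strptime(date_time, DATE_FORMAT)
    except ValueError as e:
        raise ValueError(
            "Conversion of `date_time` to datetime format failed, make sure to "
            f"provide a string in format '{DATE_FORMAT}'. Full trace: {e}"
        ) from e
    temperature_celsius, risk_of_rain, wave_height = get_weather_report_at_coordinates(
        (lon, lat), parsed
    )
    return (
        f"Weather report for {location}, {parsed}: "
        f"Temperature will be {temperature_celsius}°C, "
        f"risk of rain is {risk_of_rain * 100:.0f}%, wave height is {wave_height}m."
    )


# A correctly formatted call returns a readable report.
good = get_weather_api("Anchor Point, Taghazout, Morocco", "01/15/25 08:30:00")
print(good)

# A badly formatted date produces an error that states the expected format.
error_message = ""
try:
    get_weather_api("Anchor Point, Taghazout, Morocco", "2025-01-15 08:30")
except ValueError as error:
    error_message = str(error)
    print(error_message)
```

The error message repeats the expected format, so the agent can fix its call on the next step without having to guess.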
