From 396d5b69529e16b8c653f601a933487b6d05dbfc Mon Sep 17 00:00:00 2001
From: Izaak Curry <98251797+ScientistIzaak@users.noreply.github.com>
Date: Wed, 1 Jan 2025 21:26:15 -0800
Subject: [PATCH 1/4] Update building_good_agents.md

Fixed minor spelling errors.
---
 docs/source/en/tutorials/building_good_agents.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/source/en/tutorials/building_good_agents.md b/docs/source/en/tutorials/building_good_agents.md
index b1df02c..6f128f6 100644
--- a/docs/source/en/tutorials/building_good_agents.md
+++ b/docs/source/en/tutorials/building_good_agents.md
@@ -26,12 +26,12 @@ In this guide, we're going to see best practices for building agents.
 
 ### The best agentic systems are the simplest: simplify the workflow as much as you can
 
-Giving an LLM some agency in your workflow introducessome risk of errors.
+Giving an LLM some agency in your workflow introduces some risk of errors.
 
-Well-programmed agentic systems have good error logging and retry mechanisms anyway, so the LLM engine has a chance to self-correct their mistake. But to reduce the risk of LLM error to the maximum, you should simplify your worklow!
+Well-programmed agentic systems have good error logging and retry mechanisms anyway, so the LLM engine has a chance to self-correct its mistakes. But to reduce the risk of LLM error to the maximum, you should simplify your workflow!
 
 Let's take again the example from [intro_agents]: a bot that answers user queries on a surf trip company.
-Instead of letting the agent do 2 different calls for "travel distance API" and "weather API" each time they are asked about a new surf spot, you could just make one unified tool "return_spot_information", a functions that calls both APIs at once and returns their concatenated outputs to the user.
+Instead of letting the agent do 2 different calls for "travel distance API" and "weather API" each time it is asked about a new surf spot, you could just make one unified tool "return_spot_information", a function that calls both APIs at once and returns their concatenated outputs to the user.
 
 This will reduce costs, latency, and error risk!
 
@@ -168,7 +168,7 @@ Final answer:
 /var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpx09qfsdd/652f0007-3ee9-44e2-94ac-90dae6bb89a4.png
 ```
 The user sees, instead of an image being returned, a path being returned to them.
-It could look like a bug from the system, but actually the agentic system didn't cause the error: it's just that the LLM engine tid the mistake of not saving the image output into a variable.
+It could look like a bug from the system, but actually the agentic system didn't cause the error: it's just that the LLM engine made the mistake of not saving the image output into a variable.
 Thus it cannot access the image again except by leveraging the path that was logged while saving the image, so it returns the path instead of an image.
 The first step to debugging your agent is thus "Use a more powerful LLM". Alternatives like `Qwen2.5-72B-Instruct` wouldn't have made that mistake.
 
@@ -179,7 +179,7 @@ Then you can also use less powerful models but guide them better.
 
 Put yourself in the shoes if your model: if you were the model solving the task, would you struggle with the information available to you (from the system prompt + task formulation + tool description) ?
 
-Would you need some added claritications ?
+Would you need some added clarifications?
 
 To provide extra information, we do not recommend changing the system prompt right away: the default system prompt has many adjustments that you do not want to mess up unless you understand the prompt very well.
 Better ways to guide your LLM engine are:
@@ -217,4 +217,4 @@ agent = CodeAgent(
 result = agent.run(
     "How long would a cheetah at full speed take to run the length of Pont Alexandre III?",
 )
-```
\ No newline at end of file
+```

From 8760f50f8edbf07cdb2cd18a35e90a6dedfc8ad5 Mon Sep 17 00:00:00 2001
From: Georgios Balikas
Date: Thu, 2 Jan 2025 14:14:42 +0100
Subject: [PATCH 2/4] fix HfApiModel usage example

---
 src/smolagents/models.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/smolagents/models.py b/src/smolagents/models.py
index 6fc8dbb..06b54df 100644
--- a/src/smolagents/models.py
+++ b/src/smolagents/models.py
@@ -206,12 +206,11 @@ class HfApiModel(Model):
     Example:
     ```python
     >>> engine = HfApiModel(
-    ...     model="Qwen/Qwen2.5-Coder-32B-Instruct",
+    ...     model_id="Qwen/Qwen2.5-Coder-32B-Instruct",
     ...     token="your_hf_token_here",
-    ...     max_tokens=2000
     ... )
     >>> messages = [{"role": "user", "content": "Explain quantum mechanics in simple terms."}]
-    >>> response = engine(messages, stop_sequences=["END"])
+    >>> response = engine(messages, stop_sequences=["END"], max_tokens=1500)
     >>> print(response)
     "Quantum mechanics is the branch of physics that studies..."
     ```

From 0c31f41536e4d26a1bf8c0628230f928957c4ae2 Mon Sep 17 00:00:00 2001
From: Georgios Balikas
Date: Thu, 2 Jan 2025 14:15:44 +0100
Subject: [PATCH 3/4] fix docstring

---
 src/smolagents/models.py | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/smolagents/models.py b/src/smolagents/models.py
index 06b54df..ccee8c3 100644
--- a/src/smolagents/models.py
+++ b/src/smolagents/models.py
@@ -188,14 +188,12 @@ class HfApiModel(Model):
     This engine allows you to communicate with Hugging Face's models using the Inference API. It can be used in serverless mode or with a dedicated endpoint, supporting features like stop sequences and grammar customization.
 
     Parameters:
-        model (`str`, *optional*, defaults to `"Qwen/Qwen2.5-Coder-32B-Instruct"`):
+        model_id (`str`, *optional*, defaults to `"Qwen/Qwen2.5-Coder-32B-Instruct"`):
             The Hugging Face model ID to be used for inference. This can be a path or model identifier from the Hugging Face model hub.
         token (`str`, *optional*):
             Token used by the Hugging Face API for authentication. This token needs to be authorized 'Make calls to the serverless Inference API'.
             If the model is gated (like Llama-3 models), the token also needs 'Read access to contents of all public gated repos you can access'.
             If not provided, the class will try to use environment variable 'HF_TOKEN', else use the token stored in the Hugging Face CLI configuration.
-        max_tokens (`int`, *optional*, defaults to 1500):
-            The maximum number of tokens allowed in the output.
         timeout (`int`, *optional*, defaults to 120):
             Timeout for the API request, in seconds.
 

From 3f9cdfd04d1c252e83d8505628bb207cdda5975d Mon Sep 17 00:00:00 2001
From: Shubham Kumar <37694707+SHUBH4M-KUMAR@users.noreply.github.com>
Date: Fri, 3 Jan 2025 01:00:21 +0530
Subject: [PATCH 4/4] Update building_good_agents.md

Fix typo: replaced "if" with "of".
---
 docs/source/en/tutorials/building_good_agents.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/en/tutorials/building_good_agents.md b/docs/source/en/tutorials/building_good_agents.md
index b1df02c..d4b152b 100644
--- a/docs/source/en/tutorials/building_good_agents.md
+++ b/docs/source/en/tutorials/building_good_agents.md
@@ -177,7 +177,7 @@ The first step to debugging your agent is thus "Use a more powerful LLM". Altern
 
 Then you can also use less powerful models but guide them better.
 
-Put yourself in the shoes if your model: if you were the model solving the task, would you struggle with the information available to you (from the system prompt + task formulation + tool description) ?
+Put yourself in the shoes of your model: if you were the model solving the task, would you struggle with the information available to you (from the system prompt + task formulation + tool description) ?
 
 Would you need some added claritications ?
 
@@ -217,4 +217,4 @@ agent = CodeAgent(
 result = agent.run(
     "How long would a cheetah at full speed take to run the length of Pont Alexandre III?",
 )
-```
\ No newline at end of file
+```
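
The "one unified tool" advice that patch 1/4 touches is easier to see in code. Here is a minimal sketch using smolagents' `@tool` decorator; the helper functions `get_travel_distance` and `get_weather` are hypothetical stand-ins for the travel distance and weather APIs mentioned in the doc, not real endpoints.

```python
from smolagents import tool


def get_travel_distance(spot_name: str) -> str:
    # Hypothetical stand-in for the real travel distance API call.
    return f"Travel distance to {spot_name}: 120 km"


def get_weather(spot_name: str) -> str:
    # Hypothetical stand-in for the real weather API call.
    return f"Weather at {spot_name}: 1.5 m swell, light offshore wind"


@tool
def return_spot_information(spot_name: str) -> str:
    """Returns travel distance and weather information for a surf spot.

    Args:
        spot_name: Name of the surf spot to look up.
    """
    # Calling both APIs inside one tool means the LLM issues a single tool
    # call instead of two, which reduces cost, latency, and error risk.
    return f"{get_travel_distance(spot_name)}\n{get_weather(spot_name)}"
```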
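
Patches 2/4 and 3/4 together align the `HfApiModel` docstring with the actual interface: the constructor takes `model_id` rather than `model`, and `max_tokens` is passed per call rather than at construction. A quick sketch of the corrected usage, mirroring the updated docstring example (the token value is a placeholder):

```python
from smolagents import HfApiModel

# `model_id`, not `model`, selects the checkpoint; if `token` is omitted, the
# class falls back to the HF_TOKEN environment variable or the CLI login.
engine = HfApiModel(
    model_id="Qwen/Qwen2.5-Coder-32B-Instruct",
    token="your_hf_token_here",
)

messages = [{"role": "user", "content": "Explain quantum mechanics in simple terms."}]
# Generation limits such as `max_tokens` now go with the call, not the constructor.
response = engine(messages, stop_sequences=["END"], max_tokens=1500)
print(response)
```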