Improve prompts

parent db64c46e5f
commit 82edb4ed93

README.md | 27
@@ -53,7 +53,19 @@ agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
 agent.run("What time would it take for a leopard at full speed to run through Pont des Arts?")
 ```
 
-> TODO: Add video
+<div class="flex justify-center">
+    <video width="320" height="240" controls>
+        <source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/smolagents/smolagents.mp4" type="video/mp4">
+    </video>
+</div>
+<!-- <img
+    class="block dark:hidden"
+    src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif"
+/>
+<img
+    class="hidden dark:block"
+    src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/Agent_ManimCE.gif"
+/> -->
 
 ## Code agents?
@@ -69,3 +81,16 @@ Especially, since code execution can be a security concern (arbitrary code execu
 We strived to keep abstractions to a strict minimum, with the main code in `agents.py` being roughly 1,000 lines of code while still being quite complete, with several types of agents implemented: `CodeAgent`, which writes its actions in code snippets, and the more classic `ToolCallingAgent`, which leverages built-in tool calling methods.
 
 Many people ask: why use a framework at all? Well, because a big part of this stuff is non-trivial. For instance, the code agent has to keep a consistent format for code throughout its system prompt, its parser, and the execution. So our framework handles this complexity for you. But of course we still encourage you to hack into the source code and use only the bits that you need, to the exclusion of everything else!
+
+## Citing smolagents
+
+If you use `smolagents` in your publication, please cite it by using the following BibTeX entry.
+
+```bibtex
+@Misc{smolagents,
+  title =        {Smolagents: The easiest way to build efficient agentic systems.},
+  author =       {Aymeric Roucher and Thomas Wolf and Leandro von Werra and Erik Kaunismäki},
+  howpublished = {\url{https://github.com/huggingface/smolagents}},
+  year =         {2024}
+}
+```
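To make the two agent flavors concrete, here is a minimal sketch. It assumes `ToolCallingAgent` is importable from the package root like `CodeAgent` in the README snippet above; the tool and task are illustrative.

```py
# Minimal sketch: the same model and tools driven by the two agent types
# described above (assumes ToolCallingAgent is exported at the package root).
from smolagents import CodeAgent, ToolCallingAgent, DuckDuckGoSearchTool, HfApiModel

model = HfApiModel()
code_agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)  # acts via Python snippets
tool_agent = ToolCallingAgent(tools=[DuckDuckGoSearchTool()], model=model)  # acts via structured tool calls

code_agent.run("How long would it take a leopard at full speed to cross Pont des Arts?")
```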
@@ -3,9 +3,11 @@ from smolagents import tool, HfApiModel, TransformersModel, LiteLLMModel
 from typing import Optional
 
 # Choose which LLM engine to use!
-# model = HfApiModel("meta-llama/Llama-3.3-70B-Instruct")
-# model = TransformersModel("meta-llama/Llama-3.2-2B-Instruct")
-model = LiteLLMModel("gpt-4o")
+# model = HfApiModel(model_id="meta-llama/Llama-3.3-70B-Instruct")
+# model = TransformersModel(model_id="meta-llama/Llama-3.2-2B-Instruct")
+
+# For anthropic: change model_id below to 'anthropic/claude-3-5-sonnet-20240620'
+model = LiteLLMModel(model_id="gpt-4o")
 
 @tool
 def get_weather(location: str, celsius: Optional[bool] = False) -> str:
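Since all three engines expose the same calling convention, switching providers is a one-line change. A hedged sketch based on the Anthropic comment in the hunk above (the agent setup and task are illustrative, and an ANTHROPIC_API_KEY is assumed in the environment):

```py
# Sketch: swap the LiteLLM backend by changing model_id only, as the
# "For anthropic" comment above suggests.
from smolagents import CodeAgent, LiteLLMModel

model = LiteLLMModel(model_id="anthropic/claude-3-5-sonnet-20240620")
agent = CodeAgent(tools=[], model=model)
agent.run("Convert 130 km/h to m/s.")
```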
@@ -449,7 +449,7 @@ class MultiStepAgent:
         if additional_args is not None:
             self.state.update(additional_args)
             self.task += f"""
-You have been provided with these additional arguments, that you can access as variables in your python code using the keys:
+You have been provided with these additional arguments, that you can access using the keys as variables in your python code:
 {str(additional_args)}."""
 
         self.initialize_system_prompt()
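This hunk is the model-facing half of `additional_args`: the dict is merged into `self.state` and echoed into the task text, so the code agent can read the keys as Python variables. A hedged caller-side sketch (the argument names mirror the prompt example later in this commit; `agent` is a `CodeAgent` built as in the README):

```py
# Sketch of passing extra state into a run; the keys below become variables
# in the agent's generated Python code.
agent.run(
    "Answer the question in the variable `question` about the image stored in the variable `image`.",
    additional_args={
        "question": "Quel est l'animal sur l'image?",  # French on purpose, per the prompt example
        "image": "path/to/image.jpg",
    },
)
```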
@@ -16,6 +16,7 @@
 # limitations under the License.
 from copy import deepcopy
 from enum import Enum
+import json
 from typing import Dict, List, Optional
 from transformers import (
     AutoTokenizer,
@@ -128,8 +129,7 @@ def get_clean_message_list(
         final_message_list.append(message)
     return final_message_list
 
-
-class HfModel:
+class Model():
     def __init__(self):
         self.last_input_token_count = None
         self.last_output_token_count = None
@@ -181,7 +181,7 @@ class HfModel:
         return remove_stop_sequences(response, stop_sequences)
 
 
-class HfApiModel(HfModel):
+class HfApiModel(Model):
     """A class to interact with Hugging Face's Inference API for language model interaction.
 
     This engine allows you to communicate with Hugging Face's models using the Inference API. It can be used either in serverless mode or with a dedicated endpoint, supporting features like stop sequences and grammar customization.
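Per the docstring above, the engine can also be called directly, outside an agent. A sketch under the assumption that models are callable with an OpenAI-style message list plus optional stop sequences (suggested by `remove_stop_sequences` in the context lines; the exact `__call__` signature is not shown in this diff):

```py
# Hedged sketch: direct call to the Inference API engine with a chat-style
# message list; the call signature is assumed from surrounding context.
from smolagents import HfApiModel

model = HfApiModel(model_id="meta-llama/Llama-3.3-70B-Instruct")
response = model(
    [{"role": "user", "content": "Say hello in one word."}],
    stop_sequences=["<end_code>"],
)
print(response)
```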
@@ -280,7 +280,7 @@ class HfApiModel(HfModel):
         return tool_call.function.name, tool_call.function.arguments, tool_call.id
 
 
-class TransformersModel(HfModel):
+class TransformersModel(Model):
     """This engine initializes a model and tokenizer from the given `model_id`."""
 
     def __init__(self, model_id: Optional[str] = None):
@@ -401,13 +401,12 @@ class TransformersModel(HfModel):
         return tool_name, tool_input, call_id
 
 
-class LiteLLMModel:
+class LiteLLMModel(Model):
     def __init__(self, model_id="anthropic/claude-3-5-sonnet-20240620"):
+        super().__init__()
         self.model_id = model_id
         # IMPORTANT - Set this to TRUE to add the function to the prompt for Non OpenAI LLMs
         litellm.add_function_to_prompt = True
-        self.last_input_token_count = 0
-        self.last_output_token_count = 0
 
     def __call__(
         self,
@@ -451,7 +450,8 @@ class LiteLLMModel:
         tool_calls = response.choices[0].message.tool_calls[0]
         self.last_input_token_count = response.usage.prompt_tokens
         self.last_output_token_count = response.usage.completion_tokens
-        return tool_calls.function.name, tool_calls.function.arguments, tool_calls.id
+        arguments = json.loads(tool_calls.function.arguments)
+        return tool_calls.function.name, arguments, tool_calls.id
 
 
 __all__ = [
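The `json.loads` added above pairs with the new `import json` earlier in this file: OpenAI-compatible APIs, which LiteLLM proxies, return tool-call arguments as a JSON-encoded string, whereas the rest of the codebase expects a dict. A small illustration with a made-up payload:

```py
# Illustration only: tool-call arguments arrive as a JSON string and must be
# decoded before they can be passed to a tool as keyword arguments.
import json

raw_arguments = '{"location": "Paris", "celsius": true}'  # hypothetical payload
arguments = json.loads(raw_arguments)
assert arguments == {"location": "Paris", "celsius": True}
```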
@@ -29,6 +29,12 @@ class Monitor:
         self.total_input_token_count = 0
         self.total_output_token_count = 0
 
+    def get_total_token_counts(self):
+        return {
+            "input": self.total_input_token_count,
+            "output": self.total_output_token_count
+        }
+
     def reset(self):
         self.step_durations = []
         self.total_input_token_count = 0
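A hedged usage sketch for the new accessor; how the `Monitor` instance is constructed is outside this diff, so `monitor` is assumed here:

```py
# Sketch: reading the aggregate token counts added in the hunk above.
counts = monitor.get_total_token_counts()  # `monitor` obtained elsewhere
print(f"input: {counts['input']} tokens, output: {counts['output']} tokens")
```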
@@ -27,7 +27,10 @@ Tools:
 
 Examples:
 ---
-Task: "Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French."
+Task:
+"Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French.
+You have been provided with these additional arguments, that you can access using the keys as variables in your python code:
+{'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}"
 
 Thought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.
 Code:
@@ -254,6 +257,69 @@ result = 5 + 3 + 1294.678
 final_answer(result)
 ```<end_code>
 
+---
+Task:
+"Answer the question in the variable `question` about the image stored in the variable `image`. The question is in French.
+You have been provided with these additional arguments, that you can access using the keys as variables in your python code:
+{'question': 'Quel est l'animal sur l'image?', 'image': 'path/to/image.jpg'}"
+
+Thought: I will use the following tools: `translator` to translate the question into English and then `image_qa` to answer the question on the input image.
+Code:
+```py
+translated_question = translator(question=question, src_lang="French", tgt_lang="English")
+print(f"The translated question is {translated_question}.")
+answer = image_qa(image=image, question=translated_question)
+final_answer(f"The answer is {answer}")
+```<end_code>
+
+---
+Task:
+In a 1979 interview, Stanislaus Ulam discusses with Martin Sherwin about other great physicists of his time, including Oppenheimer.
+What does he say was the consequence of Einstein learning too much math on his creativity, in one word?
+
+Thought: I need to find and read the 1979 interview of Stanislaus Ulam with Martin Sherwin.
+Code:
+```py
+pages = search(query="1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein")
+print(pages)
+```<end_code>
+Observation:
+No result found for query "1979 interview Stanislaus Ulam Martin Sherwin physicists Einstein".
+
+Thought: The query was maybe too restrictive and did not find any results. Let's try again with a broader query.
+Code:
+```py
+pages = search(query="1979 interview Stanislaus Ulam")
+print(pages)
+```<end_code>
+Observation:
+Found 6 pages:
+[Stanislaus Ulam 1979 interview](https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/)
+
+[Ulam discusses Manhattan Project](https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/)
+
+(truncated)
+
+Thought: I will read the first 2 pages to know more.
+Code:
+```py
+for url in ["https://ahf.nuclearmuseum.org/voices/oral-histories/stanislaus-ulams-interview-1979/", "https://ahf.nuclearmuseum.org/manhattan-project/ulam-manhattan-project/"]:
+    whole_page = visit_webpage(url)
+    print(whole_page)
+    print("\n" + "="*80 + "\n")  # Print separator between pages
+```<end_code>
+Observation:
+Manhattan Project Locations:
+Los Alamos, NM
+Stanislaus Ulam was a Polish-American mathematician. He worked on the Manhattan Project at Los Alamos and later helped design the hydrogen bomb. In this interview, he discusses his work at
+(truncated)
+
+Thought: I now have the final answer: from the webpages visited, Stanislaus Ulam says of Einstein: "He learned too much mathematics and sort of diminished, it seems to me personally, it seems to me his purely physics creativity." Let's answer in one word.
+Code:
+```py
+final_answer("diminished")
+```<end_code>
+
 ---
 Task: "Which city has the highest population: Guangzhou or Shanghai?"
 
@@ -285,12 +351,12 @@ pope_age_search = web_search(query="current pope age")
 print("Pope age as per google search:", pope_age_search)
 ```<end_code>
 Observation:
-Pope age: "The pope Francis is currently 85 years old."
+Pope age: "The pope Francis is currently 88 years old."
 
-Thought: I know that the pope is 85 years old. Let's compute the result using python code.
+Thought: I know that the pope is 88 years old. Let's compute the result using python code.
 Code:
 ```py
-pope_current_age = 85 ** 0.36
+pope_current_age = 88 ** 0.36
 final_answer(pope_current_age)
 ```<end_code>
 