From 5991206ae52b1d6d485ffeb918d765c11071cdd1 Mon Sep 17 00:00:00 2001
From: Aymeric
Date: Tue, 31 Dec 2024 20:10:17 +0100
Subject: [PATCH] Clarify readme about benchmark

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b488e82..80468c5 100644
--- a/README.md
+++ b/README.md
@@ -75,7 +75,7 @@ By the way, why use a framework at all? Well, because a big part of this stuff i
 We've created [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) instances with some leading models, and compared them on [this benchmark](https://huggingface.co/datasets/m-ric/agents_medium_benchmark_2) that gathers questions from a few different benchmarks to propose a varied blend of challenges.
 
-[Find the benchmarking code here](https://github.com/huggingface/smolagents/blob/main/examples/benchmark.ipynb) for more detail on the agentic setup used, and see a comparison of code agents versus tool calling agents (spoilers: code works better).
+[Find the benchmarking code here](https://github.com/huggingface/smolagents/blob/main/examples/benchmark.ipynb) for more detail on the agentic setup used, and see a comparison of using LLMs as code agents versus vanilla LLM calls (spoilers: code agents work better).