Clean benchmark

2024-12-31 18:30:11 +01:00 · 8646697c73
parent ea77a7716b
commit 8646697c73
2 changed files with 4 additions and 81714 deletions
--- a/README.md
+++ b/README.md
@@ -75,7 +75,7 @@ By the way, why use a framework at all? Well, because a big part of this stuff i

 We've created [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) instances with some leading models, and compared them on [this benchmark](https://huggingface.co/datasets/m-ric/agents_medium_benchmark_2) that gathers questions from a few different benchmarks to propose a varied blend of challenges.

-[Find the benchmark here](https://github.com/huggingface/smolagents/blob/main/examples/benchmark.ipynb) for more detail on the agentic setup used, and see a comparison of code agents versus tool calling agents (spoilers: code works better).
+[Find the benchmarking code here](https://github.com/huggingface/smolagents/blob/main/examples/benchmark.ipynb) for more detail on the agentic setup used, and see a comparison of code agents versus tool calling agents (spoilers: code works better).

 <p align="center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/smolagents/benchmark_code_agents.png" alt="benchmark of different models on agentic workflows" width=70%>
--- a/examples/benchmark.ipynb
+++ b/examples/benchmark.ipynb