Clean benchmark
parent ea77a7716b
commit 8646697c73
@@ -75,7 +75,7 @@ By the way, why use a framework at all? Well, because a big part of this stuff i
 
 We've created [`CodeAgent`](https://huggingface.co/docs/smolagents/reference/agents#smolagents.CodeAgent) instances with some leading models, and compared them on [this benchmark](https://huggingface.co/datasets/m-ric/agents_medium_benchmark_2) that gathers questions from a few different benchmarks to propose a varied blend of challenges.
 
-[Find the benchmark here](https://github.com/huggingface/smolagents/blob/main/examples/benchmark.ipynb) for more detail on the agentic setup used, and see a comparison of code agents versus tool calling agents (spoilers: code works better).
+[Find the benchmarking code here](https://github.com/huggingface/smolagents/blob/main/examples/benchmark.ipynb) for more detail on the agentic setup used, and see a comparison of code agents versus tool calling agents (spoilers: code works better).
 
 <p align="center">
 <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/smolagents/benchmark_code_agents.png" alt="benchmark of different models on agentic workflows" width=70%>
81716
examples/benchmark.ipynb
File diff suppressed because it is too large