On the Tool Manipulation Capability of Open-source Large Language Models

by Qiantong Xu, et al.

Recent studies on software tool manipulation with large language models (LLMs) mostly rely on closed model APIs. Industrial adoption of these models is substantially constrained by the security and robustness risks of exposing information to closed LLM API services. In this paper, we ask whether we can enhance open-source LLMs to be competitive with leading closed LLM APIs in tool manipulation, using a practical amount of human supervision. By analyzing common tool manipulation failures, we first demonstrate that open-source LLMs may require training with usage examples, in-context demonstration, and generation style regulation to resolve these failures. These insights motivate us to revisit classical methods in the LLM literature and to demonstrate that we can adapt them as model alignment with programmatic data generation, system prompts, and in-context demonstration retrievers to enhance open-source LLMs for tool manipulation. To evaluate these techniques, we create ToolBench, a tool manipulation benchmark consisting of diverse software tools for real-world tasks. We demonstrate that our techniques can boost leading open-source LLMs by up to 90% in success rate, showing capabilities competitive with OpenAI GPT-4 in 4 out of 8 ToolBench tasks. We show that such enhancement typically requires about one developer day to curate data for each tool, yielding a recipe with a practical amount of human supervision.
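
To make the combination of a system prompt and an in-context demonstration retriever more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how demonstrations are retrieved, so a simple word-overlap (Jaccard) score stands in for the retriever. The names `Demo`, `retrieve`, `build_prompt`, and `SYSTEM_PROMPT`, and the example API calls in `POOL`, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    goal: str      # natural-language task description (hypothetical example data)
    api_call: str  # tool invocation that solves the task (hypothetical API)


def tokenize(text: str) -> set[str]:
    # Lowercased whitespace tokens; a stand-in for a real text encoder.
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    # Word-overlap similarity in [0, 1]; assumed here, not taken from the paper.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query: str, pool: list[Demo], k: int = 2) -> list[Demo]:
    # Rank demonstrations by similarity of their goal to the query, keep the top k.
    q = tokenize(query)
    ranked = sorted(pool, key=lambda d: jaccard(q, tokenize(d.goal)), reverse=True)
    return ranked[:k]


# A system prompt that regulates generation style (assumed wording).
SYSTEM_PROMPT = "Reply with a single API call and nothing else."


def build_prompt(query: str, pool: list[Demo], k: int = 2) -> str:
    # System prompt first, then retrieved demonstrations, then the new task.
    parts = [SYSTEM_PROMPT]
    for d in retrieve(query, pool, k):
        parts.append(f"Task: {d.goal}\nAction: {d.api_call}")
    parts.append(f"Task: {query}\nAction:")
    return "\n\n".join(parts)


POOL = [
    Demo("get the weather in Paris today", "weather.get(city='Paris')"),
    Demo("book a table for two tonight", "restaurant.book(size=2)"),
    Demo("get the weather forecast for Tokyo", "weather.forecast(city='Tokyo')"),
]
```

Calling `build_prompt("what is the weather in Berlin", POOL)` would place the two weather demonstrations ahead of the new task and leave out the unrelated restaurant example. The resulting string is what would be sent to the model.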



