CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation

by Cheng Qian, et al.

Large Language Models (LLMs) have demonstrated significant progress in utilizing external APIs as tools for various tasks. However, their tool-using ability is limited by the availability of suitable APIs and the instability of implicit reasoning, particularly when simultaneously engaging in reasoning about plans and actual calculations. To address these limitations, we propose CREATOR, a novel framework that empowers LLMs to create their own tools through documentation and code realization. CREATOR disentangles the LLM's ability into two distinct phases: abstract tool creation and concrete decision execution, which results in improved LLM performance. We evaluate CREATOR on two established benchmarks: MATH, which consists of challenging math competition problems, and TabMWP, which includes diverse tabular contents for problem-solving. Remarkably, CREATOR significantly outperforms existing chain-of-thought (CoT), program-of-thought (PoT), and tool-using baselines on these two benchmarks. Additionally, we present a new dataset, Creation Challenge, comprising 2K diverse questions, to highlight the necessity and benefits of LLMs' tool creation ability in effectively addressing these problems. Furthermore, our research reveals that leveraging LLMs as tool creators facilitates knowledge transfer, and LLMs exhibit varying levels of tool creation abilities, enabling them to flexibly tackle diverse situations. Our study represents a promising avenue for maximizing the potential of LLMs and advancing toward truly intelligent and adaptable AI systems.
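
The separation between abstract tool creation and concrete decision execution can be illustrated with a minimal sketch. This is not the authors' implementation: the tool source is hard-coded where the framework would use LLM output, and the names `TOOL_SOURCE`, `create_tool`, `decide_and_execute`, and `solve_linear` are hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict

# Phase 1 (abstract): documentation plus code for a general-purpose tool.
# Assumption: in the framework this text would come from an LLM; it is
# hard-coded here so the sketch runs without any model access.
TOOL_SOURCE = '''
def solve_linear(a: float, b: float) -> float:
    """Return x such that a * x + b == 0 (requires a != 0)."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return -b / a
'''


def create_tool(source: str, name: str) -> Callable:
    """Turn tool source (docstring + implementation) into a callable."""
    namespace: Dict[str, object] = {}
    # Model-generated code should only ever be executed inside a sandbox.
    exec(source, namespace)
    return namespace[name]


def decide_and_execute(tool: Callable, a: float, b: float) -> float:
    """Phase 2 (concrete): bind the problem's numbers to the tool and run it."""
    return tool(a, b)


tool = create_tool(TOOL_SOURCE, "solve_linear")
answer = decide_and_execute(tool, 2.0, -6.0)  # solves 2x - 6 = 0
print(answer)  # 3.0
```

The tool is written without reference to any particular numbers, and the concrete values enter only at the call site. That is one way to read the disentanglement the abstract describes.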



MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting

Large language models (LLMs) have achieved impressive performance on var...

Large Language Models as Tool Makers

Recent research shows the potential of enhancing the problem-solving abi...

ToolQA: A Dataset for LLM Question Answering with External Tools

Large Language Models (LLMs) have demonstrated impressive performance in...

Structural Embeddings of Tools for Large Language Models

It is evident that the current state of Large Language Models (LLMs) nec...

ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings

Augmenting large language models (LLMs) with external tools has emerged ...

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning

We introduce MAmmoTH, a series of open-source large language models (LLM...

Orchestrating Tool Chains for Model-based Systems Engineering with RCE

When using multiple software tools to analyze, visualize, or optimize mo...
