Empowering LLM to use Smartphone for Intelligent Task Automation

08/29/2023
by   Hao Wen, et al.
0

Mobile task automation is an attractive technique that aims to enable voice-based hands-free user interaction with smartphones. However, existing approaches suffer from poor scalability due to the limited language understanding ability and the non-trivial manual efforts required from developers or end-users. The recent advance of large language models (LLMs) in language understanding and reasoning inspires us to rethink the problem from a model-centric perspective, where task preparation, comprehension, and execution are handled by a unified language model. In this work, we introduce AutoDroid, a mobile task automation system that can handle arbitrary tasks on any Android application without manual efforts. The key insight is to combine the commonsense knowledge of LLMs and domain-specific knowledge of apps through automated dynamic analysis. The main components include a functionality-aware UI representation method that bridges the UI with the LLM, exploration-based memory injection techniques that augment the app-specific domain knowledge of LLM, and a multi-granularity query optimization module that reduces the cost of model inference. We integrate AutoDroid with off-the-shelf LLMs including online GPT-4/GPT-3.5 and on-device Vicuna, and evaluate its performance on a new benchmark for memory-augmented Android task automation with 158 common tasks. The results demonstrated that AutoDroid is able to precisely generate actions with an accuracy of 90.9 71.3 benchmark suites, and source code of AutoDroid will be released at urlhttps://autodroid-sys.github.io/.

READ FULL TEXT
research
11/30/2017

Automating Release of Deep Link APIs for Android Applications

Unlike the Web where each web page has a global URL to reach, a specific...
research
04/14/2023

DroidBot-GPT: GPT-powered UI Automation for Android

This paper introduces DroidBot-GPT, a tool that utilizes GPT-like large ...
research
12/20/2022

PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English

Privacy policies provide individuals with information about their rights...
research
09/10/2021

Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding

Large-scale, pre-trained language models (LMs) have achieved human-level...
research
06/02/2023

Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators

The recent success of Large Language Models (LLMs) signifies an impressi...
research
11/04/2019

VASTA: A Vision and Language-assisted Smartphone Task Automation System

We present VASTA, a novel vision and language-assisted Programming By De...

Please sign up or login with your details

Forgot password? Click here to reset