ShapeCrafter: A Recursive Text-Conditioned 3D Shape Generation Model

by Rao Fu, et al.
Brown University

We present ShapeCrafter, a neural network for recursive text-conditioned 3D shape generation. Existing methods for text-conditioned 3D shape generation consume an entire text prompt to generate a 3D shape in a single step. However, humans tend to describe shapes recursively: we may start with an initial description and progressively add details based on intermediate results. To capture this recursive process, we introduce a method to generate a 3D shape distribution, conditioned on an initial phrase, that gradually evolves as more phrases are added. Since existing datasets are insufficient for training this approach, we present Text2Shape++, a large dataset of 369K shape-text pairs that supports recursive shape generation. To capture local details that are often used to refine shape descriptions, we build on top of vector-quantized deep implicit functions that generate a distribution of high-quality shapes. Results show that our method can generate shapes consistent with text descriptions, and that shapes evolve gradually as more phrases are added. Our method supports shape editing and extrapolation, and can enable new applications in human-machine collaboration for creative design.
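The recursive conditioning described in the abstract can be illustrated with a toy sketch. This is not the ShapeCrafter implementation: the grid size, codebook size, and the hash-based `toy_phrase_logits` function are all hypothetical stand-ins for learned components, chosen only so the example runs without a trained model. It assumes the shape distribution is stored as per-cell logits over a vector-quantized codebook, and that each new phrase updates the running state rather than restarting from the full prompt.

```python
import hashlib
import math

GRID_CELLS = 8      # toy number of latent grid cells (assumption)
CODEBOOK_SIZE = 4   # toy number of codebook entries (assumption)


def toy_phrase_logits(phrase):
    """Hypothetical stand-in for a learned text-conditioned update network.

    Deterministically maps a phrase to one logit offset in [-1, 1] per
    (grid cell, codebook entry) pair, using the 32 bytes of a SHA-256 digest.
    """
    digest = hashlib.sha256(phrase.encode("utf-8")).digest()
    return [
        [digest[cell * CODEBOOK_SIZE + k] / 127.5 - 1.0
         for k in range(CODEBOOK_SIZE)]
        for cell in range(GRID_CELLS)
    ]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def update(logits, phrase):
    """Add the phrase-dependent offsets to the running logits."""
    delta = toy_phrase_logits(phrase)
    return [
        [old + new for old, new in zip(old_row, new_row)]
        for old_row, new_row in zip(logits, delta)
    ]


def generate_recursively(phrases):
    """Return one per-cell codebook distribution after each phrase."""
    # Start from a uniform distribution (all-zero logits) in every cell.
    logits = [[0.0] * CODEBOOK_SIZE for _ in range(GRID_CELLS)]
    history = []
    for phrase in phrases:
        logits = update(logits, phrase)
        history.append([softmax(row) for row in logits])
    return history


if __name__ == "__main__":
    steps = generate_recursively(["a chair", "with four legs", "and armrests"])
    for i, dist in enumerate(steps, start=1):
        print(f"after phrase {i}: cell 0 distribution = "
              f"{[round(p, 3) for p in dist[0]]}")
```

Because the state is carried across phrases, the distribution after the first phrase is identical whether or not later phrases follow, which is the property that lets a description be refined step by step. In a real system the additive logit update would be replaced by a trained network.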




