Publication | IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2022
CLIP-Forge
Towards Zero-Shot Text-to-Shape Generation
Our approach is among the first to convert text to 3D shapes without costly inference-time optimization. It can also produce multiple shapes for a given text.
This paper was presented at the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2022.
The dataset for this paper is available from Autodesk AI Lab on GitHub.
Abstract
CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation
Aditya Sanghi, Hang Chu, Joseph G. Lambourne, Ye Wang, Chin-Yi Cheng, Marco Fumero, Kamal Rahimi Malekshan
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2022
Generating shapes using natural language can enable new ways of imagining and creating the things around us. While significant recent progress has been made in text-to-image generation, text-to-shape generation remains a challenging problem due to the unavailability of paired text and shape data at a large scale. We present a simple yet effective method for zero-shot text-to-shape generation that circumvents such data scarcity. Our proposed method, named CLIP-Forge, is based on a two-stage training process, which only depends on an unlabelled shape dataset and a pre-trained image-text network such as CLIP. Our method has the benefits of avoiding expensive inference time optimization, as well as the ability to generate multiple shapes for a given text. We not only demonstrate promising zero-shot generalization of the CLIP-Forge model qualitatively and quantitatively, but also provide extensive comparative evaluations to better understand its behavior.
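The abstract's two claims about inference (no per-prompt optimization, and several shapes per text) can be illustrated with a toy sketch. It assumes the second stage is a conditional generative model over shape embeddings. The affine map, the dimensions, the random weights, and the function names below are all hypothetical stand-ins, not the authors' architecture, and no real CLIP model is loaded.

```python
import numpy as np

LATENT_DIM = 8   # hypothetical size of a shape embedding
COND_DIM = 16    # hypothetical size of a CLIP-style embedding

rng = np.random.default_rng(0)
# Hypothetical stand-ins for trained weights (random here, learned in practice).
W_SHIFT = rng.normal(size=(COND_DIM, LATENT_DIM)) * 0.1
W_LOG_SCALE = rng.normal(size=(COND_DIM, LATENT_DIM)) * 0.1


def normalize(v):
    """Scale a vector to unit length, as is common for CLIP-style embeddings."""
    return v / np.linalg.norm(v)


def sample_shape_latents(cond, num_samples, seed=0):
    """Draw several shape latents for one condition vector in a single pass.

    Each latent is an affine transform of Gaussian noise whose shift and
    scale depend on the condition, so no per-prompt optimization loop runs.
    """
    noise = np.random.default_rng(seed).normal(size=(num_samples, LATENT_DIM))
    shift = cond @ W_SHIFT
    scale = np.exp(cond @ W_LOG_SCALE)
    return noise * scale + shift


# Stand-in for a text embedding from a pre-trained image-text network.
text_embedding = normalize(rng.normal(size=COND_DIM))
latents = sample_shape_latents(text_embedding, num_samples=4)
print(latents.shape)
```

In a full system, each sampled latent would be passed to a shape decoder trained in the first stage. Different noise draws yield different latents for the same text, which is how one prompt can map to multiple shapes.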
Related Resources
2024
Wavelet Latent Diffusion: Billion-Parameter 3D Generative Model with Compact Wavelet Encodings
Addressing a common limitation of generative AI models, WaLa encodes…
2023
Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation
Generative model that can synthesize consistent 3D shapes from a…
2023
3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows
3DALL-E integrated three large AI models within Fusion 360 to explore…
2022
Assemble Them All: Physics-Based Planning for Generalizable Assembly by Disassembly
This work proposes a novel method to efficiently plan physically…