AvatarCLIP: Zero-shot Text-driven Generation and Animation of 3D Avatars
Description: We present AvatarCLIP, a zero-shot, text-driven framework for 3D avatar generation and animation. Building on the vision-language model CLIP, AvatarCLIP lets users customize a 3D avatar with the desired shape and texture and drive it with the described motions, using only natural language.