awesome-conditional-content-generation

awesome-conditional-content-generation

条件内容生成的前沿技术与资源集锦

这是一个综合性的条件内容生成资源库,主要聚焦于人体动作、图像和视频生成领域。项目汇集了最新研究论文和代码实现,包括音乐、文本和音频驱动的动作生成,以及人体动作预测等多个研究方向。同时还收录了条件图像和视频生成的相关资源,为该领域的研究和开发提供了丰富的参考材料。

人工智能内容生成动作生成图像生成视频生成Github开源项目

awesome-conditional-content-generation Awesome

This repository contains a collection of resources and papers on Conditional Content Generation. Especially for human motion generation, image generation and video generation. This repo is maintained by Haofan Wang.

如果你对可控内容生成(2D/3D)方向感兴趣,希望与我保持更广泛的学术合作或寻求一份实习,并且已经发表过至少一篇顶会论文,欢迎随时发邮件至haofanwang.ai@gmail.com,高校、工业界均欢迎。

Contents

Papers

Music-Driven motion generation

Taming Diffusion Models for Music-driven Conducting Motion Generation
NUS, AAAI 2023 Summer Symposium, [Code]

Music-Driven Group Choreography
AIOZ AI, CVPR'23

Discrete Contrastive Diffusion for Cross-Modal and Conditional Generation
Illinois Institute of Technology, ICLR'23, [Code]

Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
Tsinghua University, 7 Dec 2022

Pretrained Diffusion Models for Unified Human Motion Synthesis
DAMO Academy, Alibaba Group, 6 Dec 2022

EDGE: Editable Dance Generation From Music
Stanford University, 19 Nov 2022

You Never Stop Dancing: Non-freezing Dance Generation via Bank-constrained Manifold Projection
MSRA, NeurIPS'22

GroupDancer: Music to Multi-People Dance Synthesis with Style Collaboration
Tsinghua University, ACMMM'22

A Brand New Dance Partner: Music-Conditioned Pluralistic Dancing Controlled by Multiple Dance Genres
Yonsei University, CVPR 2022, [Code]

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory
NTU, CVPR 2022 (Oral), [Code]

Dance Style Transfer with Cross-modal Transformer
KTH, 22 Aug 2022, [Upcoming Code]

Music-driven Dance Regeneration with Controllable Key Pose Constraints
Tencent, 8 July 2022

AI Choreographer: Music Conditioned 3D Dance Generation with AIST++
USC, ICCV 2021, [Code]

Text-Driven motion generation

ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model
NTU, CVPR'23, [Code]

TEMOS: Generating diverse human motions from textual descriptions
ENPC, CVPR'23

GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents
Peking University, CVPR'23

Human Motion Diffusion as a Generative Prior
Anonymous Authors, [Code]

T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
Tencent AI Lab, 16 Jan 2023, [Code]

Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models
Beihang University, 10 Jan 2023

Executing your Commands via Motion Diffusion in Latent Space
Tencent, 8 Dec 2022, [Code]

MultiAct: Long-Term 3D Human Motion Generation from Multiple Action Labels
Seoul National University, AAAI 2023 Oral, [Code]

MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Max Planck Institute for Informatics, 8 Dec 2022

Executing your Commands via Motion Diffusion in Latent Space
Tencent PCG, 8 Dec 2022, [Upcoming Code]

UDE: A Unified Driving Engine for Human Motion Generation
Xiaobing Inc, 29 Nov 2022, [Upcoming Code]

MotionBERT: Unified Pretraining for Human Motion Analysis
SenseTime Research, 12 Oct 2022, [Code]

Human Motion Diffusion Model
Tel Aviv University, 3 Oct 2022, [Code]

FLAME: Free-form Language-based Motion Synthesis & Editing
Korea University, 1 Sep 2022

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
NTU, 22 Aug 2022, [Code]

TEMOS: Generating diverse human motions from textual descriptions
MPI, ECCV 2022 (Oral), [Code]

GIMO: Gaze-Informed Human Motion Prediction in Context
Stanford University, ECCV 2022, [Code]

MotionCLIP: Exposing Human Motion Generation to CLIP Space
Tel Aviv University, ECCV 2022, [Code]

Generating Diverse and Natural 3D Human Motions from Text
University of Alberta, CVPR 2022, [Code]

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars
NTU, SIGGRAPH 2022, [Code]

Text2Gestures: A Transformer-Based Network for Generating Emotive Body Gestures for Virtual Agents
University of Maryland,, VR 2021, [Code]

Audio-Driven motion generation

For more recent paper, you can find from here

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation
NTU, CVPR'23, [Code]

GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Zhejiang University, ICLR'23, [Code]

DiffMotion: Speech-Driven Gesture Synthesis Using Denoising Diffusion Model
Macau University of Science and Technolog, 24 Jan 2023

DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis
Tsinghua University, 10 Jan 2023

Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation
University of Wrocław, 6 Jan 2023, [Incoming Code]

Generating Holistic 3D Human Motion from Speech
Max Planck Institute for Intelligent Systems, 8 Dev 2022

Audio-Driven Co-Speech Gesture Video Generation
NTU, 5 Dec 2022

Listen, denoise, action! Audio-driven motion synthesis with diffusion models
KTH Royal Institute of Technology, 17 Nov 2022

ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech
York University, 23 Sep 2022, [Code]

BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis
The University of Tokyo, ECCV 2022, [Code]

EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model
Nanjing University, SIGGRAPH 2022, [Code]

Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
The Chinese University of Hong Kong, CVPR 2022, [Code]

SEEG: Semantic Energized Co-speech Gesture Generation
Alibaba DAMO Academy, CVPR 2022, [Code]

FaceFormer: Speech-Driven 3D Facial Animation with Transformers
The University of Hong Kong, CVPR 2022, [Code]

Freeform Body Motion Generation from Speech
JD AI Research, 4 Mar 2022, [Code]

Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders
Tencent AI Lab, ICCV 2021, [Code]

Learning Speech-driven 3D Conversational Gestures from Video
Max Planck Institute for Informatics, IVA 2021, [Code]

Learning Individual Styles of Conversational Gesture
UC Berkeley, CVPR 2019, [Code]

Human motion prediction

For more recent more, you can find from here

InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion
UIUC, ICCV 2023, [Code]

Stochastic Multi-Person 3D Motion Forecasting
UIUC, ICLR 2023 (Spotlight), [Code]

HumanMAC: Masked Motion Completion for Human Motion Prediction
Tsinghua University, ICCV 2023, [Code]

BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
University of Barcelona, 25 Nov 2022, [Upcoming Code]

Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors
UIUC, ECCV 2022 (Oral), [Code]

PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting
NAVER LABS, ECCV'2022, [Code]

NeMF: Neural Motion Fields for Kinematic Animation
Yale University, NeurIPS 2022 (Spotlight), [Code]

Multi-Person Extreme Motion Prediction
Inria University, CVPR 2022, [Code]

MotionMixer: MLP-based 3D Human Body Pose Forecasting
Mercedes-Benz, IJCAI 2022 (Oral), [Code]

Multi-Person 3D Motion Prediction with Multi-Range Transformers
UCSD, NeurIPS 2021

Motion Applications

MIME: Human-Aware 3D Scene Generation
MPI

Scene Synthesis from Human Motion
Stanford University, SIGGRAPH Asia 2022, [Code]

TEACH: Temporal Action Compositions for 3D Humans
MPI, 3DV 2022,

编辑推荐精选

堆友

堆友

多风格AI绘画神器

堆友平台由阿里巴巴设计团队创建,作为一款AI驱动的设计工具,专为设计师提供一站式增长服务。功能覆盖海量3D素材、AI绘画、实时渲染以及专业抠图,显著提升设计品质和效率。平台不仅提供工具,还是一个促进创意交流和个人发展的空间,界面友好,适合所有级别的设计师和创意工作者。

图像生成热门AI工具AI图像AI反应堆AI工具箱AI绘画GOAI艺术字堆友相机
码上飞

码上飞

零代码AI应用开发平台

零代码AI应用开发平台,用户只需一句话简单描述需求,AI能自动生成小程序、APP或H5网页应用,无需编写代码。

Vora

Vora

免费创建高清无水印Sora视频

Vora是一个免费创建高清无水印Sora视频的AI工具

Refly.AI

Refly.AI

最适合小白的AI自动化工作流平台

无需编码,轻松生成可复用、可变现的AI自动化工作流

酷表ChatExcel

酷表ChatExcel

大模型驱动的Excel数据处理工具

基于大模型交互的表格处理系统,允许用户通过对话方式完成数据整理和可视化分析。系统采用机器学习算法解析用户指令,自动执行排序、公式计算和数据透视等操作,支持多种文件格式导入导出。数据处理响应速度保持在0.8秒以内,支持超过100万行数据的即时分析。

AI工具使用教程AI营销产品酷表ChatExcelAI智能客服
TRAE编程

TRAE编程

AI辅助编程,代码自动修复

Trae是一种自适应的集成开发环境(IDE),通过自动化和多元协作改变开发流程。利用Trae,团队能够更快速、精确地编写和部署代码,从而提高编程效率和项目交付速度。Trae具备上下文感知和代码自动完成功能,是提升开发效率的理想工具。

热门AI工具生产力协作转型TraeAI IDE
AIWritePaper论文写作

AIWritePaper论文写作

AI论文写作指导平台

AIWritePaper论文写作是一站式AI论文写作辅助工具,简化了选题、文献检索至论文撰写的整个过程。通过简单设定,平台可快速生成高质量论文大纲和全文,配合图表、参考文献等一应俱全,同时提供开题报告和答辩PPT等增值服务,保障数据安全,有效提升写作效率和论文质量。

数据安全AI助手热门AI工具AI辅助写作AI论文工具论文写作智能生成大纲
博思AIPPT

博思AIPPT

AI一键生成PPT,就用博思AIPPT!

博思AIPPT,新一代的AI生成PPT平台,支持智能生成PPT、AI美化PPT、文本&链接生成PPT、导入Word/PDF/Markdown文档生成PPT等,内置海量精美PPT模板,涵盖商务、教育、科技等不同风格,同时针对每个页面提供多种版式,一键自适应切换,完美适配各种办公场景。

热门AI工具AI办公办公工具智能排版AI生成PPT博思AIPPT海量精品模板AI创作
潮际好麦

潮际好麦

AI赋能电商视觉革命,一站式智能商拍平台

潮际好麦深耕服装行业,是国内AI试衣效果最好的软件。使用先进AIGC能力为电商卖家批量提供优质的、低成本的商拍图。合作品牌有Shein、Lazada、安踏、百丽等65个国内外头部品牌,以及国内10万+淘宝、天猫、京东等主流平台的品牌商家,为卖家节省将近85%的出图成本,提升约3倍出图效率,让品牌能够快速上架。

iTerms

iTerms

企业专属的AI法律顾问

iTerms是法大大集团旗下法律子品牌,基于最先进的大语言模型(LLM)、专业的法律知识库和强大的智能体架构,帮助企业扫清合规障碍,筑牢风控防线,成为您企业专属的AI法律顾问。

下拉加载更多