Awesome 3D AIGC Resources

A curated list of papers and open-source resources focused on 3D AIGC, intended to keep pace with the anticipated surge of research in the coming months. If you have any additions or suggestions, feel free to contribute. Additional resources like blog posts, videos, etc. are also welcome.

Survey
Text to 3D Generation
Image to 3D Generation
Audio to 3D Generation
3D Editing
Human Avatar Generation
City/Autonomous Driving
SLAM
BioMedical
4D AIGC
Misc
Open Source Implementations
Blog Posts
Tutorial Videos
Credits

<details span> <summary>Update Log:</summary> <br>

Mar 4, 2024: Update several CVPR 2024 papers.

Jan 23, 2024: Update several ICLR 2024 papers.

Jan 19, 2024: Update several ICLR 2024 papers.

Jan 11, 2024: Add AGG and recent papers.

Jan 10, 2024: Add DreamGaussian (3D version) and several avatar papers.

Jan 6, 2024: Add recent papers.

Jan 2, 2024: Add papers to the image-to-3D generation section.

Dec 29, 2023: Add new papers, with their publication years, to the text-to-3D section.

Dec 27, 2023: Initial list with first 15 papers.

</details> <br> <div align=center><img src="https://github.com/mdyao/Awesome-3D-AIGC/assets/33108887/2bee41c0-b19c-4047-ae26-02ca2af2c38f"/></div>

Survey:

1. Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era [arxiv 2023.05]

Authors: Chenghao Li, Chaoning Zhang, Atish Waghwase, Lik-Hang Lee, Francois Rameau, Yang Yang, Sung-Ho Bae, Choong Seon Hong

<details span> <summary>Abstract</summary> Generative AI (AIGC, a.k.a. AI generated content) has made remarkable progress in the past few years, among which text-guided content generation is the most practical one since it enables the interaction between human instruction and AIGC. Due to the development in text-to-image as well as 3D modeling technologies (like NeRF), text-to-3D has become a newly emerging yet highly active research field. Our work conducts the first yet comprehensive survey on text-to-3D to help readers interested in this direction quickly catch up with its fast development. First, we introduce 3D data representations, including both Euclidean data and non-Euclidean data. On top of that, we introduce various foundation technologies as well as summarize how recent works combine those foundation technologies to realize satisfactory text-to-3D. Moreover, we summarize how text-to-3D technology is used in various applications, including avatar generation, texture generation, shape transformation, and scene generation. </details>

📄 Paper

2. Deep Generative Models on 3D Representations: A Survey [arxiv 2023.10]

Authors: Zifan Shi, Sida Peng, Yinghao Xu, Andreas Geiger, Yiyi Liao, Yujun Shen

<details span> <summary>Abstract</summary> Generative models aim to learn the distribution of observed data by generating new instances. With the advent of neural networks, deep generative models, including variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models (DMs), have progressed remarkably in synthesizing 2D images. Recently, researchers started to shift focus from 2D to 3D space, considering that 3D data is more closely aligned with our physical world and holds immense practical potential. However, unlike 2D images, which possess an inherent and efficient representation (i.e., a pixel grid), representing 3D data poses significantly greater challenges. Ideally, a robust 3D representation should be capable of accurately modeling complex shapes and appearances while being highly efficient in handling high-resolution data with high processing speeds and low memory requirements. Regrettably, existing 3D representations, such as point clouds, meshes, and neural fields, often fail to satisfy all of these requirements simultaneously. In this survey, we thoroughly review the ongoing developments of 3D generative models, including methods that employ 2D and 3D supervision. Our analysis centers on generative models, with a particular focus on the representations utilized in this context. We believe our survey will help the community to track the field's evolution and to spark innovative ideas to propel progress towards solving this challenging task. </details>

📄 Paper | 🌐 Project Page

3. A survey of deep learning-based 3D shape generation [Computational Visual Media 2023.05]

Authors: Qun-Ce Xu, Tai-Jiang Mu, Yong-Liang Yang

<details span> <summary>Abstract</summary> Deep learning has been successfully used for tasks in the 2D image domain. Research on 3D computer vision and deep geometry learning has also attracted attention. Considerable achievements have been made regarding feature extraction and discrimination of 3D shapes. Following recent advances in deep generative models such as generative adversarial networks, effective generation of 3D shapes has become an active research topic. Unlike 2D images with a regular grid structure, 3D shapes have various representations, such as voxels, point clouds, meshes, and implicit functions. For deep learning of 3D shapes, shape representation has to be taken into account as there is no unified representation that can cover all tasks well. Factors such as the representativeness of geometry and topology often largely affect the quality of the generated 3D shapes. In this survey, we comprehensively review works on deep-learning-based 3D shape generation by classifying and discussing them in terms of the underlying shape representation and the architecture of the shape generator. The advantages and disadvantages of each class are further analyzed. We also consider the 3D shape datasets commonly used for shape generation. Finally, we present several potential research directions that hopefully can inspire future works on this topic. </details>

📄 Paper

4. Learning Generative Models of 3D Structures [Computer Graphics Forum 2020.05]

Authors: Siddhartha Chaudhuri, Daniel Ritchie, Jiajun Wu, Kai Xu, Hao Zhang

<details span> <summary>Abstract</summary> 3D models of objects and scenes are critical to many academic disciplines and industrial applications. Of particular interest is the emerging opportunity for 3D graphics to serve artificial intelligence: computer vision systems can benefit from synthetically-generated training data rendered from virtual 3D scenes, and robots can be trained to navigate in and interact with real-world environments by first acquiring skills in simulated ones.