Triplane Meets Gaussian Splatting:<br> Fast and Generalizable Single-View 3D Reconstruction with Transformers

TGS builds on a hybrid Triplane-Gaussian 3D representation and reconstructs a 3D model from a single-view image within a few seconds.


Teaser image

Official implementation of Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers.

⭐️ Key Features

A new hybrid Triplane-Gaussian 3D representation that combines the strengths of explicit and implicit representations.
High-quality 3D reconstruction from a single-view image within a few seconds.

🚩 News

[2024/01/17] We released the inference code and pretrained models.
[2024/01/09] We released the Gradio demo on Hugging Face Spaces.

💻 Examples

Try our model online in the Gradio demo on Hugging Face Spaces.

https://github.com/VAST-AI-Research/TriplaneGaussian/assets/25632410/706da1b8-0b59-462a-b6e4-4a3316f9e909

Results on images generated by Midjourney

https://github.com/VAST-AI-Research/TriplaneGaussian/assets/25632410/d27451e7-d298-4b6b-9dfe-f7927847167d

Results on captured real-world images

https://github.com/VAST-AI-Research/TriplaneGaussian/assets/25632410/1efe39d4-fcf1-4904-bf80-097796ca18e8

🏁 Quick Start

Colab Demo

Run TGS in Google Colab:

Installation

Python >= 3.8
Install PyTorch >= 1.12. We tested on torch 1.12.1+cu113, but other versions should also work.

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
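
To confirm that an existing environment satisfies the version requirements above before building the CUDA extensions, a small check like the following may help. This is a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helpers (not part of this repository), and the check only compares version numbers; it does not verify the installed CUDA toolkit.

```python
import re
import sys


def parse_version(version: str) -> tuple:
    """Turn a string such as '1.12.1+cu113' into (1, 12, 1).

    The local suffix after '+' (e.g. 'cu113') is ignored, and parsing stops
    at the first component that does not start with a digit.
    """
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Return True if `version` is at least `minimum` (tuple comparison)."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    print("Python >= 3.8:", sys.version_info >= (3, 8))
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        ok = meets_minimum(torch.__version__, (1, 12))
        print("PyTorch", torch.__version__, ">= 1.12:", ok)
        print("CUDA available:", torch.cuda.is_available())
```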

Install pointnet2_ops

cd tgs/models/snowflake/pointnet2_ops_lib && python setup.py install && cd -

Install pytorch_scatter

pip install git+https://github.com/rusty1s/pytorch_scatter.git

Install diff-gaussian-rasterization

pip install git+https://github.com/graphdeco-inria/diff-gaussian-rasterization.git

Install the remaining dependencies:

pip install -r requirements.txt

Install PyTorch3D by following its official installation instructions.

Download the Pretrained Model

We provide a pretrained checkpoint on Hugging Face. Download the checkpoint and place it in the checkpoints folder.

from huggingface_hub import hf_hub_download
MODEL_CKPT_PATH = hf_hub_download(repo_id="VAST-AI/TriplaneGaussian", local_dir="./checkpoints", filename="model_lvis_rel.ckpt", repo_type="model")

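Note that this model is trained only on the Objaverse-LVIS dataset (about 45K 3D models). A model with more parameters (e.g., deeper layers, more feature channels) trained on a larger dataset (e.g., the full Objaverse dataset) should achieve stronger performance, and we plan to explore this in the future.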
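
Before running inference, it can be useful to confirm that the checkpoint actually sits in the checkpoints folder. The following is a minimal sketch, assuming the file name used in the download snippet; `missing_checkpoints` is a hypothetical helper and not part of this repository.

```python
from pathlib import Path
from typing import Iterable, List


def missing_checkpoints(folder: str, required: Iterable[str]) -> List[str]:
    """Return the names in `required` that are not regular files in `folder`."""
    root = Path(folder)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # "model_lvis_rel.ckpt" is the file name from the download snippet.
    missing = missing_checkpoints("./checkpoints", ["model_lvis_rel.ckpt"])
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    else:
        print("All expected checkpoint files are present.")
```

Any other checkpoint file the workflow needs can be appended to the list in the same way.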

Inference

Use the following command to reconstruct a 3DGS model from a single image. Update data.image_list to the list of image paths you want to process.

python infer.py --config config.yaml data.image_list=[path/to/image1,] --image_preprocess --cam_dist ${cam_dist}
# e.g. python infer.py --config config.yaml data.image_list=[example_images/a_pikachu_with_smily_face.webp,] --image_preprocess

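To remove the background from the input image, add the --image_preprocess flag to the command. Before doing so, download the SAM checkpoint and place it in the checkpoints folder as well.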

--cam_dist sets the camera distance, i.e. the distance between the camera center and the scene center. It defaults to 1.9.
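
When scripting inference over several images, the command can also be assembled programmatically. The sketch below is a convenience under stated assumptions, not part of the repository: `build_infer_command` is a hypothetical helper, and passing several comma-separated paths in data.image_list is inferred from the list syntax of the example command. Paths that contain commas or square brackets would not survive this formatting.

```python
import subprocess
import sys
from typing import List, Optional, Sequence


def build_infer_command(
    image_paths: Sequence[str],
    cam_dist: Optional[float] = None,
    image_preprocess: bool = True,
    config: str = "config.yaml",
) -> List[str]:
    """Build the argument list for infer.py, mirroring the documented command."""
    if not image_paths:
        raise ValueError("at least one image path is required")
    # Same shape as the example: [path1,path2,] with a trailing comma.
    image_list = "[" + "".join(f"{path}," for path in image_paths) + "]"
    cmd = [sys.executable, "infer.py", "--config", config,
           f"data.image_list={image_list}"]
    if image_preprocess:
        cmd.append("--image_preprocess")
    if cam_dist is not None:
        # Omitting the flag leaves the script's default (1.9) in effect.
        cmd += ["--cam_dist", str(cam_dist)]
    return cmd


if __name__ == "__main__":
    command = build_infer_command(
        ["example_images/a_pikachu_with_smily_face.webp"], cam_dist=1.9
    )
    print(" ".join(command))
    # Uncomment to run from the repository root:
    # subprocess.run(command, check=True)
```

Passing the arguments as a list to subprocess.run, rather than as a single shell string, keeps the shell from interpreting the square brackets in data.image_list.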

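Finally, the script saves a video (.mp4) and a 3DGS (.ply) file. The .ply file follows the same format as graphdeco-inria/gaussian-splatting, which makes it compatible with other visualization tools such as gsplat.js.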
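
To sanity-check an exported .ply file without a viewer, reading its header is often enough. The sketch below parses only the standard PLY header using the Python standard library. It assumes, as in the reference gaussian-splatting format, that each entry of the vertex element is one Gaussian. `read_ply_header` and the sample header are illustrative and are not taken from this repository; the real file will list many more properties.

```python
import io
import sys
from typing import BinaryIO, Dict


def read_ply_header(stream: BinaryIO) -> Dict[str, dict]:
    """Parse a PLY header and return {element: {"count", "properties"}}."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    elements: Dict[str, dict] = {}
    current = None
    for raw in iter(stream.readline, b""):
        tokens = raw.decode("ascii", errors="replace").split()
        if not tokens:
            continue
        if tokens[0] == "end_header":
            return elements
        if tokens[0] == "element":
            current = tokens[1]
            elements[current] = {"count": int(tokens[2]), "properties": []}
        elif tokens[0] == "property" and current is not None:
            # The property name is always the last token on the line.
            elements[current]["properties"].append(tokens[-1])
    raise ValueError("end_header not found")


# Tiny illustrative header (not real TGS output).
SAMPLE_HEADER = (
    b"ply\n"
    b"format binary_little_endian 1.0\n"
    b"element vertex 3\n"
    b"property float x\n"
    b"property float y\n"
    b"property float z\n"
    b"property float opacity\n"
    b"end_header\n"
)

if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Pass the path of an exported .ply file as the first argument.
        with open(sys.argv[1], "rb") as handle:
            header = read_ply_header(handle)
    else:
        header = read_ply_header(io.BytesIO(SAMPLE_HEADER))
    for name, info in header.items():
        print(name, info["count"], info["properties"])
```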

Local Gradio Demo

Our Gradio demo depends on a custom Gradio component for 3DGS rendering. Clone this component first:

git clone https://github.com/dylanebert/gradio-splatting.git gradio_splatting

Then launch the Gradio demo locally with:

python gradio_app.py

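📝 Tips

If the results are unsatisfactory, try changing the camera distance parameter. For example, if the reconstructed 3D model looks flat, consider increasing the camera distance, e.g. --cam_dist 2.1. Conversely, if the 3D model looks too thick, decrease it. This may improve the results.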
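
One way to apply this tip is to try a few camera distances around the default and compare the resulting videos. The sketch below only prints the commands to run. `cam_dist_candidates` and `sweep_commands` are hypothetical helpers, and the 0.2 step is an arbitrary choice taken from the 1.9 to 2.1 example, not a recommendation from the authors.

```python
from typing import List


def cam_dist_candidates(center: float = 1.9, step: float = 0.2,
                        count: int = 1) -> List[float]:
    """Return `count` values below and above `center`, spaced by `step`."""
    return [round(center + i * step, 2) for i in range(-count, count + 1)]


def sweep_commands(image_path: str, distances: List[float]) -> List[str]:
    """Format one infer.py command line per camera distance."""
    return [
        "python infer.py --config config.yaml "
        f"data.image_list=[{image_path},] --image_preprocess --cam_dist {dist}"
        for dist in distances
    ]


if __name__ == "__main__":
    image = "example_images/a_pikachu_with_smily_face.webp"
    for line in sweep_commands(image, cam_dist_candidates()):
        print(line)
```

Larger values in the list correspond to the "looks flat" case and smaller values to the "looks too thick" case.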

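Acknowledgements

This project is supported by Tsinghua University and VAST.
We would like to thank @totoro97 for helpful discussions.
Our point cloud upsampling module is adapted from SnowflakeNet.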

Citation

If you find this work helpful, please consider citing our paper:

@article{zou2023triplane,
  title={Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers},
  author={Zou, Zi-Xin and Yu, Zhipeng and Guo, Yuan-Chen and Li, Yangguang and Liang, Ding and Cao, Yan-Pei and Zhang, Song-Hai},
  journal={arXiv preprint arXiv:2312.09147},
  year={2023}
}