Open3D-ML

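Introduction to Open3D-ML

Open3D-ML is an extension of Open3D for 3D machine learning tasks. It builds on top of the Open3D core library and adds machine learning tools for 3D data processing. The project focuses on applications such as semantic point cloud segmentation, provides pretrained models for common tasks, and includes pipelines for training. Open3D-ML works with both TensorFlow and PyTorch so that it integrates easily into existing projects, and it provides data visualization that does not depend on either framework.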

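Installation

User guide

Open3D-ML is integrated into the Open3D v0.11+ Python distribution and is compatible with the following versions of the machine learning frameworks:

PyTorch 2.0.*
TensorFlow 2.13.* (macOS; for Linux, see below)
CUDA 10.1, 11.* (optional, on GNU/Linux x86_64)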

Install Open3D with the following commands:

# make sure pip is up to date
pip install --upgrade pip
# install open3d
pip install open3d

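If needed, install a compatible version of PyTorch or TensorFlow using the requirements files from the Open3D-ML repository:

# install a compatible version of TensorFlow
pip install -r requirements-tensorflow.txt
# install a compatible version of PyTorch
pip install -r requirements-torch.txt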

Test the installation with the following commands:

# with PyTorch
python -c "import open3d.ml.torch as ml3d"
# with TensorFlow
python -c "import open3d.ml.tf as ml3d"

If you need a different version of the machine learning frameworks or of CUDA, we recommend building Open3D from source or building Open3D in Docker.
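Before running the import tests above, it can help to see which packages are installed at all. The following is a minimal sketch using only the Python standard library; the helper names `module_available` and `available_backends` are made up for this illustration and are not part of Open3D-ML. It only checks that a package can be located, not that the installed versions are compatible with each other.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the module `name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised, for example, when the parent package of a dotted name is missing
        return False


def available_backends() -> dict:
    """Report which of the packages used in this guide are installed."""
    return {name: module_available(name) for name in ("open3d", "torch", "tensorflow")}


if __name__ == "__main__":
    for name, found in available_backends().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A package reported as found can still fail at import time if its version does not match the one Open3D was built against, so the `python -c` import commands remain the real test.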

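Getting started

Reading a dataset

Open3D-ML provides classes for reading common datasets. The following example reads the SemanticKITTI dataset and visualizes it:

import open3d.ml.torch as ml3d  # or open3d.ml.tf as ml3d

# construct a dataset by specifying dataset_path
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/path/to/SemanticKITTI/')
# get the 'all' split
all_split = dataset.get_split('all')

# print the attributes of the first datum
print(all_split.get_attr(0))
# print the shape of the first point cloud
print(all_split.get_data(0)['point'].shape)

# show the first 100 frames in the visualizer
vis = ml3d.vis.Visualizer()
vis.visualize_dataset(dataset, 'all', indices=range(100))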

Loading a config file

Configurations for models, datasets, and pipelines are stored in the ml3d/configs directory. Users can also create their own yaml files to record custom configurations. The following example loads a config file and builds the dataset, model, and pipeline from it:

import open3d.ml as _ml3d
import open3d.ml.torch as ml3d  # or open3d.ml.tf as ml3d

framework = "torch"  # or "tf"
cfg_file = "ml3d/configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

# fetch the classes by name
Pipeline = _ml3d.utils.get_module("pipeline", cfg.pipeline.name, framework)
Model = _ml3d.utils.get_module("model", cfg.model.name, framework)
Dataset = _ml3d.utils.get_module("dataset", cfg.dataset.name)

# use the config arguments to instantiate the dataset, model, and pipeline
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = Dataset(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
model = Model(**cfg.model)
pipeline = Pipeline(model, dataset, **cfg.pipeline)
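The call `Dataset(cfg.dataset.pop('dataset_path', None), **cfg.dataset)` passes the path once as a positional argument and forwards the remaining keys as keyword arguments. The sketch below reproduces that pattern with a plain dict and a hypothetical `ToyDataset` class, which stands in for a real dataset class and is not part of Open3D-ML.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset class; not part of Open3D-ML."""

    def __init__(self, dataset_path, use_cache=False, **kwargs):
        self.dataset_path = dataset_path
        self.use_cache = use_cache
        self.extra = kwargs


# Stand-in for cfg.dataset after a yaml file has been loaded
dataset_cfg = {"name": "Toy", "use_cache": True}
dataset_cfg["dataset_path"] = "/path/to/your/dataset"

# pop() removes the key before the dict is unpacked, so the path is passed
# once positionally and the remaining keys become keyword arguments.
dataset = ToyDataset(dataset_cfg.pop("dataset_path", None), **dataset_cfg)

print(dataset.dataset_path)
print(dataset.use_cache)
print("dataset_path" in dataset_cfg)
```

Without the `pop`, the path would arrive both positionally and through `**dataset_cfg`, and Python would raise a `TypeError` for the duplicated argument.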

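Semantic segmentation

Running a pretrained model for semantic segmentation

Building on the previous example, we can instantiate a pipeline with a pretrained model for semantic segmentation and run it on a point cloud from the dataset. The following example shows how to load the pretrained weights and run the model: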

import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d

cfg_file = "ml3d/configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.RandLANet(**cfg.model)
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = ml3d.datasets.SemanticKITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.SemanticSegmentation(model, dataset=dataset, device="gpu", **cfg.pipeline)

# download the model weights
ckpt_folder = "./logs/"
os.makedirs(ckpt_folder, exist_ok=True)
ckpt_path = ckpt_folder + "randlanet_semantickitti_202201071330utc.pth"
randlanet_url = "https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_semantickitti_202201071330utc.pth"
if not os.path.exists(ckpt_path):
    cmd = "wget {} -O {}".format(randlanet_url, ckpt_path)
    os.system(cmd)

# load the parameters
pipeline.load_ckpt(ckpt_path=ckpt_path)

test_split = dataset.get_split("test")
data = test_split.get_data(0)

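# run inference on a single example
# returns a dict with 'predict_labels' and 'predict_scores'
result = pipeline.run_inference(data)

# evaluate performance on the test set
pipeline.run_test()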
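The example above shells out to `wget`, which is not available on every system. As an alternative, here is a minimal sketch of a download helper built on the standard library's `urllib.request`; the function name `download_if_missing` and the `.part` temporary suffix are choices made for this illustration, not part of Open3D-ML.

```python
import os
import urllib.request


def download_if_missing(url: str, dest_path: str) -> str:
    """Download `url` to `dest_path` unless the file already exists."""
    folder = os.path.dirname(dest_path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    if not os.path.exists(dest_path):
        # Download to a temporary name first so that an interrupted transfer
        # does not leave a truncated checkpoint at dest_path.
        tmp_path = dest_path + ".part"
        urllib.request.urlretrieve(url, tmp_path)
        os.replace(tmp_path, dest_path)
    return dest_path
```

With this helper, the download step could be written as `ckpt_path = download_if_missing(randlanet_url, ckpt_path)`, followed by the same `pipeline.load_ckpt(ckpt_path=ckpt_path)` call.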

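Training a model for semantic segmentation

As with inference, Open3D-ML provides an interface for training a model on a dataset:

import open3d.ml.torch as ml3d
from open3d.ml.torch.models import RandLANet
from open3d.ml.torch.pipelines import SemanticSegmentation

# use a cache to store the results of the preprocessing
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/path/to/SemanticKITTI/', use_cache=True)

# create the model with random initialization
model = RandLANet()

pipeline = SemanticSegmentation(model=model, dataset=dataset, max_epoch=100)

# print the training progress to the console
pipeline.run_train()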

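3D object detection

Running a pretrained model for 3D object detection

3D object detection models work much like semantic segmentation models: instantiate a pipeline with a pretrained model and run it on the data. Example code follows: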

import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d

cfg_file = "ml3d/configs/pointpillars_kitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.PointPillars(**cfg.model)
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = ml3d.datasets.KITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.ObjectDetection(model, dataset=dataset, device="gpu", **cfg.pipeline)

# download the model weights
ckpt_folder = "./logs/"
os.makedirs(ckpt_folder, exist_ok=True)
ckpt_path = ckpt_folder + "pointpillars_kitti_202012221652utc.pth"
pointpillar_url = "https://storage.googleapis.com/open3d-releases/model-zoo/pointpillars_kitti_202012221652utc.pth"
if not os.path.exists(ckpt_path):
    cmd = "wget {} -O {}".format(pointpillar_url, ckpt_path)
    os.system(cmd)

# load the parameters
pipeline.load_ckpt(ckpt_path=ckpt_path)

test_split = dataset.get_split("test")
data = test_split.get_data(0)

# run inference on a single example
result = pipeline.run_inference(data)

# evaluate performance on the test set
pipeline.run_test()

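Training a model for 3D object detection

Besides inference, Open3D-ML also lets users train 3D object detection models:

import open3d.ml.torch as ml3d
from open3d.ml.torch.models import PointPillars
from open3d.ml.torch.pipelines import ObjectDetection

# use a cache to store the results of the preprocessing
dataset = ml3d.datasets.KITTI(dataset_path='/path/to/KITTI/', use_cache=True)

# create the model with random initialization
model = PointPillars()

pipeline = ObjectDetection(model=model, dataset=dataset, max_epoch=100)

# print the training progress to the console
pipeline.run_train()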

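Using predefined scripts

The scripts/run_pipeline.py script provides a simplified interface for training and evaluating a model on a dataset. The command has the following form: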

python scripts/run_pipeline.py {tf/torch} -c <path-to-config> --pipeline {SemanticSegmentation/ObjectDetection} --<extra args>

The script works for both semantic segmentation and object detection. The pipeline argument must be set to either SemanticSegmentation or ObjectDetection.
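When the script is launched from other Python code, the command line can be assembled and validated before it runs. The following is a minimal sketch; `build_command` and the two `VALID_*` tuples are hypothetical helpers written for this illustration, and the only arguments used are the ones that appear in the command template above.

```python
import shlex
import sys

VALID_FRAMEWORKS = ("tf", "torch")
VALID_PIPELINES = ("SemanticSegmentation", "ObjectDetection")


def build_command(framework, config_path, pipeline, extra_args=()):
    """Assemble the run_pipeline.py command line as a list of arguments."""
    if framework not in VALID_FRAMEWORKS:
        raise ValueError(f"framework must be one of {VALID_FRAMEWORKS}")
    if pipeline not in VALID_PIPELINES:
        raise ValueError(f"pipeline must be one of {VALID_PIPELINES}")
    return [
        sys.executable, "scripts/run_pipeline.py", framework,
        "-c", config_path,
        "--pipeline", pipeline,
        *extra_args,
    ]


cmd = build_command("torch", "ml3d/configs/randlanet_semantickitti.yml",
                    "SemanticSegmentation")
# Print the command in a form that can be pasted into a shell
print(shlex.join(cmd))
# To launch it from the repository root: subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting problems when paths contain spaces.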

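Repository structure

The core of Open3D-ML lives in the ml3d subdirectory, which is integrated into Open3D under the ml namespace. The repository also contains the examples and scripts directories, which provide example programs and starter scripts:

├─ docs                   # Markdown and rst files for the documentation
├─ examples               # Example scripts and notebooks
├─ ml3d                   # Main implementation, integrated into open3d
│    ├─ configs           # Config files
│    ├─ datasets          # Generic dataset code
│    ├─ metrics           # Metrics for evaluating models
│    ├─ utils             # Framework-independent utilities
│    ├─ vis               # Visualization functions
│    ├─ tf                # TensorFlow-specific code
│    └─ torch             # PyTorch-specific code, split into dataloaders, models, modules, pipelines, and utils
└─ scripts                # Example training scripts and dataset download scripts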