glpn-nyu

Project Introduction: GLPN-NYU

Background

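GLPN (Global-Local Path Networks) is a model for monocular depth estimation. Monocular depth estimation predicts the distance from the camera to the objects in a scene using a single image, which matters for applications such as autonomous driving and augmented reality. This checkpoint is fine-tuned on the NYUv2 dataset. The model was introduced by Kim et al. in the paper "Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth" and first released on GitHub.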

Model description

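GLPN uses SegFormer as its backbone and adds a lightweight head on top for depth estimation. The core idea is to combine global and local information to improve the accuracy of the predicted depth. The figure below shows the model architecture: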

[Figure: model architecture]

Intended uses and limitations

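You can use the model as released for monocular depth estimation on input images. Versions fine-tuned for other tasks can be found by searching the Hugging Face model hub.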

How to use

The following short example shows how to use the model for monocular depth estimation:

from transformers import GLPNImageProcessor, GLPNForDepthEstimation
import torch
import numpy as np
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = GLPNImageProcessor.from_pretrained("vinvino02/glpn-nyu")
model = GLPNForDepthEstimation.from_pretrained("vinvino02/glpn-nyu")

# Prepare the image as model input (resizing, normalization, tensor conversion)
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    predicted_depth = outputs.predicted_depth

# Interpolate to the original image size; PIL reports (width, height), so reverse it to (height, width)
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],
    mode="bicubic",
    align_corners=False,
)

# Visualize the prediction as an 8-bit grayscale image
output = prediction.squeeze().cpu().numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)
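
The scaling step in the example above divides by `np.max(output)`, which would fail on an all-zero map. As a minimal sketch, the same normalization can be wrapped in a small helper with a guard for that case. The helper name `depth_to_uint8` and the sample values are hypothetical and are not part of the original model card or of the `transformers` API.

```python
import numpy as np


def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Scale a non-negative depth map to the 0-255 range for display."""
    depth = np.asarray(depth, dtype=np.float32)
    max_value = float(depth.max())
    if max_value <= 0.0:
        # Assumed guard: an all-zero map would otherwise divide by zero.
        return np.zeros(depth.shape, dtype=np.uint8)
    # Same formula as the example above: output * 255 / max(output).
    return (depth * 255.0 / max_value).astype(np.uint8)


# Stand-in for `prediction.squeeze().cpu().numpy()`; the values are made up.
fake_depth = np.array([[0.5, 1.0], [2.0, 4.0]], dtype=np.float32)
gray = depth_to_uint8(fake_depth)
```

The resulting array can be passed to `Image.fromarray` exactly as `formatted` is in the example. Because each map is scaled by its own maximum, the gray levels show relative depth within one image and are not comparable across images.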

For more code examples, refer to the documentation.

References

To cite this work, you can use the following BibTeX entry:

@article{DBLP:journals/corr/abs-2201-07436,
  author    = {Doyeon Kim and
               Woonghyun Ga and
               Pyunghwan Ahn and
               Donggyu Joo and
               Sehwan Chun and
               Junmo Kim},
  title     = {Global-Local Path Networks for Monocular Depth Estimation with Vertical
               CutDepth},
  journal   = {CoRR},
  volume    = {abs/2201.07436},
  year      = {2022},
  url       = {https://arxiv.org/abs/2201.07436},
  eprinttype = {arXiv},
  eprint    = {2201.07436},
  timestamp = {Fri, 21 Jan 2022 13:57:15 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2201-07436.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}