IPAdapter implementation for :hugs: Diffusers

This is an alternative implementation of the IPAdapter models for Huggingface Diffusers. The main differences from the official repository are:

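Supports multiple input images (instead of just one)
Supports weighting of the input images
Supports negative input images (sending a noisy negative image arguably gives better results)
Shorter code that is easier to maintain
Streamlined workflow with a single main class (IPAdapter) instead of one class per model (base, sdxl, plus, ...)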

I have also developed a ComfyUI extension that supports the same features and more.

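Installation

It works in any standard diffusers environment and does not require any specific library. For the sake of completeness I have included a requirements.txt file that you can use to create a vanilla Python environment (for CUDA).

The examples cover most use cases and should be self-explanatory. To run all of them, rename config.py.sample to config.py and fill in your model paths.

The IPAdapter models can be found on Huggingface.

Remember that the SDXL vit-h models require the SD1.5 image encoder (even though the base model is SDXL).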

Example

A basic example looks like this:

from diffusers import StableDiffusionPipeline, DDIMScheduler

import torch
from PIL import Image

import config as cfg
from ip_adapter.ip_adapter import IPAdapter

device = "cuda"

# Load the base SD1.5 checkpoint in half precision and swap in the DDIM scheduler.
pipe = StableDiffusionPipeline.from_single_file("path/to/model", torch_dtype=torch.float16)
pipe.safety_checker = None
pipe.feature_extractor = None
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.to(device)

# Reference image that conditions the generation.
image1 = Image.open("reference_image.jpg")

# A single IPAdapter class wraps the pipeline, the adapter weights and the image encoder.
ip_adapter = IPAdapter(pipe, "ipadapter/model/path", "image/encoder/path", device=device)

# Build prompt embeddings from the reference image and the text prompts.
prompt_embeds, negative_prompt_embeds = ip_adapter.get_prompt_embeds(
    image1,
    prompt="positive prompt",
    negative_prompt="blurry,",
)

# Fixed seed for reproducible results.
generator = torch.Generator().manual_seed(1)

image = pipe(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    num_inference_steps=30,
    guidance_scale=6.0,
    generator=generator,
).images[0]
image.save("image.webp", lossless=True, quality=100)

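Suggestions / Recommendations

The negative prompt is important. Be sure to add at least "blurry", and possibly more (e.g. "low quality").

IPAdapter models tend to overexpose the image, so increase the number of steps and lower the guidance scale.

Sending a random-noise negative image often helps. Compare the following examples: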

Reference (no negative)	Basic noise	Mandelbrot noise
<img src="https://yellow-cdn.veclightyear.com/ab5030c0/52ed4a3f-98e0-46cf-8fd9-ddccc4eea3c1.webp" width="256" alt="No negative" />	<img src="https://yellow-cdn.veclightyear.com/ab5030c0/4e5c7997-a862-4881-b2dc-d0914c593b14.webp" width="256" alt="Basic noise" />	<img src="https://yellow-cdn.veclightyear.com/ab5030c0/df266da3-fada-4522-9fc7-8066f258b266.webp" width="256" alt="Mandelbrot noise" />

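Note: the cover image at the top of the page was generated using Mandelbrot noise.

You can experiment with other kinds of noise and negative images. If you find a cool approach, please let me know.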
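
As a starting point for such experiments, here is a minimal sketch of one way to build a plain random-noise image. It is not taken from this repository: the helper names `noise_bytes` and `make_noise_image`, the 224x224 default size and the seed handling are all assumptions made for illustration. It only covers uniform RGB noise, not the Mandelbrot variant.

```python
import random


def noise_bytes(width, height, seed=0):
    """Return width * height * 3 pseudo-random bytes (one byte per RGB channel)."""
    rng = random.Random(seed)  # seeded so the same noise can be reproduced
    return bytes(rng.getrandbits(8) for _ in range(width * height * 3))


def make_noise_image(width=224, height=224, seed=0):
    """Wrap the raw bytes in an RGB PIL image (hypothetical helper).

    Pillow is imported lazily so that noise_bytes stays usable without it.
    The default size is an arbitrary choice for this sketch.
    """
    from PIL import Image

    return Image.frombytes("RGB", (width, height), noise_bytes(width, height, seed))
```

The resulting image can be supplied wherever the example scripts pass a negative image. Changing `seed` gives a different noise pattern, which makes it easy to compare several negatives against the same reference image.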

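Image weights are supported, but I am sure there is a better way to weight the embeddings. Suggestions are welcome.

Remember that the images, negative images, and weights must be lists of the same size.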
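
A small guard can catch a size mismatch before anything reaches the model. The sketch below is a hypothetical helper, not part of this repository: the name `check_image_inputs` and its error messages are assumptions, and it checks only the rule stated above (three lists, equal length).

```python
def check_image_inputs(images, negative_images, weights):
    """Raise ValueError unless all three arguments are lists of equal length.

    Returns the shared length when the check passes.
    """
    named = {
        "images": images,
        "negative_images": negative_images,
        "weights": weights,
    }
    # Each argument has to be a list, even when it holds a single item.
    for name, value in named.items():
        if not isinstance(value, list):
            raise ValueError(f"{name} must be a list, got {type(value).__name__}")
    # All lists must have the same number of entries.
    lengths = {name: len(value) for name, value in named.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"list lengths differ: {lengths}")
    return len(images)
```

For example, two reference images need two negative images and two weights; passing a single weight for two images raises a `ValueError` that lists the length of each argument.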