AIGC – Page 2 – Robot 9

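May 25, 2024

Following up on the previous post, which explained the ControlNet algorithm and its code, this post collects some similar algorithms, i.e. methods for conditional control of diffusion models:

1. Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation
Project: https://pnp-diffusion.github.io/
Paper: https://arxiv.org/abs/2211.12572
Code: https://github.com/MichalGeyer/plug-and-play

This is text-driven img2img generation in which the control condition is the spatial structure of the input image. The paper builds on an analysis of the feature maps at each layer of the UNet, with two main observations: the coarse-grained decoder-layer features are determined mainly by the semantic structure of the image, and the self-attention maps align with the spatial layout of the image (similar regions correspond to the same color). The proposed scheme therefore first applies DDIM inversion to the input image, extracts the spatial (ResNet) features and the self-attention maps, and then substitutes them into the denoising process that generates the output image. The whole scheme requires no training and can be viewed as a "plug-and-play" reuse of the input image's UNet features.

2. Sketch-Guided Text-to-Image Diffusion Models
Project: https://sketch-guided-diffusion.github.io/
Paper: https://arxiv.org/abs/2211.13752

Sketch-guided… Continue reading "ControlNet-Related Algorithms Roundup"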
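To make the substitution step of the Plug-and-Play scheme in item 1 more concrete, here is a minimal toy sketch in pure Python. It is not the authors' code: the function names (`attention_map`, `apply_attention`, `self_attention`) and the tiny token lists are hypothetical, the toy layer uses Q = K = V, and only the self-attention map is injected (the ResNet feature substitution and the DDIM inversion that would produce the source features are omitted). The idea shown is simply that the attention map recorded from the source pass replaces the one the target pass would have computed, while the values still come from the target.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of attention scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_map(q, k):
    # Scaled dot-product attention weights, one row per query token.
    scale = math.sqrt(len(q[0]))
    return [
        softmax([sum(a * b for a, b in zip(qi, kj)) / scale for kj in k])
        for qi in q
    ]

def apply_attention(attn, v):
    # Weighted sum of value vectors for each query token.
    dim = len(v[0])
    return [
        [sum(w * vj[c] for w, vj in zip(row, v)) for c in range(dim)]
        for row in attn
    ]

def self_attention(x, injected_map=None):
    # Toy self-attention with Q = K = V = x (an assumption for brevity).
    # If injected_map is given, it is used instead of the layer's own map.
    attn = injected_map if injected_map is not None else attention_map(x, x)
    return apply_attention(attn, x), attn

# Hypothetical token features: 3 spatial positions, 2 channels each.
source_tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # stands in for the inverted input image
target_tokens = [[0.2, 0.8], [0.5, 0.5], [0.7, 0.3]]  # stands in for the image being generated

# Source pass: record the self-attention map.
_, source_map = self_attention(source_tokens)

# Target pass without and with injection of the recorded map.
plain_out, plain_map = self_attention(target_tokens)
guided_out, guided_map = self_attention(target_tokens, injected_map=source_map)
```

In the guided pass the spatial mixing pattern comes from the source while the content being mixed comes from the target, which is the sense in which the layout is carried over without any training. In a real diffusion UNet this would be applied inside selected layers at selected denoising steps.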