Dreamguider

Abstract

Diffusion models have emerged as a formidable tool for training-free conditional generation. However, a key hurdle in inference-time guidance techniques is the need for compute-heavy backpropagation through the diffusion network for estimating the guidance direction. Moreover, these techniques often require handcrafted parameter tuning on a case-by-case basis. Although some recent works have introduced minimal compute methods for linear inverse problems, a generic lightweight guidance solution to both linear and non-linear guidance problems is still missing.

To this end, we propose Dreamguider, a method that enables inference-time guidance without the compute-heavy backpropagation through the diffusion network. The key idea is to approximate the guidance direction with respect to the current sample, thereby removing the backpropagation operation. Moreover, we propose an empirical guidance scale that works for a wide variety of tasks, hence removing the need for handcrafted parameter tuning. We further introduce an effective lightweight augmentation strategy that significantly boosts the performance during inference-time guidance. We present experiments using Dreamguider on multiple linear and non-linear tasks across multiple datasets and models to show the effectiveness of the proposed modules.

Functionality of Dreamguider

Table illustrating the capabilities of Dreamguider over existing methods performing inference-time guidance

Our Approach

An illustration of the difference between existing methods and our method. Existing works backpropagate through the diffusion network to perform guidance at each timestep, whereas we compute the gradients with respect to the MMSE estimate and the predicted noise, thereby bypassing the expensive backpropagation operation.
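To make the idea concrete, the following is a minimal sketch of backpropagation-free guidance for a linear measurement model. It is not the exact Dreamguider update. It assumes the standard DDPM relation between the noisy sample, the predicted noise, and the MMSE estimate, a quadratic loss 0.5 * ||A x0_hat - y||^2, and it simplifies further by treating the predicted noise as a constant with respect to the current sample. The names `mmse_estimate`, `backprop_free_guidance`, `A`, `y`, and `alpha_bar_t` are illustrative and do not come from the paper.

```python
import numpy as np


def mmse_estimate(x_t, eps_pred, alpha_bar_t):
    """Estimate the clean sample from the noisy sample and the predicted noise.

    Assumes the usual DDPM forward process
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    """
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)


def backprop_free_guidance(x_t, eps_pred, alpha_bar_t, A, y):
    """Approximate gradient of 0.5 * ||A x0_hat - y||^2 with respect to x_t.

    Assumption (a simplification, not the paper's exact rule): eps_pred is
    held fixed, so the Jacobian of the diffusion network is never evaluated
    and no backward pass through the network is required.
    """
    x0_hat = mmse_estimate(x_t, eps_pred, alpha_bar_t)
    # Gradient of the loss with respect to the MMSE estimate (closed form).
    grad_x0 = A.T @ (A @ x0_hat - y)
    # Chain rule through the MMSE estimate only: d x0_hat / d x_t = 1 / sqrt(alpha_bar_t).
    return grad_x0 / np.sqrt(alpha_bar_t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_t = rng.normal(size=8)        # current noisy sample (toy vector)
    eps_pred = rng.normal(size=8)   # stand-in for the network's noise prediction
    A = rng.normal(size=(3, 8))     # hypothetical linear degradation operator
    y = rng.normal(size=3)          # hypothetical measurement
    direction = backprop_free_guidance(x_t, eps_pred, 0.5, A, y)
    print(direction.shape)
```

In a real sampler, `eps_pred` would come from a single forward pass of the diffusion network, and the resulting direction would be multiplied by a guidance scale before updating the sample. For a non-linear loss, the closed-form `grad_x0` would be replaced by a gradient through the loss function alone, which is typically far cheaper than a backward pass through the diffusion network.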

Algorithm of Dreamguider

Pseudocode of the inference-time sampling process in Dreamguider.

Qualitative results

Automatically estimated parameter values for different guidance scales

Figure illustrating automatically estimated guidance scale values across different tasks and different images. Note that for the same task, different images have different estimated guidance scale values depending on their difficulty.

Results on Natural Images

Qualitative comparisons for linear inverse problems on ImageNet.

Results on Faces

Qualitative comparisons for linear inverse problems on the FFHQ dataset.

Dreamguider: Tuning Free Conditional Generation

Applications

An illustration of various applications of our method.