Our work, Unite and Conquer, was accepted at CVPR 2023. It enables plug-and-play multimodal generation using diffusion models.