Nithin Gopalakrishnan Nair

Ph.D. Student at Johns Hopkins University

I am a Research Engineer at Apple working on Multimodal LLMs. Previously, I was a PhD student in the VIU Lab at the Department of Electrical and Computer Engineering, Johns Hopkins University, advised by Dr. Vishal M. Patel. Before joining JHU, I earned my dual degree (B.Tech & M.Tech) in Electrical Engineering from the Indian Institute of Technology, Madras, where I worked with Dr. A.N. Rajagopalan at the IPCV Lab on image reconstruction.

I work on problems in Computer Vision. My research focuses on deep generative modeling with diffusion models, with an emphasis on plug-and-play architectures and efficient generation on low-compute devices.

I find the theory behind diffusion models fascinating. Fun fact: the underlying mathematics of diffusion goes back to Einstein's 1905 work on Brownian motion! Over the past three years, I have worked on image, video, and 3D generation using diffusion models. I am always open to collaborations. If you are excited about diffusion models and want to work together, please feel free to reach out.

In my free time, I enjoy reading, running, and hiking.

News

Jun 2025 My internship work at Google, Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data, was accepted at ICCV 2025. Our work scales up 3D reconstruction using synthetic data and achieves state-of-the-art results.
Dec 2024 I started a new position as a Research Intern at Google.
Jul 2024 Our work MaxFusion was accepted at ECCV 2024. MaxFusion enables training-free multimodal spatial conditioning in text-to-image diffusion models.
Apr 2024 I started my summer internship at Nvidia Research.
Jul 2023 Our work Steered Diffusion was accepted at ICCV 2023. Steered Diffusion enables zero-shot conditional sampling using pre-trained unconditional diffusion models.

Selected Publications

  1. MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models
    Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, and Vishal M Patel
    In European Conference on Computer Vision, 2024
  2. Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis
    Nithin Gopalakrishnan Nair, Anoop Cherian, Suhas Lohit, Ye Wang, Toshiaki Koike-Akino, Vishal M. Patel, and Tim K. Marks
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2023
  3. Unite and Conquer: Plug & Play Multi-Modal Synthesis Using Diffusion Models
    Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, and Vishal M Patel
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun 2023
  4. Diffuse-Denoise-Count: Accurate Crowd-Counting with Diffusion Models
    Yasiru Ranasinghe, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, and Vishal M Patel
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Mar 2024