Author(s)

Jemin Kava, Dhwanil Raval, Japan M. Mavani, Prof. Dhaval R. Chandarana

  • Manuscript ID: 140082
  • Volume: 2
  • Issue: 1
  • Pages: 277–288

Subject Area: Computer Science

Abstract

Diffusion-based generative models have recently achieved remarkable success in single image super-resolution (ISR), producing high-fidelity, perceptually convincing high-resolution images from low-resolution inputs. These models, derived from Denoising Diffusion Probabilistic Models (DDPMs) and related formulations, address limitations of earlier CNN- and GAN-based approaches by generating rich textures and fine details aligned with human visual preferences. In this paper, we review the practical applications of diffusion models in ISR, focusing on their architectures, training strategies, and performance relative to traditional methods. We discuss prominent models such as DDPM-based upscalers, the SR3 approach for iterative refinement, and subsequent improvements. Experimental results from recent studies are analyzed, comparing diffusion-driven ISR with convolutional and GAN-based methods on standard benchmarks. Diffusion models consistently excel in perceptual quality, often achieving lower Fréchet Inception Distance (FID) and Learned Perceptual Image Patch Similarity (LPIPS) scores and higher fool rates in human evaluations, despite sometimes lower PSNR/SSIM than optimized CNN/GAN methods. We include example comparisons and quantitative tables, and discuss the trade-offs in model complexity and inference speed. The paper is organized as a standard academic report, with sections covering the background, methodology, experimental evaluation, and a discussion of results and future directions in diffusion-based ISR.

Keywords