Single image super-resolution

Generator architecture

Discriminator architecture.
General framework of a GAN. Generator and Discriminator compete and make each other more accurate.

This work demonstrated the viability of ESRGAN (link) for the SD-to-8K HDR image upscaling task. Because 8K images are around 100 MB each, we devised a new data preprocessing scheme to create pairs of low-resolution (LR) and high-resolution (HR) images. Our first results showed that, while the generated images are better than their classically interpolated counterparts, they still lack many details, such as accurate colors and facial features. We therefore began training a second model with simple data augmentation techniques and a longer warm-up phase. After only 20 epochs, the benefits of these changes are already visible. Our latest model achieved an average PSNR of over 35 dB on the test set, and visual comparison confirms that it produces better images. As a future direction, one could add another data augmentation method called CutBlur and perhaps even train 3 different GANs.
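For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an average PSNR over a test set can be computed. It is not the project's evaluation code: the function names `psnr` and `average_psnr` are hypothetical, images are represented as flat sequences of pixel values for simplicity, and the peak value of 255 assumes 8-bit data (HDR content with a higher bit depth would need a different `max_value`).

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized images.

    Both images are flat sequences of pixel values. max_value is the largest
    possible pixel value (255 is an assumption for 8-bit data).
    """
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixel values.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)

    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


def average_psnr(pairs, max_value=255.0):
    """Mean of per-image PSNR values over (reference, generated) pairs."""
    scores = [psnr(ref, gen, max_value) for ref, gen in pairs]
    return sum(scores) / len(scores)
```

Averaging per-image PSNR values, as done here, is one common convention; computing a single PSNR from the pooled MSE of the whole test set gives a different number, so it is worth stating which one is reported.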

Dias Azhigulov
Master's student in Electrical and Computer Engineering

I find joy in learning about computers and related technologies at both the software and hardware levels.