OmniPSD: Layered PSD Generation with Diffusion Transformer

Cheng Liu1 , Yiren Song1 , Haofan Wang2 , Mike Zheng Shou1

1 Show Lab, National University of Singapore   |   2 Lovart AI

Given either a text prompt or a flattened poster image, OmniPSD produces layered PSD files with transparent alpha channels, separating text, foreground elements, and background into clean RGBA layers that can be edited directly in design tools. An online interactive demo is available at Lovart AI.
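
To make the relationship between RGBA layers and a flattened poster concrete, here is a minimal sketch of standard "over" alpha compositing in plain Python. It is not OmniPSD code: the function names `over` and `flatten`, the bottom-to-top layer order, and the tiny 1×2 example poster are all assumptions made for illustration.

```python
# Illustrative sketch only: standard Porter-Duff "over" compositing,
# not OmniPSD's implementation. Pixels are (r, g, b, a) tuples holding
# straight (non-premultiplied) values in [0, 1].

def over(top, bottom):
    """Composite one straight-alpha RGBA pixel over another."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # fully transparent result

    def blend(t, b):
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers):
    """Flatten equally sized RGBA layers, given bottom-to-top, into one image."""
    height, width = len(layers[0]), len(layers[0][0])
    result = [[(0.0, 0.0, 0.0, 0.0)] * width for _ in range(height)]
    for layer in layers:  # each later layer is placed on top of the result so far
        result = [
            [over(layer[y][x], result[y][x]) for x in range(width)]
            for y in range(height)
        ]
    return result


# Hypothetical 1x2 poster: opaque white background, a half-transparent red
# foreground element on the left pixel, opaque blue text on the right pixel.
background = [[(1.0, 1.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)]]
foreground = [[(1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 0.0, 0.0)]]
text = [[(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 1.0, 1.0)]]

poster = flatten([background, foreground, text])
print(poster)  # [[(1.0, 0.5, 0.5, 1.0), (0.0, 0.0, 1.0, 1.0)]]
```

Flattening discards the per-layer alpha, which is why recovering editable layers from a single poster image is a decomposition problem rather than a simple inverse.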

OmniPSD method overview

OmniPSD: a unified Diffusion-Transformer with a shared RGBA-VAE supports text-to-PSD layered generation and image-to-PSD decomposition, producing fully editable PSD layers with transparent alpha channels.

Image-to-PSD reconstruction uses a diffusion-based flow-matching model to iteratively decompose a flattened poster into editable layers.
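
As a rough illustration of what iterative sampling with a flow-matching model involves, the sketch below integrates a velocity field from a starting point toward a target with fixed Euler steps. Everything here is an assumption for illustration: `euler_sample` and `oracle_velocity` are hypothetical names, the closed-form velocity stands in for a learned network, and OmniPSD's actual sampler, conditioning, and step count are not described on this page.

```python
# Illustrative sketch only: Euler integration of a flow-matching ODE
# dx/dt = v(x, t) from t = 0 to t = 1. A real model would replace
# oracle_velocity with a trained, conditioned network.

def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) over [0, 1] with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt  # t stays strictly below 1
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy target standing in for clean latent values (hypothetical numbers).
target = [0.25, 0.5, 0.75]


def oracle_velocity(x, t):
    """Velocity of the straight-line path x_t = (1 - t) * x0 + t * x1.

    Along that path dx/dt = x1 - x0, which equals (x1 - x_t) / (1 - t).
    """
    return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]


noise = [0.9, -0.3, 0.1]  # stand-in for the initial noise sample
sample = euler_sample(oracle_velocity, noise, num_steps=4)
```

With this oracle velocity the trajectory is a straight line, so a handful of steps already lands on the target; a learned velocity field is only approximate and typically needs more steps.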

Text-to-PSD synthesis leverages a 2×2 in-context RGBA grid and hierarchical captions within a unified diffusion transformer to jointly generate editable layers.
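
The 2×2 grid idea can be sketched as packing four equally sized images into one canvas and cropping them back out afterwards. This is a hedged illustration, not the authors' code: `tile_2x2` and `split_2x2` are hypothetical helpers, and the assignment of panels to quadrants below is an assumption, since the page does not state which layer goes where.

```python
# Illustrative sketch only: pack four equally sized images (lists of pixel
# rows) into a single 2x2 grid and split the grid back into four panels.

def tile_2x2(top_left, top_right, bottom_left, bottom_right):
    """Concatenate four same-sized images into one 2x2 grid image."""
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


def split_2x2(grid):
    """Crop a 2x2 grid image back into its four quadrants."""
    h, w = len(grid) // 2, len(grid[0]) // 2

    def crop(y0, x0):
        return [row[x0:x0 + w] for row in grid[y0:y0 + h]]

    return crop(0, 0), crop(0, w), crop(h, 0), crop(h, w)


# Hypothetical 2x2-pixel panels; string labels stand in for RGBA pixels.
# The quadrant order used here is assumed, not taken from the paper.
composite = [["C", "C"], ["C", "C"]]
foreground = [["F", "F"], ["F", "F"]]
text = [["T", "T"], ["T", "T"]]
background = [["B", "B"], ["B", "B"]]

grid = tile_2x2(composite, foreground, text, background)
panels = split_2x2(grid)
```

Placing all panels on one canvas lets a single generation pass attend across them, which is one plausible reading of "in-context" here.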

OmniPSD delivers sharper reconstructions and more cleanly separated text, foreground, and background layers, while better preserving the original layout and colors.

OmniPSD produces more coherent foreground/background layering and better-aligned editable text layers for PSD export.

BibTeX

@article{Liu2025OmniPSD,
  title         = {OmniPSD: Layered PSD Generation with Diffusion Transformer},
  author        = {Liu, Cheng and Song, Yiren and Wang, Haofan and Shou, Mike Zheng},
  journal       = {arXiv preprint arXiv:2512.09247},
  year          = {2025},
  archivePrefix = {arXiv},
  eprint        = {2512.09247},
  primaryClass  = {cs.CV},
  doi           = {10.48550/arXiv.2512.09247},
  url           = {https://arxiv.org/abs/2512.09247}
}