Single Cross-domain Semantic Guidance Network for Multimodal Unsupervised Image Translation

Jiaying Lan, Lianglun Cheng, Guoheng Huang, Chi Man Pun, Xiaochen Yuan, Shangyu Lai, Hong Rui Liu, Wing Kuen Ling

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Multimodal image-to-image translation has received great attention due to its flexibility and practicality. Existing methods lack a general, effective style representation and cannot capture different levels of stylistic semantic information from cross-domain images. They also ignore parallelism in cross-domain image generation, and each of their generators can handle only specific domains. To address these issues, we propose a novel Single Cross-domain Semantic Guidance Network (SCSG-Net) for coarse-to-fine, semantically controllable multimodal image translation. Images from different domains are mapped to a unified visual semantic latent space by a dual sparse feature pyramid encoder. The generative module then produces the result images by extracting a semantic style representation from the input images in a self-supervised manner, guided by adaptive discrimination. In particular, SCSG-Net meets the needs of users across different styles and diverse scenarios. Extensive experiments on different benchmark datasets show that our method can outperform other state-of-the-art methods both quantitatively and qualitatively.

Original language: English
Title of host publication: MultiMedia Modeling - 29th International Conference, MMM 2023, Proceedings
Editors: Duc-Tien Dang-Nguyen, Cathal Gurrin, Alan F. Smeaton, Martha Larson, Stevan Rudinac, Minh-Son Dao, Christoph Trattner, Phoebe Chen
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 165-177
Number of pages: 13
ISBN (Print): 9783031270765
Publication status: Published - 2023
Event: 29th International Conference on MultiMedia Modeling, MMM 2023 - Bergen, Norway
Duration: 9 Jan 2023 to 12 Jan 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13833 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on MultiMedia Modeling, MMM 2023
Country/Territory: Norway
City: Bergen
Period: 9/01/23 to 12/01/23

Keywords

  • Multimodal image translation
  • Semantic guidance
  • Unsupervised learning
