TY - JOUR
T1 - Progressive normalizing flow with learnable spectrum transform for style transfer
AU - He, Zixuan
AU - Huang, Guoheng
AU - Yuan, Xiaochen
AU - Zhong, Guo
AU - Pun, Chi Man
AU - Zeng, Yiwen
N1 - Publisher Copyright:
© 2023 Elsevier B.V.
PY - 2024/1/25
Y1 - 2024/1/25
N2 - Most current style transfer models are designed as encoder–decoder structures. Some encoding operations, such as downsampling and pooling, cause a loss of image details. If the encoder and decoder are not compatible, it can also introduce distortion. Reversible neural networks have demonstrated their superior power in lossless projection. However, since the inputs and outputs of neural flows are holistic features, merely the high-level features can be utilized for image generation through reverse inference. These high-level features emphasize the image style more, leading to the generated results easily losing content details and producing abstract colors. To address the above issues, we propose LSTFlow, the first progressive reversible neural network capable of feature decomposition. First, LSTFlow incorporates our proposed reversible Learnable Spectrum Transform (LST), which can dynamically decompose the feature into feature spectrum and recover them losslessly. LSTFlow can retain more details by enabling multi-level features to be fused in backward inference. Second, we propose a Progressive Flow Stylization Strategy (PFSS) to balance the model's emphasis between content and style and enhance the color perception. Forward inference based PFSS is carried out progressively, while the backward inference focuses on progressive generation. To demonstrate the effectiveness of our proposed method, we conducted comparative experiments with seven other state-of-the-art algorithms. The stylized effects are evaluated in terms of visual effects and quantitative indicators. The experiments show that the lightest LSTFlow performs the best in SSIM, Color Entropy, Color Uniformity and FID indicators and outperforms state-of-the-art methods.
AB - Most current style transfer models are designed as encoder–decoder structures. Some encoding operations, such as downsampling and pooling, cause a loss of image details. If the encoder and decoder are not compatible, it can also introduce distortion. Reversible neural networks have demonstrated their superior power in lossless projection. However, since the inputs and outputs of neural flows are holistic features, merely the high-level features can be utilized for image generation through reverse inference. These high-level features emphasize the image style more, leading to the generated results easily losing content details and producing abstract colors. To address the above issues, we propose LSTFlow, the first progressive reversible neural network capable of feature decomposition. First, LSTFlow incorporates our proposed reversible Learnable Spectrum Transform (LST), which can dynamically decompose the feature into feature spectrum and recover them losslessly. LSTFlow can retain more details by enabling multi-level features to be fused in backward inference. Second, we propose a Progressive Flow Stylization Strategy (PFSS) to balance the model's emphasis between content and style and enhance the color perception. Forward inference based PFSS is carried out progressively, while the backward inference focuses on progressive generation. To demonstrate the effectiveness of our proposed method, we conducted comparative experiments with seven other state-of-the-art algorithms. The stylized effects are evaluated in terms of visual effects and quantitative indicators. The experiments show that the lightest LSTFlow performs the best in SSIM, Color Entropy, Color Uniformity and FID indicators and outperforms state-of-the-art methods.
KW - Feature decomposition
KW - Feature spectrum
KW - Neural flow
KW - Progressive stylization
KW - Reversible neural network
KW - Style transfer
UR - http://www.scopus.com/inward/record.url?scp=85180752619&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2023.111277
DO - 10.1016/j.knosys.2023.111277
M3 - Article
AN - SCOPUS:85180752619
SN - 0950-7051
VL - 284
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 111277
ER -