FOPS-V: Feature-Aware Optimization and Parallel Scale Fusion for 3D Human Reconstruction in Video

Yang Huang, Guoheng Huang, Lianglun Cheng, Yejing Huo, Xuhang Chen, Xiaochen Yuan, Guo Zhong, Chi Man Pun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Video-based 3D human reconstruction, a fundamental task in computer vision, aims to accurately estimate the 3D pose and shape of the human body from video sequences. While recent methods leverage spatial and temporal feature extraction techniques, many remain limited by single-scale processing, hindering their performance in complex scenes. Additionally, challenges such as occlusion and complex poses often lead to inaccurate reconstructions. To address these limitations, we propose FOPS-V: Feature-aware Optimization and Parallel Scale Fusion for 3D Human Reconstruction in Video. Our approach comprises three key components: a Feature-Aware Optimization (FAO) block, a Parallel Scale-Aware Attention (PSAA) block, and a Normalized Feature-Aware Representation (NFAR) guided by Feature-Response Layer Normalization (FRLN). The FAO block enhances feature extraction by optimizing joint and mesh vertex representations through the fusion of image features and learned query vectors. The PSAA block performs subscale feature extraction for joint and mesh vertices and fuses multiscale feature information to improve pose and shape representations. Guided by FRLN, the NFAR addresses instability caused by variations in feature statistics within the FAO and PSAA blocks. This normalization, with an adaptable threshold, enhances robustness to noisy or outlier data, preventing performance degradation. Extensive evaluations on the 3DPW, MPI-INF-3DHP, and Human3.6M datasets demonstrate that FOPS-V outperforms state-of-the-art methods, highlighting its effectiveness for 3D human reconstruction in video.
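The abstract does not define FRLN, so the following is only an illustrative sketch of the general idea it describes: normalizing features and then applying an adjustable threshold. It borrows the pattern of filter response normalization followed by a thresholded linear unit, which is a substitute technique and not the authors' formulation. The function name `thresholded_layer_norm` and the parameters `gamma`, `beta`, `tau`, and `eps` are hypothetical and chosen for illustration.

```python
import math


def thresholded_layer_norm(features, gamma=1.0, beta=0.0, tau=0.0, eps=1e-6):
    """Illustrative only (not the paper's FRLN).

    Scales one feature vector by the inverse of its root mean square,
    applies an affine transform, then clamps every value below at tau.
    In a trained model gamma, beta, and tau would be learned parameters.
    """
    mean_sq = sum(x * x for x in features) / len(features)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [max(gamma * x * scale + beta, tau) for x in features]


if __name__ == "__main__":
    # A vector containing one large outlier; before clamping, the output
    # has roughly unit root mean square regardless of the input scale.
    print(thresholded_layer_norm([0.5, -1.0, 40.0, 0.2], tau=-0.1))
```

Because the threshold `tau` is a free parameter rather than a fixed zero, the lower bound on activations can shift during training, which is one plausible reading of the "adaptable threshold" mentioned in the abstract.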

Original language: English
Title of host publication: Neural Information Processing - 31st International Conference, ICONIP 2024, Proceedings
Editors: Mufti Mahmud, Maryam Doborjeh, Kevin Wong, Andrew Chi Sing Leung, Zohreh Doborjeh, M. Tanveer
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 180-194
Number of pages: 15
ISBN (Print): 9789819665983
DOIs
Publication status: Published - 2025
Event: 31st International Conference on Neural Information Processing, ICONIP 2024 - Auckland, New Zealand
Duration: 2 Dec 2024 to 6 Dec 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15293 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 31st International Conference on Neural Information Processing, ICONIP 2024
Country/Territory: New Zealand
City: Auckland
Period: 2/12/24 to 6/12/24

Keywords

  • 3D human pose estimation
  • Feature awareness
  • Mesh Construction
