
Attention-CARU With Texture-Temporal Network for Video Depth Estimation

Research output: Article › peer-review

Abstract

Video depth estimation has a wide range of applications, especially in robot navigation and autonomous driving. RNN-based encoder-decoder architectures are the most common approach to depth feature prediction, but recurrent operators struggle to capture large-scale global context and suffer from the long-term dependency problem, which often leads to inaccurate prediction of object depth in complex scenes. To alleviate these issues, this work introduces an attention-based texture-temporal Content-Adaptive Recurrent Unit (CARU) for depth estimation. CARU is an enhanced RNN approach that covers the texture content of a video sequence and extracts the main features from temporal frames. In addition, a combination of VGGreNet and a Transformer refines the extraction of global and local features to facilitate depth map estimation. To improve the detection of moving objects in the depth map, an advanced loss function is introduced that further penalises the depth estimation error of moving objects. Experimental tests on the KITTI datasets show that this work achieves competitive performance in depth estimation, demonstrating good accuracy and applicability to a variety of real-time scenarios.
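The abstract mentions a loss function that further penalises depth errors on moving objects but does not give its form. A minimal sketch of one plausible realisation, assuming a per-pixel L1 depth error re-weighted by a binary motion mask (the function name, `alpha` weighting factor, and mask convention are illustrative assumptions, not the paper's actual loss):

```python
import numpy as np

def motion_weighted_depth_loss(pred, target, motion_mask, alpha=2.0):
    """Hypothetical motion-aware depth loss (not the paper's exact formulation).

    pred, target: (H, W) arrays of predicted and ground-truth depth.
    motion_mask:  (H, W) binary array, 1 where a pixel belongs to a moving object.
    alpha:        assumed extra penalty factor applied to moving-object pixels.
    """
    err = np.abs(pred - target)                     # per-pixel L1 depth error
    weights = 1.0 + (alpha - 1.0) * motion_mask     # alpha on moving pixels, 1 elsewhere
    return float((weights * err).mean())            # weighted mean over the image
```

With `alpha > 1`, gradients from moving-object regions dominate the average, pushing the network to refine depth there first; `alpha = 1` recovers a plain L1 loss.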

Original language: English
Pages (from-to): 107994-108004
Number of pages: 11
Journal: IEEE Access
Volume: 13
DOIs
Publication status: Published - 2025
