Monocular Depth Estimation with Adaptive Geometric Attention

Taher Naderi, Amir Sadovnik, Jason Hayward, Hairong Qi

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

9 Scopus citations

Abstract

Single image depth estimation is an ill-posed problem. That is, it is not mathematically possible to uniquely estimate the third dimension (or depth) from a single 2D image. Hence, additional constraints need to be incorporated in order to regularize the solution space. In this paper, we explore the idea of constraining the model by taking advantage of the similarity between the RGB image and the corresponding depth map at the geometric edges of the 3D scene for more accurate depth estimation. We propose a general, lightweight adaptive geometric attention module that uses the cross-correlation between the encoder and the decoder as a measure of this similarity. More precisely, we use the cosine similarity between the local embedded features in the encoder and the decoder at each spatial point. The proposed module, along with the encoder-decoder network, is trained in an end-to-end fashion and achieves superior or competitive performance compared with other state-of-the-art methods. In addition, adding our module to the base encoder-decoder model increases the parameter count by only 0.03% (a fraction of 0.0003). Therefore, this module can be added to any base encoder-decoder network without changing its structure to address any task at hand.
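The per-point cosine similarity mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `cosine_similarity_map` and `gate_decoder_features`, the `(C, H, W)` feature layout, and the way the similarity map is used to rescale decoder features are all assumptions made for illustration, since the abstract does not specify how the similarity enters the network.

```python
import numpy as np


def cosine_similarity_map(enc, dec, eps=1e-8):
    """Cosine similarity between two (C, H, W) feature maps at each pixel.

    Returns an (H, W) array with values in [-1, 1].
    """
    if enc.shape != dec.shape:
        raise ValueError("encoder and decoder features must have the same shape")
    # Dot product and norms are taken over the channel axis only.
    dot = np.sum(enc * dec, axis=0)
    norms = np.linalg.norm(enc, axis=0) * np.linalg.norm(dec, axis=0)
    # eps guards against division by zero where a feature vector is all zeros.
    return dot / np.maximum(norms, eps)


def gate_decoder_features(dec, sim):
    """Hypothetical gating step (an assumption, not taken from the paper).

    Maps the similarity from [-1, 1] to [0, 1] and rescales every channel
    of the decoder features by that per-pixel weight.
    """
    gate = 0.5 * (sim + 1.0)
    return dec * gate[None, :, :]


# Example with random stand-in features: 16 channels on an 8x8 grid.
rng = np.random.default_rng(0)
enc_feat = rng.standard_normal((16, 8, 8))
dec_feat = rng.standard_normal((16, 8, 8))
attention = cosine_similarity_map(enc_feat, dec_feat)
gated = gate_decoder_features(dec_feat, attention)
```

The similarity computation itself has no learnable weights, which is in keeping with the small parameter overhead the abstract reports, although the actual module may include learned projections that this sketch omits.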

Original language: English
Title of host publication: Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 617-627
Number of pages: 11
ISBN (Electronic): 9781665409155
DOIs
State: Published - 2022
Externally published: Yes
Event: 22nd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022 - Waikoloa, United States
Duration: Jan 4 2022 to Jan 8 2022

Publication series

Name: Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022

Conference

Conference: 22nd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
Country/Territory: United States
City: Waikoloa
Period: 01/4/22 to 01/8/22

Keywords

  • 3D Computer Vision
  • Deep Learning
  • Autoencoders
  • GANs
  • Grouping and Shape
  • Low-level and Physics-based Vision
  • Neural Generative Models
  • Scene Understanding
  • Segmentation
  • Vision for Robotics
