A combination of monocular depth estimation and depth completion for an enhanced RGB-D sensing

Martínez Lanza, Hugo

Author

Martínez Lanza, Hugo

Other authors

Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions

Vilaplana Besler, Verónica

Publication date

2024-09-13

Abstract

Depth perception is the capability of a system to estimate the distance to objects in a 3D scene by analyzing one or more 2D images or video frames. It is critical for applications such as robotics, autonomous vehicles, and augmented reality. This thesis explores two primary approaches to depth estimation: Monocular Depth Estimation (MDE) and Depth Completion (DC). In both cases, the best-performing methods rely on Deep Learning (DL), in which neural networks learn from data rather than from hand-crafted rules. MDE estimates depth from a single RGB image, which is considered an ill-posed problem, so most methods leverage scene knowledge to solve it. This includes geometric cues, such as the relative size and position of objects within the scene, as well as texture gradients that suggest how surfaces are oriented. These methods also often rely on prior information about the typical sizes and shapes of objects. MDE can be categorized into relative depth estimation, which focuses on the depth relationships within the scene, and absolute depth estimation, which aims to provide precise distance measurements from the camera to objects. This thesis evaluates state-of-the-art MDE methods for both relative and absolute depth estimation. The evaluation requires ground truth depth, which is obtained using RGB-D cameras and Structure from Motion (SfM) techniques. In addition, a depth completion method is studied and used as a base to build a DL model that leverages relative MDE methods to enhance sparse depth data from sensors such as LiDAR. The evaluation results show that errors are reduced and highlight the potential of these methods for real-life scenarios.
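
The distinction between relative and absolute depth, and the idea of using sparse sensor measurements to anchor a relative prediction, can be illustrated with a small sketch. This is not the DL model built in the thesis: it is a plain least-squares fit of a global scale and shift, under the assumption that the relative prediction is affinely related to metric depth. Some relative MDE models predict inverse depth instead, in which case the same fit would be applied in inverse-depth space. The function names `fit_scale_shift` and `rmse` and the toy values are hypothetical.

```python
from math import sqrt


def fit_scale_shift(relative, sparse_metric):
    """Least-squares fit of metric ~ scale * relative + shift.

    `relative` holds relative depth predictions (arbitrary units).
    `sparse_metric` holds metric depth in metres, with None wherever the
    sensor returned no measurement.
    """
    pairs = [(r, m) for r, m in zip(relative, sparse_metric) if m is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid sparse measurements")
    mean_r = sum(r for r, _ in pairs) / n
    mean_m = sum(m for _, m in pairs) / n
    var_r = sum((r - mean_r) ** 2 for r, _ in pairs)
    if var_r == 0:
        raise ValueError("relative depth is constant at the sampled pixels")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in pairs)
    scale = cov / var_r
    shift = mean_m - scale * mean_r
    return scale, shift


def rmse(pred, truth):
    """Root mean squared error over pixels that have ground truth."""
    errs = [(p - t) ** 2 for p, t in zip(pred, truth) if t is not None]
    return sqrt(sum(errs) / len(errs))


# Toy flattened "image" of six pixels (made-up values for illustration).
relative = [0.1, 0.2, 0.4, 0.5, 0.8, 1.0]        # relative MDE output
ground_truth = [1.5, 2.0, 3.0, 3.5, 5.0, 6.0]    # dense metric depth (m)
sparse = [1.5, None, None, 3.5, None, 6.0]       # LiDAR-like sparse samples

scale, shift = fit_scale_shift(relative, sparse)
dense_metric = [scale * r + shift for r in relative]
error = rmse(dense_metric, ground_truth)
```

In this toy case the relative values were constructed to be an exact affine function of the ground truth, so three sparse samples are enough to recover dense metric depth. Real predictions deviate from this assumption, which is one reason a learned depth completion model can be preferable to a single global fit.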

Document type

Master thesis

Language

English

Subjects and keywords

UPC subject areas::Telecommunications engineering::Signal processing; Deep learning; Image processing; Computer vision; Depth; Monocular; DeepLearning; Completion; RGB-D; Lidar; Sensing

Published by

Universitat Politècnica de Catalunya

Rights

Distribution of the work is authorized under the Creative Commons 'Attribution-NonCommercial-NoDerivatives' license or a similar one

Open Access

This item appears in the following collection(s)

Treballs acadèmics [82539]
