A combination of monocular depth estimation and depth completion for an enhanced RGB-D sensing

dc.contributor
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions
dc.contributor
Vilaplana Besler, Verónica
dc.contributor.author
Martínez Lanza, Hugo
dc.date.issued
2024-09-13
dc.identifier
https://hdl.handle.net/2117/424603
dc.identifier
ETSETB-230.188735
dc.description.abstract
Depth perception is the capability of a system to estimate the distance of objects in a 3D scene by analyzing one or more 2D images or video frames. It is critical for applications such as robotics, autonomous vehicles, and augmented reality. This thesis explores two primary approaches to depth estimation: Monocular Depth Estimation (MDE) and Depth Completion (DC). In both cases, the best-performing methods leverage Deep Learning (DL). Deep learning models are neural networks that learn their features from data rather than from hand-designed rules. MDE involves estimating depth from a single RGB image; most methods leverage scene knowledge to solve what is considered an ill-posed problem. This includes using geometric cues, such as the relative size and position of objects within the scene, as well as texture gradients that can suggest how surfaces are oriented. Additionally, these methods often rely on prior information about the typical sizes and shapes of objects. MDE can be categorized into relative depth estimation, which focuses on the depth relationships within the scene, and absolute depth estimation, which aims to provide precise distance measurements from the camera to objects. This thesis evaluates state-of-the-art MDE methods for both relative and absolute depth estimation. The evaluation requires ground-truth depth, which is obtained using RGB-D cameras and Structure from Motion (SfM) techniques. In addition, a depth completion method is studied and used as the basis for a DL model that leverages relative MDE methods to enhance sparse depth data from sensors such as LiDAR. The evaluation results show that errors are reduced and highlight the potential of these methods for real-life scenarios.
dc.format
application/pdf
dc.language
eng
dc.publisher
Universitat Politècnica de Catalunya
dc.rights
Distribution of the work is authorized under the Creative Commons license or similar 'Attribution-NonCommercial-NoDerivatives'
dc.rights
Open Access
dc.subject
UPC subject areas::Telecommunications engineering::Signal processing
dc.subject
Deep learning
dc.subject
Image processing
dc.subject
Computer vision
dc.subject
Depth
dc.subject
Monocular
dc.subject
DeepLearning
dc.subject
Completion
dc.subject
RGB-D
dc.subject
Lidar
dc.subject
Sensing
dc.subject
Deep learning
dc.subject
Images -- Processing
dc.subject
Computer vision
dc.title
A combination of monocular depth estimation and depth completion for an enhanced RGB-D sensing
dc.type
Master thesis

