dc.contributor |
Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial |
dc.contributor |
Moreno-Noguer, Francesc |
dc.contributor.author |
Rubio Romano, Antonio |
dc.date |
2015-10 |
dc.identifier.uri |
http://hdl.handle.net/2117/82971 |
dc.language.iso |
eng |
dc.publisher |
Universitat Politècnica de Catalunya |
dc.rights |
Attribution-NonCommercial-NoDerivs 3.0 Spain |
dc.rights |
info:eu-repo/semantics/openAccess |
dc.rights |
http://creativecommons.org/licenses/by-nc-nd/3.0/es/ |
dc.subject |
Àrees temàtiques de la UPC::Informàtica |
dc.subject |
Neural networks (Computer science) |
dc.subject |
Three-dimensional display systems |
dc.subject |
Xarxes neuronals (Informàtica) |
dc.subject |
Visualització tridimensional (Informàtica) |
dc.title |
3D Pose Estimation Using Convolutional Neural Networks |
dc.type |
info:eu-repo/semantics/masterThesis |
dc.description.abstract |
This Master Thesis describes a new pose estimation method based on Convolutional
Neural Networks (CNNs). The method divides the three-dimensional space into several regions
and, given an input image, returns the region where the camera is located.
The first step is to create synthetic images of the object by simulating a camera located at
different points around it. The CNN is pre-trained with thousands of these synthetic images
of the object model.
Then, we compute the pose of the object in hundreds of real images and apply transfer
learning with these labeled real images to the existing CNN, in order to refine the weights
of the neurons and improve the network's performance on real input images.
Along with this deep learning approach, other techniques have been used in an attempt to improve
the quality of the results, such as the classical sliding-window approach and a more recent class-generic
object detector called objectness.
The method is tested with a 2D model in order to ease the labeling process for the real images.
This document outlines all the steps followed to create and test the method, and finally
compares it against a state-of-the-art method at different scales and levels of blurring. |