Paper presented at the First International Conference on Digital Access to Textual Cultural Heritage, held 19-20 May 2014 in Madrid
In this paper we present a web-based crowdsourcing application for extracting information from images of handwritten demographic documents. The proposed application integrates two points of view: the semantic information needed for demographic research, and the ground truth needed for document analysis research. Specifically, the application has a contents view, where the information is recorded in forms, and a labeling view, which holds the word labels used to evaluate document analysis techniques. The crowdsourcing architecture makes it possible to accelerate information extraction (many users can work simultaneously), validate the information, and easily provide feedback to the users. Finally, we show how the proposed application can be extended to other kinds of historical demographic manuscripts.
English
Crowdsourcing; Ground-truth generation; Historical Documents; Document Image Analysis
Ministerio de Economía y Competitividad TIN2012-37475-C02-02
European Commission 269796
Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage (DATeCH) ; 2014, p. 103-108
open access
This material is protected by copyright and/or related rights. You may use this material as permitted by the copyright and related-rights legislation that applies to your case. For other uses, you must obtain permission from the rights holder(s).
https://rightsstatements.org/vocab/InC/1.0/