Recovering accuracy methods for scalable consistency library

dc.contributor.author
Lladós Segura, Jordi
dc.contributor.author
Guirado Fernández, Fernando
dc.contributor.author
Cores Prado, Fernando
dc.contributor.author
Lérida Monsó, Josep Lluís
dc.contributor.author
Notredame, Cedric
dc.date.accessioned
2024-12-05T22:06:36Z
dc.date.available
2024-12-05T22:06:36Z
dc.date.issued
2015-09-23T14:02:33Z
dc.date.issued
2015-05-01
dc.identifier
https://doi.org/10.1007/s11227-014-1362-z
dc.identifier
0920-8542
dc.identifier
http://hdl.handle.net/10459.1/48751
dc.identifier.uri
http://hdl.handle.net/10459.1/48751
dc.description.abstract
Multiple sequence alignment (MSA) is crucial for high-throughput next-generation sequencing applications, which require large-scale alignments of thousands of sequences. However, the alignment quality of current MSA tools decreases sharply when the number of sequences grows to several thousand. This accuracy degradation can be mitigated by using global consistency information, as in the T-Coffee MSA tool, which implements a consistency library. However, consistency-based methods do not scale well because the computational resources required to calculate and store the consistency information grow quadratically. In this paper, we propose an alternative method for building the consistency library. To allow unlimited scalability, some consistency information must be discarded to avoid exceeding the memory available in the environment. Our first approach deals with the memory limitation by identifying the most important entries, those that provide better consistency. This method achieves scalability, although with a negative impact on accuracy. The second proposal aims to reduce this degradation of accuracy, presenting three different methods to attain a better alignment.
dc.description.abstract
This work has been supported by the Government of Spain (TIN2011-28689-C02-02). Cedric Notredame is funded by the Plan Nacional (BFU2011-28575) and the Quantomics project (KBBE-2A-222664).
dc.format
application/pdf
dc.language
eng
dc.publisher
Springer Verlag
dc.relation
info:eu-repo/grantAgreement/MICINN//TIN2011-28689-C02-02/ES/EJECUCION EFICIENTE DE APLICACIONES MULTIDISCIPLINARES: NUEVOS DESAFIOS EN LA ERA MULTI%2FMANY CORE/
dc.relation
info:eu-repo/grantAgreement/MICINN//BFU2011-28575/ES/NGS-COFFEE: PRODUCCION DE ALINEAMIENTOS GENOMICOS MULTIPLES MEDIANTE EL ENRIQUECIMIENTO DE SECUENCIAS DE ADN CON INFORMACION EXPERIMENTAL PROVENIENTE DE CHIP-SEQ Y RNA-SEQ/
dc.relation
Reproduction of the document published at: https://doi.org/10.1007/s11227-014-1362-z
dc.relation
Journal of Supercomputing, 2015, vol. 71, no. 5, p. 1833-1845
dc.rights
cc-by (c) Lladós Segura, Jordi et al., 2015
dc.rights
info:eu-repo/semantics/openAccess
dc.rights
http://creativecommons.org/licenses/by/3.0/es
dc.subject
Large-Scale Alignments
dc.subject
Scalability
dc.subject
Consistency
dc.subject
T-Coffee
dc.subject
Multiple Sequence Alignment
dc.subject
Llenguatges de programació
dc.subject
Informàtica
dc.subject
Arquitectures de xarxes d'ordinadors
dc.subject
Programming languages (Electronic computers)
dc.subject
Computer science
dc.subject
Computer network architectures
dc.title
Recovering accuracy methods for scalable consistency library
dc.type
info:eu-repo/semantics/article
dc.type
publishedVersion

