dc.contributor.author
Ayres, Gavin
dc.contributor.author
Munsamy, Geraldene
dc.contributor.author
Heinzinger, Michael
dc.contributor.author
Ferruz Capapey, Noelia, 1988-
dc.contributor.author
Yang, Kevin
dc.contributor.author
Bergman, Bastiaan
dc.contributor.author
Lorenz, Philipp
dc.date.accessioned
2025-09-06T10:48:07Z
dc.date.available
2025-09-06T10:48:07Z
dc.date.issued
2025-09-05T06:25:04Z
dc.identifier
Ayres G, Munsamy G, Heinzinger M, Ferruz N, Yang K, Bergman B, et al. Annotating the microbial dark matter with HiFi-NN. iScience. 2025 Apr 18;28(6):112480. DOI: 10.1016/j.isci.2025.112480
dc.identifier
http://hdl.handle.net/10230/71117
dc.identifier
http://dx.doi.org/10.1016/j.isci.2025.112480
dc.identifier.uri
https://hdl.handle.net/10230/71117
dc.description.abstract
The accurate computational annotation of protein sequences with enzymatic function remains a fundamental challenge in bioinformatics. Here, we present HiFi-NN (Hierarchically-Finetuned Nearest Neighbor search), which annotates protein sequences to the fourth level of the Enzyme Commission (EC) number with greater precision and recall than state-of-the-art deep learning methods. Furthermore, we show that this method can correctly identify the EC number of a given sequence down to lower sequence identities than BLASTp. We show that performance can be improved by increasing the diversity of the lookup set, both in sequence space and in the environments from which the sequences were sampled. We then show that we can correct specific mis-annotations in the BRENDA enzymes database, reproducing results found by others. Finally, we use HiFi-NN to annotate functional dark-matter protein sequences from NMPFamDB. Our findings pave the way for more accurate functional annotation in silico, especially for proteins from distant sequence space.
dc.format
application/pdf
dc.relation
iScience. 2025 Apr 18;28(6):112480
dc.rights
© 2025 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
dc.rights
http://creativecommons.org/licenses/by/4.0/
dc.rights
info:eu-repo/semantics/openAccess
dc.subject
Computer science
dc.title
Annotating the microbial dark matter with HiFi-NN
dc.type
info:eu-repo/semantics/article
dc.type
info:eu-repo/semantics/publishedVersion