Title:
|
LaSTUS/TALN at Complex Word Identification (CWI) 2018 Shared Task
|
Author(s):
|
AbuRa’ed, Ahmed; Saggion, Horacio
|
Abstract:
|
Paper presented at the 13th Workshop on Innovative Use of NLP for Building Educational Applications, held on 5 June 2018 in New Orleans, USA. |
Abstract:
|
This paper presents the participation of the
LaSTUS/TALN team in the Complex Word
Identification (CWI) Shared Task 2018 in the
English monolingual track. The purpose of
the task was to determine if a word in a given
sentence can be judged as complex or not by
a certain target audience. For the English
track, the task organizers provided a training and
a development dataset of 27,299 and 3,328
words, respectively, together with the sentence
in which each word occurs. The words were
judged as complex or not by 20 human evaluators,
ten of whom were native speakers. We submitted
two systems: one system models each word
to be evaluated as a numeric vector populated with
a set of lexical, semantic and contextual features
while the other system relies on a word
embedding representation and a distance metric.
We trained two separate classifiers to automatically
decide if each word is complex or
not. We submitted six runs, two for each of the
three subsets of the English monolingual CWI
track. |
Abstract:
|
This work is (partly) supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502) and by the TUNER project (TIN2015-65308-C5-5-R, MINECO/FEDER, UE). |
Subject(s):
|
-Natural language processing (Computer science) |
Rights:
|
© ACL, Creative Commons Attribution 4.0 License
http://creativecommons.org/licenses/by/4.0/ |
Document type:
|
Conference object Article - Published version |
Publisher:
|
ACL (Association for Computational Linguistics)
|