K efektivitě manuální a poloautomatické excerpce neologismů [On the efficiency of manual and semi-automatic detection of neologisms]
Publication date: May 2019
Author: Sláma, Jakub
Keywords: data collection, manual detection of neologisms, neologisms, Python, semi-automatic detection of neologisms, manuální excerpce neologismů, neologismy, poloautomatická excerpce neologismů, sběr dat
Abstract: The paper presents a simple semi-automatic neologism detection procedure: a trivial Python script processes a text file, making use of a Czech morphological tagger, and extracts all words unrecognized by the tagger as potential neologisms. The list of these candidates has to be checked by a human (hence the label semi-automatic). This method was applied to a set of texts that were also analyzed in a more traditional way, by the “reading and marking” technique (i.e. the current practice). The comparison of the two methods has revealed that the semi-automatic procedure clearly outperforms the current practice both in speed and in efficiency.
Section: Main articles
Pages: pp. 64–75
Peer review status: peer-reviewed article
License: CC BY 4.0
Citation (ISO 690)
SLÁMA, Jakub. K efektivitě manuální a poloautomatické excerpce neologismů. Naše řeč. Praha: Česká akademie císaře Františka Josefa pro vědy, slovesnost a umění, 05.2019, 102(1-2), 64-75. ISSN 0027-8203.
Also available from:
http://asjournals.lib.cas.cz/naserec/article/uuid:f02ee5c8-f959-440d-8733-2efc788a69d0
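
Illustrative sketch of the procedure summarized in the abstract: the following Python fragment shows one way the candidate-extraction step might look. It is not the author's script. The abstract does not name the morphological tagger or its interface, so the tagger is represented here by a hypothetical `is_known` callable (backed by a toy lexicon), and the names `tokenize`, `neologism_candidates`, and `TOY_LEXICON` are assumptions made for illustration only.

```python
import re
from collections import Counter
from typing import Callable, List

# Unicode letters only, optionally joined by hyphens; digits and
# punctuation are never returned as tokens.
WORD_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def tokenize(text: str) -> List[str]:
    """Split raw text into word tokens."""
    return WORD_RE.findall(text)


def neologism_candidates(text: str, is_known: Callable[[str], bool]) -> Counter:
    """Count lowercased word forms that the recognizer does not know.

    `is_known` is a stand-in (assumption) for a lookup in a real
    morphological tagger or dictionary.
    """
    counts: Counter = Counter()
    for token in tokenize(text):
        form = token.lower()
        if not is_known(form):
            counts[form] += 1
    return counts


# Toy lexicon standing in for the tagger's dictionary (hypothetical data).
TOY_LEXICON = {"to", "je", "nový", "text", "a"}


def toy_is_known(form: str) -> bool:
    return form in TOY_LEXICON


sample = "To je nový text a hejtovat je nový trend. Hejtovat!"
candidates = neologism_candidates(sample, toy_is_known)

# The candidate list still has to be reviewed by a human.
for form, freq in candidates.most_common():
    print(form, freq)
```

With this toy lexicon, the ordinary word "trend" is also reported simply because it is missing from the lexicon, which illustrates why the abstract calls for a human check of the candidate list.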