The International Conference on Computational Collective Intelligence 2021 (13th ICCCI 2021) is an international scientific conference devoted to research on computational collective intelligence. This year's edition took place in hybrid mode on September 29 to October 1, 2021, on the Greek island of Rhodes. Among the speakers was Paweł Drozda, representing Literary Technology, who gave a lecture entitled "Comprehensive Evaluation of Word Embeddings for Highly Inflectional Language". The paper was co-authored by Krzysztof Sopyła and Juliusz Lewalski from the Literacka team. The lecture concerned the problem of word embedding, i.e. transforming text into numbers, for highly inflectional languages. It was delivered online on Thursday, September 30, 2021, from 17:00 to 18:30.
Word embedding is the conversion of text into numbers, typically one numeric vector per word, so that a machine can process it. Computers learn language differently than humans do, and word embedding techniques are how they do it. Word embedding allows AI to process natural language and to find relationships and similarities between specific words.
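To make the idea of "similarity between words" concrete, here is a minimal sketch in Python. It is not taken from the paper: the three-dimensional vectors are invented for illustration (real models such as Word2Vec or fastText use hundreds of dimensions learned from text), and the helper names `cosine_similarity` and `most_similar` are hypothetical. The example uses two inflected forms of the Polish word for "cat" (kot, kota) and an unrelated word (dom, "house").

```python
import math

# Toy vectors invented for this illustration; they do not come from any trained model.
TOY_EMBEDDINGS = {
    "kot": [0.90, 0.10, 0.30],   # "cat", nominative
    "kota": [0.85, 0.15, 0.35],  # "cat", genitive/accusative (an inflected form)
    "dom": [0.10, 0.90, 0.20],   # "house"
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to `word` by cosine similarity."""
    query = embeddings[word]
    scores = {
        other: cosine_similarity(query, vector)
        for other, vector in embeddings.items()
        if other != word
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(most_similar("kot", TOY_EMBEDDINGS))
```

With these made-up vectors, the inflected form "kota" ends up closer to "kot" than "dom" does. A good embedding model for a highly inflectional language should show the same behavior on real data.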
During the talk, Paweł Drozda presented experiments aimed at choosing the best word embedding model for highly inflectional languages. The authors of the article specifically assessed the word embedding techniques available for the Polish language. The evaluation included static embeddings such as Word2Vec, GloVe and fastText, together with their training settings. In total, it covered 121 different embedding models provided by IPI PAN, OPI, Kyubyong and Facebook. The presentation used examples from the work of the Literacka research team on the intelligent BookScout.ai system for publishers and authors.