Discriminant analysis was applied to the problem of automatic stylistic classification of texts. The candidate classification factors were the trigram index (TI), the bigram index (BI), their ratio (TI/BI), the text compressibility index (Deflate), and two information indexes: the R-function (Rf), which is the ratio of order to chaos in the system, and the development function (Df). Optimal combinations of indexes for this problem were found. Compared with previous work, the quality of text classification improved significantly while the number of indexes used was reduced.
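The abstract does not define the compressibility index, so the following is only a minimal sketch of one plausible reading: the ratio of Deflate-compressed size to original size in UTF-8 bytes. The function name `deflate_index`, the compression level, and the two sample strings are illustrative assumptions, not the authors' definition or data. Note that Python's `zlib.compress` wraps the Deflate stream in a small zlib header and checksum, which slightly inflates the ratio for very short texts.

```python
import random
import zlib


def deflate_index(text: str, level: int = 9) -> float:
    """Hypothetical compressibility index: compressed bytes / original bytes.

    Assumption: the index is a simple size ratio; the source does not
    give the exact formula. Lower values mean more redundant text.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must not be empty")
    return len(zlib.compress(raw, level)) / len(raw)


# Illustrative inputs only: a highly repetitive string and a pseudo-random
# string over 32 Cyrillic letters (fixed seed for reproducibility).
repetitive = "мама мыла раму " * 200
rng = random.Random(0)
varied = "".join(chr(0x430 + rng.randrange(32)) for _ in range(3000))

print(f"repetitive: {deflate_index(repetitive):.3f}")
print(f"varied:     {deflate_index(varied):.3f}")
```

Under this reading, a value like this would be one feature in the vector passed to the discriminant classifier, alongside TI, BI, TI/BI, Rf, and Df.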
Translated title of the contribution: IMPROVING THE QUALITY OF STYLISTIC CLASSIFICATION OF RUSSIAN-LANGUAGE TEXTS BASED ON STATISTICAL INDEXES: book chapter
Original language: Russian
Title of host publication: ERGO... ПРОБЛЕМЫ МЕТОДОЛОГИИ МЕЖДИСЦИПЛИНАРНЫХ ИССЛЕДОВАНИЙ И КОМПЛЕКСНОГО ОБЕСПЕЧЕНИЯ НАУЧНО-ИССЛЕДОВАТЕЛЬСКОЙ ДЕЯТЕЛЬНОСТИ (ERGO... Problems of the Methodology of Interdisciplinary Research and Comprehensive Support of Scientific Research Activity)
Subtitle of host publication: collection of articles
Editors: П.П. Трескова
Place of Publication: Yekaterinburg
Publisher: ООО "Издательство УМЦ УПИ"
Pages: 74-84
Number of pages: 11
ISBN (Print): 978-5-8295-0848-7
Publication status: Published - 2022

    GRNTI

  • 20.19.27

ID: 42057884