dc.contributor.author: Kayalı, Nihal Zuhal
dc.contributor.author: Omurca, Sevinç İlhan
dc.date.accessioned: 2024-11-13T18:00:04Z
dc.date.available: 2024-11-13T18:00:04Z
dc.date.issued: 2024 [en_US]
dc.identifier.citation: N. Z. Kayalı, S. İlhan Omurca. (2024). Hybrid Tokenization Strategy for Turkish Abstractive Text Summarization. 2024 8th International Artificial Intelligence and Data Processing Symposium (IDAP), 11, pp. 1-10. [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.12846/1398
dc.description.abstract: Text summarization is a significant topic in natural language processing. Tokenization approaches are important in this regard because they underpin text recognition and processing. The aim of this paper is to investigate the efficiency of different tokenization approaches for summarizing Turkish texts and the impact of their combinations on summarization performance. Whitespace, ULM, BPE and WordPiece tokenization methods are combined in different ways with the pre-trained BERTurk, mT5 and mBART models on the MLSUM dataset. We evaluate the performance of every tokenization method, as well as of all possible combinations, based on the generated summaries and ROUGE scores. Our results show that combining some tokenization strategies and using them as a hybrid method significantly enhances the accuracy and consistency of the summaries. This study offers useful guidance on how to optimize models for Turkish text summarization and emphasizes the importance of selecting suitable tokenization strategies. [en_US]
dc.language.iso: eng [en_US]
dc.relation.isversionof: 10.1109/idap64064.2024.10711036 [en_US]
dc.rights: info:eu-repo/semantics/restrictedAccess [en_US]
dc.title: Hybrid Tokenization Strategy for Turkish Abstractive Text Summarization [en_US]
dc.type: conferenceObject [en_US]
dc.relation.journal: 2024 8th International Artificial Intelligence and Data Processing Symposium (IDAP) [en_US]
dc.identifier.volume: 11 [en_US]
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member [en_US]
dc.contributor.department: TAÜ, Faculty of Engineering, Department of Computer Engineering [en_US]
dc.identifier.startpage: 1 [en_US]
dc.identifier.endpage: 10 [en_US]

