Unraveling the capabilities of language models in news summarization performance evaluation and comparative study

dc.contributor.advisor: Biricik, Göksel
dc.contributor.author: Odabaşı, Abdurrahman
dc.date.accessioned: 2024-11-27T06:47:47Z
dc.date.available: 2024-11-27T06:47:47Z
dc.date.issued: 2024
dc.date.submitted: 2024-07-18
dc.department: TAÜ (en_US)
dc.description.abstract: Given the recent introduction of multiple public Large Language Models (LLMs) and the ongoing demand for improved Natural Language Processing tasks, particularly summarization, this thesis provides a comprehensive benchmarking of 20 recent LLMs on the news summarization task. The study systematically evaluates the capability and effectiveness of these models in summarizing news articles across different styles, utilizing three distinct datasets. Specifically, this study focuses on zero-shot and few-shot learning settings, employing a robust evaluation methodology that integrates automatic metrics, human evaluation, and LLM-as-a-judge. Interestingly, including demonstration examples in the few-shot learning setting did not enhance models’ performance and, in some cases, even led to worse outcomes. This issue arises mainly due to the poor quality of the gold summaries used as references, which hinders the models’ learning process and negatively impacts their performance. Furthermore, our study’s results highlight the exceptional performance of GPT-3.5 and GPT-4, which generally dominate due to their advanced capabilities. However, among the public models evaluated, certain models such as Qwen1.5-7B, SOLAR-10.7B-Instruct-v1.0, and Zephyr-7B-Beta demonstrated promising results. These models showed significant potential, positioning them as competitive alternatives to private models for the task of news summarization.
dc.identifier.citation: Odabaşı, A. (2024). Unraveling the capabilities of language models in news summarization performance evaluation and comparative study. Türk-Alman Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği, Yüksek Lisans Programı.
dc.identifier.uri: https://hdl.handle.net/20.500.12846/1416
dc.language.iso: en
dc.publisher: Türk-Alman Üniversitesi Fen Bilimleri Enstitüsü
dc.relation.publicationcategory: Thesis
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Automatic text summarization (en_US)
dc.subject: News summarization (en_US)
dc.subject: Natural language generation (en_US)
dc.subject: Generative artificial intelligence (en_US)
dc.subject: In context learning (en_US)
dc.title: Unraveling the capabilities of language models in news summarization performance evaluation and comparative study
dc.type: Master Thesis

Files

Original bundle
Name: Odabaşı, Abdurrahman.pdf
Size: 16.57 MB
Format: Adobe Portable Document Format
Description: Thesis