Application of Explainable AI to the IndoBERT model for detecting potential hoax news using Shapley Additive Explanations (SHAP)

Adikusumah, Moch Rifky Aulia (2026) Penerapan Explainable AI pada model Indobert untuk deteksi potensi berita hoaks menggunakan Shapley Additive Explanations (SHAP). Sarjana thesis, UIN Sunan Gunung Djati Bandung.

This is the latest version of this item.

1_cover.pdf (322kB)
2_abstrak.pdf (185kB)
3_skbebasplagiarism.pdf (441kB)
4_daftarisi.pdf (246kB)
5_bab1.pdf (507kB)
6_bab2.pdf (2MB), restricted to registered users only
7_bab3.pdf (872kB), restricted to registered users only
8_bab4.pdf (6MB), restricted to registered users only
9_bab5.pdf (177kB), restricted to registered users only
10_daftarpustaka.pdf (490kB), restricted to registered users only
11_lampiran.pdf (1MB), restricted to repository staff only

Abstract

The spread of hoax news in the digital era threatens information integrity, so automated detection systems need to be both accurate and transparent. Text classification with the IndoBERT model has demonstrated high performance, but its black-box nature makes the reasoning behind its decisions difficult to interpret. This study applies Explainable Artificial Intelligence (XAI), using the SHapley Additive exPlanations (SHAP) method on the IndoBERT model, to identify and quantify the contribution of linguistic features that influence the model's decisions when detecting Indonesian-language hoax news. The research follows the OSEMN (Obtain, Scrub, Explore, Model, iNterpret) framework and uses 24,740 records collected by web scraping: factual texts from news portals such as CNN, Kompas, and Tempo, and hoax texts from Turnbackhoax. In the modeling evaluation, the best result was obtained with a 70:20:10 data split ratio, which yielded an F1-Score of 99.46%. Despite these very high evaluation metrics, SHAP-based interpretation reveals that the model tends to act as a classifier of language style rather than of factual truth, and it also shows signs of mild overfitting. The model relies heavily on formal vocabulary to predict the fact class, and on hyperbolic and provocative language to recognize the hoax class. To address the black-box limitation, the SHAP interpretation results are combined with narrative explanations generated with a Rule-Based Natural Language Generation (NLG) approach and implemented in a web-based interface.
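The idea behind combining SHAP attributions with rule-based narrative explanations can be illustrated with a small sketch. The code below is not the thesis implementation: it replaces IndoBERT with a hypothetical toy scoring function (`toy_score`, with made-up token weights in `TOY_WEIGHTS`), computes exact Shapley values by enumerating token orderings instead of using the approximations that a SHAP explainer would need for a transformer, and turns the result into sentences with a simple hypothetical template (`narrate`). It assumes each token appears only once in the input.

```python
from itertools import permutations

# Hypothetical token weights standing in for a trained classifier.
# Positive values push the score toward "hoax", negative toward "fact".
TOY_WEIGHTS = {"viral": 1.5, "sebarkan": 2.0, "menurut": -1.0, "kementerian": -1.5}


def toy_score(tokens):
    """Toy hoax score: summed token weights plus one interaction term."""
    score = sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)
    if "viral" in tokens and "sebarkan" in tokens:
        score += 1.0  # interaction, so the attribution is not just the raw weights
    return score


def shapley_values(tokens, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings.

    Only feasible for a handful of tokens, since the number of orderings
    grows factorially. Assumes the tokens are unique.
    """
    contrib = {t: 0.0 for t in tokens}
    orderings = list(permutations(tokens))
    for order in orderings:
        present = []
        previous = value_fn(present)
        for token in order:
            present.append(token)
            current = value_fn(present)
            contrib[token] += current - previous
            previous = current
    return {t: c / len(orderings) for t, c in contrib.items()}


def narrate(values, min_abs=0.5):
    """Rule-based template: one sentence per token whose attribution is large enough."""
    ranked = sorted(values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    sentences = []
    for token, value in ranked:
        if abs(value) < min_abs:
            continue
        label = "hoax" if value > 0 else "fact"
        sentences.append(
            f"The word '{token}' pushes the prediction toward {label} ({value:+.2f})."
        )
    return " ".join(sentences) or "No single word dominates the prediction."


example_tokens = ["viral", "sebarkan", "menurut"]
example_values = shapley_values(example_tokens, toy_score)
print(narrate(example_values))
```

The attributions sum to the difference between the score of the full input and the score of the empty input, which is the additivity property that gives SHAP its name. In this toy case the interaction bonus is split evenly between the two tokens that trigger it.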

Item Type:	Thesis (Sarjana)
Uncontrolled Keywords:	Hoax Detection; IndoBERT; Explainable Artificial Intelligence; SHAP; Natural Language Generation
Subjects:	Technology, Applied Sciences
Divisions:	Fakultas Sains dan Teknologi > Program Studi Teknik Informatika
Depositing User:	Moch Rifky Aulia Adikusumah
Date Deposited:	22 Jun 2026 07:09
Last Modified:	22 Jun 2026 07:09
URI:	https://digilib.uinsgd.ac.id/id/eprint/132993

Available Versions of this Item

Penerapan Explainable AI pada model Indobert untuk deteksi potensi berita hoaks menggunakan Shapley Additive Explanations (SHAP). (deposited 22 Jun 2026 07:09) [Currently Displayed]
