Application of Word2Vec to the analysis of semantic relationships between words in an English translation dataset of the Qur'an

Pohan, Rahmaniyah A'laa (2025) Application of Word2Vec to the analysis of semantic relationships between words in an English translation dataset of the Qur'an. Sarjana (undergraduate) thesis, UIN Sunan Gunung Djati Bandung.

Abstract

The Qur’an contains profound meanings that are often difficult to capture fully in translation, particularly from Arabic into English. Differences in grammatical structure and semantics between the two languages can make meanings ambiguous, so an analytical approach is needed that represents semantic relationships between words more accurately. This study uses the Word2Vec model to analyze the semantic similarity of words in an English translation of the Qur’an. The dataset consists of 6,236 translated verses from Tanzil.net, with a total of 146,163 words and 11,214 unique vocabulary items, preprocessed through data cleaning, case folding, tokenization, stopword removal, and lemmatization. The model was trained with two Word2Vec architectures, CBOW and Skip-Gram, using varied vector dimensions, window sizes, and epochs. Evaluation used cosine similarity on thematically selected religious terms. The CBOW model with a vector dimension of 100 and a window size of 3 gave the best result, with a similarity score of 0.9905 for the target word prophet. The words returned as most similar to prophet are also close to it in meaning, which supports the conclusion that the learned vector representations are semantically appropriate. Word2Vec therefore proved effective at mapping word meanings in the Qur’an translation text and has the potential to support semantic analysis in Islamic studies.
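The pipeline summarized in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the thesis's implementation: the stopword list, the sample sentence, the toy vectors, and all function names are assumptions made for illustration, and lemmatization is omitted. The sketch shows case folding, cleaning, tokenization, and stopword removal, how CBOW and Skip-Gram frame their training examples for a given window size, and how cosine similarity is computed between two word vectors.

```python
import math
import re

# Tiny illustrative stopword list (assumption; the thesis does not list its stopwords).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "who", "for"}


def preprocess(verse):
    """Case folding, cleaning, tokenization, and stopword removal (no lemmatization)."""
    lowered = verse.lower()                      # case folding
    cleaned = re.sub(r"[^a-z\s]", " ", lowered)  # data cleaning: keep letters only
    tokens = cleaned.split()                     # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]


def skipgram_pairs(tokens, window):
    """(center, context) pairs: Skip-Gram predicts each context word from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def cbow_examples(tokens, window):
    """(context_words, center) examples: CBOW predicts the center from its context."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up sample sentence, not a quotation from the dataset.
tokens = preprocess("The Prophet is a messenger of mercy.")
print(tokens)
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1))

# Toy 3-dimensional vectors standing in for learned embeddings (assumption).
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In practice, the training step itself would usually be delegated to a library. For example, gensim's `Word2Vec` class accepts `vector_size`, `window`, `epochs`, and `sg` (0 for CBOW, 1 for Skip-Gram). The abstract does not state which library the thesis used, so that choice is also an assumption.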

Item Type: Thesis (Sarjana)
Uncontrolled Keywords: Word2Vec; semantic similarity; Al-Quran; cosine similarity; CBOW; Skip-Gram
Subjects: Al-Qur'an (Al Qur'an, Alquran, Quran) and Related Sciences > The Qur'an and Its Translations
Mathematics
Applied mathematics > Programming Mathematics
Divisions: Faculty of Science and Technology > Mathematics Study Program
Depositing User: Rahmaniyah A'laa Pohan
Date Deposited: 09 Sep 2025 06:56
Last Modified: 09 Sep 2025 06:56
URI: https://digilib.uinsgd.ac.id/id/eprint/118064
