Building Related Words in Indonesian and English Translation of Al-Qur’an Vocabulary Based on Distributional Similarity

  • Rahmad Geri Kurniawan Telkom University, Bandung, Indonesia
  • Moch. Arif Bijaksana Telkom University, Bandung, Indonesia

Abstract

The Qur'an is the holy book of Muslims and their main source of guidance. It consists of 114 surahs and 30 juz, and contains approximately 6,200 verses. Finding and summarizing the relationships of meaning between words in the Qur'an by hand takes a long time. Such relationships can be obtained from a dictionary, encyclopedia, or thesaurus of Al-Qur'an vocabulary, in which each word entry is linked to other words. This final project examines the semantic relatedness between words in the Qur'an in order to help find related words, using distributional similarity, in which word embeddings play a central role. The relatedness between words is measured with semantic similarity, a topic studied in Natural Language Processing (NLP). Semantic similarity is computed as the proximity of two word vectors using cosine similarity. Words are converted into vectors using fastText, which is a development of the Word2vec algorithm. The dataset consists of English and Indonesian translations of Al-Qur'an words. These word entries are the input to the system, which produces a score representing the relatedness between words. The system output is evaluated by computing the Pearson correlation against a gold standard.
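The scoring and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors stand in for fastText embeddings, and the word pairs, gold-standard scores, and function names are hypothetical values chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention used here for zero vectors
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Pearson correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy vectors standing in for word embeddings (assumed values, not real fastText output).
toy_vectors = {
    "prayer": [0.9, 0.1, 0.3],
    "worship": [0.8, 0.2, 0.4],
    "river": [0.1, 0.9, 0.2],
}

# Hypothetical word pairs and hypothetical human-assigned gold-standard scores.
pairs = [("prayer", "worship"), ("prayer", "river"), ("worship", "river")]
gold_scores = [0.9, 0.1, 0.2]

# System score for each pair: cosine similarity of the two word vectors.
system_scores = [
    cosine_similarity(toy_vectors[a], toy_vectors[b]) for a, b in pairs
]

# Evaluation: correlation between system scores and the gold standard.
correlation = pearson(system_scores, gold_scores)
print(system_scores, correlation)
```

In a real setting the toy dictionary would be replaced by vectors from a trained embedding model, and the gold standard by human relatedness judgments for the same word pairs.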

Published
2020-06-12
Section
Articles