Building Related Words in Indonesian and English Translation of Al-Qur’an Vocabulary Based on Distributional Similarity
Abstract
The Qur'an is the holy book of Islam and the main source of guidance for Muslims. It consists of 114 surahs and 30 juz and contains approximately 6,200 verses. Finding and summarizing the semantic relationships between words in the Qur'an by hand takes a long time. Such relationships are usually obtained from a dictionary, encyclopedia, or thesaurus of Qur'anic vocabulary, in which each word entry is linked to other words. This final project examines the semantic relatedness between words in the Qur'an in order to help find related words, using distributional similarity, in which word embeddings play a central part. Word relatedness is measured with semantic similarity, a topic studied in Natural Language Processing (NLP). Semantic similarity is computed as the proximity of word vectors using cosine similarity. Words are converted into vectors using fastText, a development of the Word2vec algorithm. The dataset consists of English and Indonesian translations of the Qur'an. These translations are the input to the system, which produces a score representing the relatedness between words. The system output is evaluated by computing the Pearson correlation against a gold standard.
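The scoring and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors, the word pairs, and the gold-standard scores below are hypothetical toy values. In the described system the vectors would come from a fastText model trained on the Qur'an translations.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input; 0.0 is a convention
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy embeddings standing in for fastText word vectors.
toy_vectors = {
    "mercy": [0.9, 0.1, 0.3],
    "forgiveness": [0.8, 0.2, 0.4],
    "mountain": [0.1, 0.9, 0.2],
}

# Hypothetical word pairs with made-up gold-standard relatedness scores.
pairs = [("mercy", "forgiveness"), ("mercy", "mountain"), ("forgiveness", "mountain")]
gold_scores = [0.9, 0.2, 0.3]

# System score for each pair: cosine similarity of the two word vectors.
system_scores = [cosine_similarity(toy_vectors[a], toy_vectors[b]) for a, b in pairs]

for (a, b), score in zip(pairs, system_scores):
    print(f"{a} / {b}: {score:.3f}")
print(f"Pearson correlation with gold standard: {pearson(system_scores, gold_scores):.3f}")
```

A Pearson value close to 1 would indicate that the system ranks word pairs in much the same way as the gold standard. With only three toy pairs the number printed here has no evaluative meaning.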
Authors submitting a manuscript do so on the understanding that, if it is accepted for publication, copyright of the article shall be assigned to Jurnal Teknologi Informasi dan Terapan (J-TIT) and the Department of Information Technology, Politeknik Negeri Jember, as publisher of the journal. Copyright encompasses the rights to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations. Authors should sign a copyright transfer agreement when they have approved the final proofs sent by Jurnal Teknologi Informasi dan Terapan (J-TIT) prior to publication. The copyright transfer agreement can be downloaded here.
Jurnal Teknologi Informasi dan Terapan (J-TIT), the Department of Information Technology, Politeknik Negeri Jember, and the Editors make every effort to ensure that no wrong or misleading data, opinions, or statements are published in the journal. In any case, the contents of the articles and advertisements published in Jurnal Teknologi Informasi dan Terapan (J-TIT) are the sole responsibility of their respective authors and advertisers.
Users of this website are licensed to use materials from this website under the Creative Commons Attribution 4.0 International License. No fees are charged. Please use the materials accordingly.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.