Communications on Applied Electronics
Foundation of Computer Science (FCS), NY, USA
Volume 6 - Number 2
Year of Publication: 2016
Authors: Fatma Elghannam |
DOI: 10.5120/cae2016652430
Fatma Elghannam. Automatic Measurement of Semantic Similarity among Arabic Short Texts. Communications on Applied Electronics. 6, 2 (Nov 2016), 16-21. DOI=10.5120/cae2016652430
Documents that deal with the same topic normally share many identical words. Accordingly, similarity measures based on surface word co-occurrence have been applied successfully to measure the similarity between such documents. However, the task is not trivial for short texts that carry the same or a close meaning but use different vocabularies. To address this problem, researchers have been investigating methods for word analysis at the semantic level. We introduce a new method to measure the semantic similarity between short texts. In the proposed method, semantic distribution and lexical similarity measures are combined to determine the degree of similarity between two words. The similarity between two words is measured as the lexical similarity between their vectors of similar words, which are extracted from a corpus as second-order word vectors. The proposed method was applied to measure the semantic similarity between Arabic short texts. In the experiments, the best accuracy achieved by the proposed method was 97%, compared to 93% for the second-order distribution similarity.
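The idea of comparing two words through their vectors of similar words can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the lexical measure or the text-level aggregation, so the Dice coefficient, the best-match averaging in `text_similarity`, and the toy `neighbours` table (English placeholder tokens standing in for neighbour lists that would be extracted from an Arabic corpus) are all assumptions made for illustration.

```python
def dice(a, b):
    """Dice coefficient between two collections of words (assumed lexical measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def word_similarity(w1, w2, neighbours):
    """Similarity of two words as the lexical overlap of their similar-word lists.

    `neighbours` is a hypothetical mapping from a word to the words found to be
    distributionally similar to it in a corpus.
    """
    if w1 == w2:
        return 1.0
    return dice(neighbours.get(w1, ()), neighbours.get(w2, ()))


def text_similarity(t1, t2, neighbours):
    """Symmetric best-match average over the words of two short texts (assumed)."""
    if not t1 or not t2:
        return 0.0

    def directed(src, dst):
        # For each source word, keep its best match in the other text.
        best = [max(word_similarity(w, v, neighbours) for v in dst) for w in src]
        return sum(best) / len(src)

    return (directed(t1, t2) + directed(t2, t1)) / 2


# Toy neighbour lists; in practice these would come from corpus statistics.
neighbours = {
    "car": ["vehicle", "auto", "drive"],
    "automobile": ["vehicle", "auto", "motor"],
    "banana": ["fruit", "yellow"],
}

score = text_similarity(["fast", "car"], ["fast", "automobile"], neighbours)
```

In this sketch "car" and "automobile" never match on the surface, yet they receive a non-zero score because their neighbour lists overlap, which is the situation the abstract describes for short texts with different vocabularies.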