Communications on Applied Electronics
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 6
Year of Publication: 2015
Authors: Yamini Warke, Arti Mohanpurkar
DOI: 10.5120/cae-1565
Yamini Warke, Arti Mohanpurkar. Contraption of Suffix Array Blocking for Efficacious Record Linkage and De-duplication. Communications on Applied Electronics. 1, 6 (April 2015), 6-9. DOI=10.5120/cae-1565
Record linkage is the process of bringing together information from multiple computerized files for a common purpose. The basic methods compare name and address information across pairs of files to determine which pairs of records are associated with the same entity. An entity might be a business, a person, or some other type of unit that is listed. De-duplication is the task of identifying one or more records in a repository that represent the same object or entity. The same data may be represented in different ways across databases, which causes problems when records are matched. In recent years, various indexing techniques have been developed for record linkage and de-duplication. They are intended to reduce the number of record pairs that must be compared in the similarity matching process while maintaining high matching quality. This paper presents a suffix array blocking approach for effective record linkage and de-duplication based on different similarity measures.
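To make the idea of reducing the number of compared record pairs concrete, the following is a minimal Python sketch of suffix array blocking as it is commonly described, not the authors' implementation. The function names, the parameters `min_len` and `max_block_size`, their default values, and the sample names are all assumptions chosen for illustration. Each record's blocking key is expanded into its suffixes, records that share a suffix fall into the same block, and only pairs within sufficiently small blocks become candidates for similarity matching.

```python
from collections import defaultdict
from itertools import combinations


def suffixes(key, min_len):
    """Return all suffixes of `key` that are at least `min_len` characters long."""
    return [key[i:] for i in range(len(key) - min_len + 1)]


def suffix_array_blocking(records, min_len=4, max_block_size=3):
    """Return candidate record-id pairs that share at least one suffix.

    records: dict mapping a record id to its blocking key value (a string).
    min_len and max_block_size are illustrative tuning parameters (assumed).
    """
    # Inverted index: suffix -> ids of the records whose key has that suffix.
    index = defaultdict(set)
    for rec_id, key in records.items():
        for suf in suffixes(key.lower(), min_len):
            index[suf].add(rec_id)

    pairs = set()
    # Walk the suffixes in sorted order, as a suffix array would store them.
    for suf in sorted(index):
        block = index[suf]
        # Skip singleton blocks and blocks too large to be discriminating.
        if 2 <= len(block) <= max_block_size:
            pairs.update(combinations(sorted(block), 2))
    return pairs


# Hypothetical sample data: record id -> name used as the blocking key.
sample = {1: "catherine", 2: "katherine", 3: "smith", 4: "smyth"}
candidates = suffix_array_blocking(sample)
print(candidates)
```

With this sample, records 1 and 2 share the suffix "atherine" and form a candidate pair, while "smith" and "smyth" share no suffix of four or more characters and are never compared. A similarity measure would then be applied only to the returned candidate pairs rather than to every possible pair of records.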