Communications on Applied Electronics
Foundation of Computer Science (FCS), NY, USA
Volume 7 - Number 32
Year of Publication: 2019
Authors: Aladesote O. Isaiah, Adetunji A. Ademola
10.5120/cae2019652845
Aladesote O. Isaiah, Adetunji A. Ademola. Data Deduplication: Its Significant Effect on Network Intrusion Dataset. Communications on Applied Electronics. 7, 32 (Dec 2019), 21-26. DOI=10.5120/cae2019652845
This research applied feature extraction techniques to the NSL-KDD data set. Using deduplication software written in the C++ programming language, duplicated records of four attack types (DoS, R2L, Probing and U2R) were removed. Among the DoS attacks, Mailbomb had the highest reduction rate (98.63%) and Apache2 the lowest (40.30%). For R2L, Snmpgetattack had the highest reduction rate (92.70%), while Ftp_write showed no reduction. Among the Probing attacks, Nmap had the highest reduction rate (93.15%) and Mscan the lowest (60.84%). For U2R, the highest reduction rate was 50%, for Sqlattack. The Wilcoxon sign test was used to test the significance of the deduplication, and the results revealed that all the attack types except U2R had a significant reduction at the 5% level.
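The abstract does not reproduce the deduplication software, so the following is only a minimal C++ sketch of the idea, not the authors' implementation. It assumes that each record is held as one text line and that two records are duplicates only when the lines match exactly. The function names `deduplicate` and `reductionRate` are hypothetical, and the reduction rate is assumed to be the share of records removed, expressed as a percentage.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: keeps the first occurrence of each record and
// drops exact repeats, preserving the original order.
// Assumption: a record is one text line; duplicates are exact matches.
std::vector<std::string> deduplicate(const std::vector<std::string>& records) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> unique;
    for (const auto& record : records) {
        // insert() reports whether the record was newly added to the set.
        if (seen.insert(record).second) {
            unique.push_back(record);
        }
    }
    return unique;
}

// Hypothetical helper: percentage of records removed by deduplication,
// assumed here to be (before - after) / before * 100.
double reductionRate(std::size_t before, std::size_t after) {
    if (before == 0) {
        return 0.0;  // no records, so nothing to reduce
    }
    return 100.0 * static_cast<double>(before - after) /
           static_cast<double>(before);
}
```

Under these assumptions, an attack type with two identical records would be left with one record after `deduplicate`, and `reductionRate(2, 1)` would give 50%.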