Communications on Applied Electronics
Foundation of Computer Science (FCS), NY, USA
Volume 2 - Number 1
Year of Publication: 2015
Authors: Mohit Sewak, Sachchidanand Singh
DOI: 10.5120/cae-1651
Mohit Sewak, Sachchidanand Singh. A Reference Architecture and Road map for Enabling E-commerce on Apache Spark. Communications on Applied Electronics. 2, 1 (June 2015), 37-42. DOI=10.5120/cae-1651
Apache Spark is an execution engine that, besides working as a standalone distributed in-memory computing engine, also integrates closely with Hadoop's distributed file system (HDFS). Its underlying appeal lies in providing a unified framework for building sophisticated applications that combine multiple workloads: it unifies those workloads, handles unstructured data well, and offers easy-to-use APIs. Apache Spark also provides a streaming component, Spark Streaming, which writes streamed data into the same in-memory data structures that the Spark SQL component, running on top of the core Spark framework, can read. Through its MLlib and SparkR subprojects, Apache Spark can also provide online machine learning, so that besides processing streaming data it can execute machine-learning libraries, functions, and algorithms. This paper analyzes Apache Spark and highlights the role of Apache Spark and its ecosystem in the architecture of a modern E-commerce platform. It also aims to propose horizontally and vertically scalable reference architectures for both small and medium-sized enterprises (SMEs) and large E-commerce enterprises.