Enhanced Speed Processing of Data using in-Memory Analytics

T.Selva Divya, M.Jayanthi, V.Vinoth Kumar

Abstract


Databases are used to store large amounts of data. As data volumes grow, the data becomes difficult to process with traditional data processing applications and tools. Many organizations' information is growing today, and they need big data tools to store it. Big data tools emerged to overcome these limits on storage and processing. Hadoop is open source software that supports many applications, including petabyte-scale analytics. This paper describes how Hadoop works. Hadoop uses HDFS for data storage. HDFS distributes work across nodes and communicates with a single name server node; if the name server goes offline, HDFS must restart from where it left off, which introduces latency into the system. Spark addresses this problem of HDFS. It is a column-oriented distributed database and is more fault tolerant than HDFS. Spark is an in-memory database: query results are retrieved from RAM instead of physical disk, so its processing speed is much higher than that of the Hadoop system. MapReduce is used for data processing; it splits the work into parts and reduces the partial results into a single combined result.
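The split-then-reduce idea behind MapReduce can be illustrated with a small single-process sketch. This is not Hadoop's API and is not taken from the paper; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and word counting is chosen only as a familiar example. All data stays in ordinary in-memory Python structures.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined output."""
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Each string stands in for a block of input handled by a separate node.
    print(word_count(["big data tools", "big data"]))
```

In a real cluster the chunks would be processed on different nodes and the grouped values moved over the network; here the three phases simply run in sequence to show the data flow.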

 

Keywords: Big Data, Hadoop System, HDFS, MapReduce, Spark, Cassandra, Databases.



DOI: https://doi.org/10.26483/ijarcs.v4i11.1950

Copyright (c) 2016 International Journal of Advanced Research in Computer Science