International Journal of Leading Research Publication

E-ISSN: 2582-8010     Impact Factor: 9.56

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Monthly Scholarly International Journal

Big Data Processing in Cloud: Hadoop vs. Spark

Author(s) Neha Agrawal, Jasmeet Kaur, Chhaya Porwal, Shrishti Gupta, Puja Gupta
Country India
Abstract Many sectors now generate data daily at volumes ranging from terabytes to petabytes, and managing data at this scale requires big data technology. This technology marks a significant shift and has influenced trends in applied science. Hadoop runs MapReduce jobs in parallel across multiple nodes, which makes the analysis of massive datasets possible. Map and Reduce are the two core functions of the MapReduce framework, which processes the large volumes of data stored in HDFS. Spark was developed to address several shortcomings of MapReduce. It can handle real-time data streams and answer queries quickly. Directed acyclic graph (DAG) execution and resilient distributed datasets (RDDs) form the foundation of the Spark framework. This study compares the fundamental characteristics of the two frameworks, Hadoop and Spark, and the comparison serves as the basis for the performance assessment carried out.
Keywords Big Data, Resilient Distributed Datasets, MapReduce, Spark, DAG, HDFS, Hadoop.
Field Engineering
Published In Volume 6, Issue 12, December 2025
Published On 2025-12-14
Cite This Big Data Processing in Cloud: Hadoop vs. Spark - Neha Agrawal, Jasmeet Kaur, Chhaya Porwal, Shrishti Gupta, Puja Gupta - IJLRP Volume 6, Issue 12, December 2025.
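
The Map and Reduce functions mentioned in the abstract can be illustrated with a small word count. The sketch below is not taken from the paper and does not use the Hadoop or Spark APIs. It is a single-process, in-memory approximation in plain Python, and the names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs big tools", "spark and hadoop process big data"]
    print(word_count(sample))
```

In a real cluster the map and reduce calls would run on different nodes and the shuffle would move data over the network. Here everything runs in one process, so the sketch shows only the shape of the computation.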
