
International Journal of Leading Research Publication
E-ISSN: 2582-8010

Migrating Spark Jobs from On-Premises to GCP Cloud Dataproc
Author(s) | Suhas Hanumanthaiah |
---|---|
Country | United States |
Abstract | The growing demand for scalable, cost-efficient, and agile data processing solutions has driven organizations to migrate big data workloads from on-premises environments to cloud platforms. Apache Spark, a widely adopted distributed computing framework, plays a pivotal role in processing large-scale datasets, and its migration to the cloud has become a strategic imperative. This research paper provides a comprehensive exploration of migrating Apache Spark jobs to Google Cloud Platform (GCP) using Dataproc—a fully managed, scalable, and cost-effective service for Spark and Hadoop workloads. The study evaluates various migration strategies, including lift-and-shift, cloud-native re-architecting, and hybrid approaches, while analyzing critical factors such as resource management, job scheduling, storage integration, and configuration optimization. Emphasis is placed on performance tuning through intelligent frameworks, zero-execution configuration techniques, and reinforcement learning-based optimization, all of which significantly enhance Spark performance in the cloud. Real-world case studies from domains such as healthcare, bioinformatics, and real-time analytics illustrate practical benefits including performance gains, operational efficiency, and improved scalability. The paper concludes by offering best practices for successful migration, recommendations for production readiness, and insights into future trends such as serverless computing, AI integration, and edge convergence. These findings provide a robust foundation for enterprises planning to modernize their big data infrastructure through cloud migration. |
Keywords | Apache Spark, Google Cloud Platform (GCP), Dataproc, Cloud Migration |
Field | Engineering |
Published In | Volume 6, Issue 1, January 2025 |
Published On | 2025-01-08 |
Cite This | Migrating Spark Jobs from On-Premises to GCP Cloud Dataproc - Suhas Hanumanthaiah - IJLRP Volume 6, Issue 1, January 2025. DOI 10.70528/IJLRP.v6.i1.1695 |
DOI | https://doi.org/10.70528/IJLRP.v6.i1.1695 |
Short DOI | https://doi.org/g9v44x |

All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.
