Optimization of Sorting Algorithms for Big Data and Cloud Computing Environments

Vaibhav S. Makwana; Ekta H. Unagar; Dhaval R. Chandarana

doi:10.64643/JATIRV1I1-140030-001

Manuscript ID: 140030
Volume: 1
Issue: 1
Pages: 235–262

Subject Area: Computer Science

DOI: https://doi.org/10.64643/JATIRV1I1-140030-001

Abstract

Efficient sorting has become essential to modern computing, where data is generated at enormous speed and scale. Sorting operations underpin large-scale analytics, cloud storage, and distributed machine learning, but the scale, heterogeneity, and distributed nature of contemporary systems challenge conventional algorithms such as Quicksort, Mergesort, and Heapsort. This review surveys recent developments in sorting algorithm optimization for big data and cloud environments, tracing the evolution from traditional in-memory methods to distributed, adaptive, and hardware-accelerated approaches. Modern techniques that incorporate learned-model-based partitioning, SSD-internal computation, and framework-level innovations have achieved up to 5.31× speedup, 6× lower shuffle overhead, and 73% shorter execution times, demonstrating the value of algorithmic and architectural co-design. Persistent issues remain, including I/O bottlenecks, data skew, and hardware integration complexity. Future directions focus on AI-driven adaptivity, skew-resilient partitioning, and energy-efficient cloud-native frameworks for scalable, intelligent, and sustainable sorting in big data systems.
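
As a rough illustration of the partitioning idea mentioned in the abstract, the following minimal sketch splits keys into ranges, sorts each range independently, and concatenates the results. It uses plain sample-based splitter selection as a stand-in, not the learned-model partitioners covered by the review, and it runs in a single process rather than on a cluster. The names `choose_splitters` and `range_partition_sort`, and the parameters `num_partitions`, `sample_size`, and `seed`, are hypothetical and chosen only for this example.

```python
import bisect
import random


def choose_splitters(sample, num_partitions):
    """Pick num_partitions - 1 splitters at evenly spaced quantiles of a sample."""
    ordered = sorted(sample)
    return [ordered[(i * len(ordered)) // num_partitions]
            for i in range(1, num_partitions)]


def range_partition_sort(data, num_partitions=4, sample_size=64, seed=0):
    """Hypothetical helper: route keys to range partitions, sort each, concatenate.

    Returns the sorted list and the size of each partition, so that skew
    (uneven partition sizes) can be inspected.
    """
    if not data:
        return [], []
    rng = random.Random(seed)
    # Assumption: a small random sample approximates the key distribution.
    sample = rng.sample(data, min(sample_size, len(data)))
    splitters = choose_splitters(sample, num_partitions)

    partitions = [[] for _ in range(num_partitions)]
    for key in data:
        # bisect_right returns the index of the key range this key falls into.
        partitions[bisect.bisect_right(splitters, key)].append(key)

    # In a distributed setting each partition would be sorted on its own worker;
    # here the partitions are sorted one after another.
    sorted_parts = [sorted(part) for part in partitions]
    merged = [key for part in sorted_parts for key in part]
    return merged, [len(part) for part in partitions]
```

Because the partitions cover disjoint, ordered key ranges, concatenating the sorted partitions yields a globally sorted result without a final merge step. The returned partition sizes give a simple view of skew: a heavily repeated key pushes one partition well above the others, which is the imbalance that skew-resilient schemes aim to reduce.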

Keywords

Sorting Algorithms; Big Data; Cloud Computing; External Merge Sort; Hadoop; Apache Spark; Data Partitioning; Distributed Systems; Parallel Computing; Machine Learning Optimization; Learned Sorting Models; SSD-Based Sorting; Adaptive Sorting; Shuffle Optimization; Energy-Efficient Computing; Performance Benchmarking; Algorithm-System Co-Design