PayPal shares how they reduced Apache Spark job cloud costs by up to 70% by migrating from CPU-based Spark 2 to GPU-accelerated Spark 3 using NVIDIA's Spark RAPIDS.
•Spark RAPIDS enables transparent GPU acceleration for Spark workloads without rewriting Python/Scala/Java/SQL code, by translating operations to GPU programming languages automatically
•GPU parallelism in Spark RAPIDS is intra-task (within each task's data), unlike CPU Spark which parallelizes only at the task level
•Setting spark.sql.files.maxPartitionBytes to 2GB reduced partitions from ~185,000 to ~10,000 (20x fewer), cutting stage runtime by over 30%
•Spark 3's Adaptive Query Execution (AQE) dynamically adjusts shuffle partition sizes to keep workloads compute-bound rather than I/O or network-bound
•NVIDIA's Qualification Tool was used to identify which existing Spark jobs were best candidates for GPU migration