Announcing support for GROUP BY, SUM, and other aggregation queries in R2 SQL
2025-12-18
11 min read
by Jérôme Schneider
Endigest AI Core Summary
Cloudflare announces support for GROUP BY, SUM, and other aggregation queries in R2 SQL, its serverless analytics query engine over R2 Data Catalog.
- Aggregations are split into two phases: pre-aggregate computation on worker nodes, then a final merge at the coordinator (scatter-gather)
- Pre-aggregates allow horizontal scaling: e.g., the count(*) pre-aggregate is a partial row count, while avg(value) stores the sum and the count separately
- Scatter-gather fails for ORDER BY/HAVING on aggregates when grouping by high-cardinality columns, because local top-N results can miss the global leaders
- Shuffling solves this via deterministic hash partitioning: every worker routes rows with the same GROUP BY key to the same destination worker, chosen by the key's hash
- A synchronization barrier ensures all workers finish sending data before any worker computes final aggregates
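The points above can be illustrated with a minimal Python sketch. It is not R2 SQL's implementation (the engine itself is not shown in this summary): the names `AvgState`, `pre_aggregate`, `gather`, `destination`, and `shuffle` are hypothetical, plain lists stand in for worker nodes, and CRC32 is an assumed stand-in for whatever hash function the real engine uses.

```python
import zlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AvgState:
    """Hypothetical pre-aggregate for avg(value): keep sum and count, not the ratio."""
    total: float = 0.0
    count: int = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def merge(self, other):
        self.total += other.total
        self.count += other.count

    def finalize(self):
        return self.total / self.count


def pre_aggregate(rows):
    """Worker side: build one partial AvgState per GROUP BY key."""
    partials = defaultdict(AvgState)
    for key, value in rows:
        partials[key].update(value)
    return dict(partials)


def gather(worker_partials):
    """Coordinator side: merge the partial states, then finalize each group."""
    merged = defaultdict(AvgState)
    for partials in worker_partials:
        for key, state in partials.items():
            merged[key].merge(state)
    return {key: state.finalize() for key, state in merged.items()}


def destination(key, num_workers):
    """Deterministic routing: the same key always maps to the same worker.
    CRC32 is an assumption; Python's built-in hash() is salted per process."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def shuffle(worker_rows, num_workers):
    """Every worker routes its rows by key hash into per-destination inboxes."""
    inboxes = [[] for _ in range(num_workers)]
    for rows in worker_rows:
        for key, value in rows:
            inboxes[destination(key, num_workers)].append((key, value))
    # Returning only after all rows are routed stands in for the barrier.
    return inboxes


# Two workers, each holding a slice of the table as (group_key, value) rows.
worker_rows = [
    [("a", 10.0), ("b", 2.0)],
    [("a", 20.0), ("b", 4.0), ("c", 6.0)],
]

# Scatter-gather: partial states are merged at the coordinator.
averages = gather([pre_aggregate(rows) for rows in worker_rows])

# Shuffle: after routing, each worker owns complete groups and can finalize locally.
inboxes = shuffle(worker_rows, num_workers=2)
shuffled = {}
for inbox in inboxes:
    for key, state in pre_aggregate(inbox).items():
        shuffled[key] = state.finalize()
```

Both paths give the same averages here. The difference is where a group becomes complete: at the coordinator in scatter-gather, or on a single worker after the shuffle. The second case is what lets each worker apply ORDER BY or HAVING to its own groups without missing a global leader.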
Tags:
#R2
#Data
#Edge Computing
#Rust
#Serverless
#SQL
