Apache Arrow
Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware such as CPUs and GPUs. The format also supports zero-copy reads, so data can be accessed without serialization overhead.
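To make "columnar" and "zero-copy" concrete, here is a minimal sketch using only the Python standard library. It does not use Arrow or its API; the sample data and the names `rows`, `ids`, `prices`, and `tail` are made up for illustration. It only shows the two ideas: each field lives in its own contiguous typed buffer, and a slice of that buffer can be read without copying bytes.

```python
from array import array

# Row-oriented layout: one tuple per record (hypothetical sample data).
rows = [(1, 10.5), (2, 20.0), (3, 30.25), (4, 40.0)]

# Column-oriented layout: each field is stored in its own contiguous typed buffer.
ids = array("q", (r[0] for r in rows))      # signed 64-bit integers
prices = array("d", (r[1] for r in rows))   # 64-bit floats

# An analytic operation scans only the column it needs.
total_price = sum(prices)

# Zero-copy read: slicing a memoryview refers to the same underlying buffer.
view = memoryview(prices)
tail = view[2:]  # no bytes are copied here

# A write through the original column is visible through the slice,
# which shows that both share one buffer.
prices[3] = 99.0
shares_buffer = tail[1] == 99.0
```

Arrow applies the same principle across languages and processes by standardizing the buffer layout, which is what lets a consumer read the data in place instead of deserializing it.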
Apache Arrow Project: https://arrow.apache.org/
Sub Projects
- Apache Arrow Ballista (Arrow DataFusion documentation): “Although Ballista is largely inspired by Apache Spark, there are some key differences.”
- Apache Arrow DataFusion (Arrow DataFusion documentation)
- Apache Spark DataFusion Comet; GitHub
- Apache Arrow Flight
- Substrait: Cross-Language Serialization for Relational Algebra
See also
- roapi/roapi: Create full-fledged APIs for slowly moving datasets without writing a single line of code.
- We built a new SQL Engine on Arrow and DataFusion | Arroyo