The Auron accelerator for big data engines (e.g., Spark, Flink) leverages native vectorized execution to accelerate query processing. It combines the power of the Apache DataFusion library with the scale of distributed computing frameworks.
Auron takes a fully optimized physical plan from the distributed computing framework, maps it into DataFusion's execution plan, and performs native plan computation.
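The sketch below illustrates this plan-mapping idea in simplified form. It is not Auron's actual code: the `NativePlanNode` types and `toNativePlan` function are hypothetical stand-ins for the serialized plan messages handed to the native runtime.

```scala
// Hypothetical sketch (not Auron's real internals): walk a Spark physical plan
// and describe supported operators in a form a DataFusion-based native runtime
// could execute, keeping everything else on the JVM.
import org.apache.spark.sql.execution.{FilterExec, ProjectExec, SparkPlan}

sealed trait NativePlanNode // stand-in for the serialized (plan-serde) plan messages
case class NativeProject(exprs: Seq[String], child: NativePlanNode) extends NativePlanNode
case class NativeFilter(condition: String, child: NativePlanNode) extends NativePlanNode
case class JvmFallback(plan: SparkPlan) extends NativePlanNode

def toNativePlan(plan: SparkPlan): NativePlanNode = plan match {
  case p: ProjectExec => NativeProject(p.projectList.map(_.sql), toNativePlan(p.child))
  case f: FilterExec  => NativeFilter(f.condition.sql, toNativePlan(f.child))
  case other          => JvmFallback(other) // unsupported operator: leave it to vanilla Spark
}
```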
The key capabilities of Auron include:
- Native execution: Implemented in Rust, eliminating JVM overhead and enabling predictable performance.
- Vectorized computation: Built on Apache Arrow's columnar format, fully leveraging SIMD instructions for batch processing.
- Pluggable architecture: Seamlessly integrates with Apache Spark and is designed for future extensibility to other engines.
- Production-hardened optimizations: Multi-level memory management, compacted shuffle formats, and adaptive execution strategies developed through large-scale deployment.
Thanks to DataFusion's inherent, well-defined extensibility, Auron can be easily extended to support:
- Various object stores.
- Operators.
- Simple and Aggregate functions.
- File formats.
We encourage you to extend DataFusion's capabilities directly and add the corresponding support in Auron with simple modifications to plan-serde and extension translation (a simplified illustration follows).
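To make "extension translation" concrete, here is a loose, purely illustrative sketch of mapping a Catalyst expression to the name of a native function; the object and method names are hypothetical and are not Auron's actual API.

```scala
// Hypothetical illustration (not Auron's real API): translate a supported Catalyst
// expression into the name of the native (DataFusion) function that should evaluate
// it, returning None when no native counterpart exists.
import org.apache.spark.sql.catalyst.expressions.{Expression, Lower, Upper}

object ExtensionTranslation {
  def toNativeFunction(expr: Expression): Option[String] = expr match {
    case _: Upper => Some("upper") // provided by DataFusion's built-in functions
    case _: Lower => Some("lower")
    case _        => None          // no native counterpart: stay on the JVM
  }
}
```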
To build Auron, please follow the steps below:
- Install Rust
The native execution library is written in Rust, so a Rust (nightly) toolchain is required for compilation. We recommend installing it with rustup.
- Install JDK
Auron has been well tested on JDK 8, 11, and 17.
- Check out the source code.
- Build the project.
Use `./auron-build.sh` to build the project; run `./auron-build.sh --help` for usage details.
After the build finishes, a fat JAR containing all the dependencies is generated in the `target` directory.
You can use the following command to build a CentOS 7-compatible release:
SHIM=spark-3.3 MODE=release JAVA_VERSION=8 SCALA_VERSION=2.12 ./release-docker.sh

This section describes how to submit and configure a Spark job with Auron support.
- Move the Auron JAR to the Spark client classpath (normally `spark-xx.xx.xx/jars/`).
- Add the following configs to the Spark configuration in `spark-xx.xx.xx/conf/spark-defaults.conf`:
spark.auron.enable true
spark.sql.extensions org.apache.spark.sql.auron.AuronSparkSessionExtension
spark.shuffle.manager org.apache.spark.sql.execution.auron.shuffle.AuronShuffleManager
spark.memory.offHeap.enabled false
# suggested executor memory configuration
spark.executor.memory 4g
spark.executor.memoryOverhead 4096

- Submit a query with spark-sql, or with another tool such as spark-thriftserver (a programmatic SparkSession equivalent is sketched after this list):
spark-sql -f tpcds/q01.sql
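For jobs written as Scala/Java applications rather than submitted through spark-sql, the same settings can be supplied when the session is created. The sketch below only restates the configuration keys listed above; the application name and query are placeholders.

```scala
import org.apache.spark.sql.SparkSession

// Build a session with the Auron settings from spark-defaults.conf above.
// (The Auron JAR still needs to be on the driver/executor classpath.)
val spark = SparkSession.builder()
  .appName("auron-example") // placeholder name
  .config("spark.auron.enable", "true")
  .config("spark.sql.extensions", "org.apache.spark.sql.auron.AuronSparkSessionExtension")
  .config("spark.shuffle.manager", "org.apache.spark.sql.execution.auron.shuffle.AuronShuffleManager")
  .config("spark.memory.offHeap.enabled", "false")
  .getOrCreate()

val df = spark.sql("SELECT 1") // placeholder query
// Inspect the physical plan to see which operators were handed to the native runtime.
println(df.queryExecution.executedPlan)
```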
TPC-DS 1TB Benchmark (for details, see https://auron-project.github.io/documents/benchmarks.html).
We also encourage you to benchmark Auron and share the results with us. 🤗
Mailing lists are the most recognized form of communication in the Apache community. You can contact us through the following mailing list.
| Name | Scope | | |
|---|---|---|---|
| [email protected] | Development-related discussions | Subscribe | Unsubscribe |
If you have any questions or run into issues, contact us, or fix them by submitting a 🔗Pull Request.
Auron is licensed under the Apache 2.0 License. A copy of the license can be found here.