Big Data Workshops with hands-on tutorials for working with S3, Spark, Delta Lake, Trino, ...
These workshops are used in the Big Data and Spark Ecosystem Module of the Data Engineering CAS at the Berner Fachhochschule.
All the workshops can be run on a container-based infrastructure that uses Docker Compose for container orchestration. The environment can run either on a local machine or in a cloud environment. Check `01-environment` for instructions on how to set up the infrastructure.
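
Once the environment is running, a quick way to verify that the Minio object storage is reachable is a small boto3 script; the same client API also applies to the optional AWS S3 workshop. This is only a minimal sketch: the endpoint URL, credentials, and bucket name below are placeholders, not the actual values used by `01-environment`.

```python
import boto3

# Endpoint and credentials are placeholders; substitute the values
# configured in your 01-environment setup (for AWS S3, drop endpoint_url).
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",      # assumed local Minio endpoint
    aws_access_key_id="minio-access-key",      # placeholder
    aws_secret_access_key="minio-secret-key",  # placeholder
)

# Create a demo bucket if needed, then round-trip a small object.
bucket = "workshop-test"  # hypothetical bucket name
if bucket not in [b["Name"] for b in s3.list_buckets()["Buckets"]]:
    s3.create_bucket(Bucket=bucket)

s3.put_object(Bucket=bucket, Key="hello.txt", Body=b"hello from the workshop")
print(s3.get_object(Bucket=bucket, Key="hello.txt")["Body"].read())
```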
- Working with Minio Object Storage (see the boto3 connectivity sketch above)
- Working with AWS S3 Object Storage (optional)
- Getting Started using Spark RDD and DataFrames (see the PySpark sketch after this list)
- Data Reading and Writing using DataFrames
- Graph Analysis using Spark GraphFrames (see the GraphFrames sketch after this list)
- Working with different data types
- Working with Delta Lake Table Format (see the Delta Lake sketch after this list)
- Working with Trino (see the Trino sketch after this list)
- Data Ingestion with Apache NiFi
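
The following minimal PySpark sketch gives a flavor of the DataFrame workshops; it runs in local mode on in-memory data, and the column names and session settings are illustrative assumptions (the workshops themselves read from and write to object storage):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Local-mode session; the workshop environment configures Spark differently.
spark = SparkSession.builder.appName("getting-started").master("local[*]").getOrCreate()

# A tiny in-memory DataFrame standing in for the workshop datasets.
df = spark.createDataFrame(
    [("alice", 34), ("bob", 45), ("carol", 29)],
    ["name", "age"],
)

# Typical DataFrame operations: filter, derive a column, aggregate.
adults = df.filter(F.col("age") >= 30).withColumn("age_next_year", F.col("age") + 1)
adults.show()
print(df.agg(F.avg("age").alias("avg_age")).first()["avg_age"])

# Reading and writing files works the same way, e.g.:
#   df.write.mode("overwrite").parquet("s3a://some-bucket/some-path")  # needs S3 connector config
spark.stop()
```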
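For the graph analysis workshop, a GraphFrames sketch might look like the following; it assumes the graphframes package is on the Spark classpath (the exact package coordinates depend on your Spark version):

```python
from pyspark.sql import SparkSession
from graphframes import GraphFrame

# Assumes Spark was started with the graphframes package, e.g.
#   --packages graphframes:graphframes:0.8.3-spark3.5-s_2.12   (version is an assumption)
spark = SparkSession.builder.appName("graph-demo").getOrCreate()

# Vertices need an "id" column; edges need "src" and "dst" columns.
vertices = spark.createDataFrame(
    [("a", "Alice"), ("b", "Bob"), ("c", "Carol")], ["id", "name"])
edges = spark.createDataFrame(
    [("a", "b", "follows"), ("b", "c", "follows"), ("c", "a", "follows")],
    ["src", "dst", "relationship"])

g = GraphFrame(vertices, edges)
g.inDegrees.show()                                             # simple degree analysis
g.pageRank(resetProbability=0.15, maxIter=5).vertices.show()   # PageRank scores
```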
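For the Delta Lake workshop, a minimal round trip with the delta-spark PyPI package could look like this; the local temporary path is a placeholder for the object storage locations used in the workshop:

```python
import tempfile

from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

# Standard Delta Lake session configuration (requires the delta-spark package).
builder = (
    SparkSession.builder.appName("delta-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

path = tempfile.mkdtemp()  # placeholder; the workshop writes to s3a:// paths

# Write a Delta table, append to it, then read it back.
spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"]) \
    .write.format("delta").mode("overwrite").save(path)
spark.createDataFrame([(3, "c")], ["id", "value"]) \
    .write.format("delta").mode("append").save(path)

spark.read.format("delta").load(path).show()
```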
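And for Trino, queries can be sent from Python with the trino client package; host, port, and user below are assumptions and depend on how Trino is exposed in `01-environment`:

```python
import trino

# Connection parameters are assumptions; adjust them to your environment.
conn = trino.dbapi.connect(host="localhost", port=8080, user="workshop")
cur = conn.cursor()

cur.execute("SHOW CATALOGS")   # list the catalogs this Trino instance knows about
print(cur.fetchall())

cur.execute("SELECT 1 + 1")    # trivial query to confirm connectivity
print(cur.fetchall())
```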