
Repo for building docker based airflow image. Containers support multiple features like writing logs to local or S3 folder and Initializing GCP while container booting. https://abhioncbr.github.io/docker-airflow/

docker-airflow

This repository builds Docker container images of Apache Airflow (incubating).

Images

  • abhioncbr/docker-airflow (on Docker Hub)

Airflow components stack

  • Airflow version: versions are written in the notation XX.YY.ZZ.
  • Execution mode: standalone (a simple container for exploration, using SQLite as the Airflow metadata database and the SequentialExecutor), prod (single node, using the LocalExecutor and MySQL as the Airflow metadata database), or cluster (for distributed, long-running production use cases; each container runs as either a server or a worker).
  • Backend database: SQLite for standalone; MySQL for prod and cluster.
  • Scheduler (executor): SequentialExecutor for standalone, LocalExecutor for prod, CeleryExecutor for cluster.
  • Task queue: Redis (cluster mode).
  • Log location: local file system (default) or AWS S3 (through entrypoint-s3.sh).
  • User authentication: password based, with support for multiple users with superuser privilege.
  • Code enhancement: password-based multi-user support with a superuser feature (a superuser can see all DAGs of all owners). Airflow itself is currently working on a password-based multi-user feature.
  • Other features: support for Google Cloud Platform packages in the container.
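
The mode-to-component mapping listed above can be summarised as a small lookup. This is only an illustrative sketch: describe_mode is a hypothetical helper that is not part of the image, and it simply restates the executor, metadata database and task queue named in the list.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the image): print the executor,
# metadata database and task queue that each execution mode uses,
# as stated in the component stack list.
describe_mode() {
  case "$1" in
    standalone) echo "executor=SequentialExecutor db=sqlite" ;;
    prod)       echo "executor=LocalExecutor db=mysql" ;;
    cluster)    echo "executor=CeleryExecutor db=mysql queue=redis" ;;
    *)          echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

describe_mode cluster   # prints: executor=CeleryExecutor db=mysql queue=redis
```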

Airflow ports

  • airflow portal port: 2222
  • airflow celery flower: 5555
  • redis port: 6379
  • log files exchange port: 8793
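
As a rough sketch of how these ports map onto containers, the helper below builds the docker `-p` flags for each container type. publish_flags is a hypothetical name, and the split (portal and redis on the server, flower and log exchange on the worker) is taken from the services each container is described as running.

```shell
#!/bin/sh
# Hypothetical helper: build the "-p host:container" flags for a container type.
# Assumption: the server publishes the portal and redis ports, while the worker
# publishes the celery flower and log files exchange ports.
publish_flags() {
  case "$1" in
    server) ports="2222 6379" ;;
    worker) ports="5555 8793" ;;
    *)      echo "unknown container type: $1" >&2; return 1 ;;
  esac
  flags=""
  for p in $ports; do
    # Add a separating space only when flags is already non-empty.
    flags="${flags:+$flags }-p $p:$p"
  done
  echo "$flags"
}

publish_flags server   # prints: -p 2222:2222 -p 6379:6379
publish_flags worker   # prints: -p 5555:5555 -p 8793:8793
```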

Airflow services information

  • In the server container: redis, the airflow webserver and the scheduler are running.
  • In the worker container: the airflow worker and the celery flower UI service are running.

How to build images

  • The DockerFile takes the Airflow version as a build-arg.
  • To customize the image, build it yourself -
       docker build -t abhioncbr/docker-airflow:$IMAGE_VERSION --build-arg AIRFLOW_VERSION=$AIRFLOW_VERSION \
                  --build-arg AIRFLOW_PATCH_VERSION=$AIRFLOW_PATCH_VERSION -f ~/docker-airflow/docker-files/DockerFile .
    • IMAGE_VERSION should be the Airflow version, for example 1.10.3 or 1.10.2.
    • AIRFLOW_PATCH_VERSION should be the major release version of Airflow; for example, for 1.10.2 it should be 1.10.
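
One possible way to derive these two values from a single version string is sketched below. AIRFLOW_FULL_VERSION and release_line are hypothetical names introduced for illustration; AIRFLOW_VERSION is deliberately left unset because its expected value is not described here.

```shell
#!/bin/sh
# Hypothetical helper: strip the last dotted component of a version string,
# so a three-part version such as 1.10.2 becomes 1.10.
release_line() {
  echo "${1%.*}"
}

AIRFLOW_FULL_VERSION="1.10.2"   # assumption: the Airflow version you want to build
IMAGE_VERSION="$AIRFLOW_FULL_VERSION"
AIRFLOW_PATCH_VERSION="$(release_line "$AIRFLOW_FULL_VERSION")"

echo "IMAGE_VERSION=$IMAGE_VERSION"                   # prints: IMAGE_VERSION=1.10.2
echo "AIRFLOW_PATCH_VERSION=$AIRFLOW_PATCH_VERSION"   # prints: AIRFLOW_PATCH_VERSION=1.10
```

The parameter expansion only behaves as intended for three-part versions; a two-part input such as 1.10 would be reduced to 1.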

How to run using Kitematic

  • The simplest way to explore the image is Kitematic, which runs containers through a simple yet powerful graphical user interface.
    • Search for the abhioncbr/docker-airflow image on Docker Hub from Kitematic. (Screenshot: search-docker-airflow-Kitematic)

    • Start a container through the Kitematic UI. (Screenshot: run-docker-airflow-Kitematic)

How to run

  • General commands -

    • Starting the airflow image as an airflow-standalone container in standalone mode -

      docker run --net=host -p 2222:2222 --name=airflow-standalone abhioncbr/airflow-XX.YY.ZZ -m=standalone &
    • Starting the airflow image as an airflow-server container in cluster mode -

      docker run --net=host -p 2222:2222 -p 6379:6379 --name=airflow-server \
      abhioncbr/airflow-XX.YY.ZZ -m=cluster -t=server -d=mysql://user:password@host:3306/db-name &
    • Starting the airflow image as an airflow-worker container in cluster mode -

      docker run --net=host -p 5555:5555 -p 8793:8793 --name=airflow-worker \
      abhioncbr/airflow-XX.YY.ZZ -m=cluster -t=worker -d=mysql://user:password@host:3306/db-name -r=redis://<airflow-server-host>:6379/0 &
  • On macOS, using Docker for Mac -

    • Standalone mode - starting the airflow image in standalone mode and mounting the dags, code-artifacts & logs folders from the host machine -

      docker run -p 2222:2222 --name=airflow-standalone \
      -v ~/airflow-data/code-artifacts:/code-artifacts \
      -v ~/airflow-data/logs:/usr/local/airflow/logs \
      -v ~/airflow-data/dags:/usr/local/airflow/dags \
      abhioncbr/airflow-XX.YY.ZZ -m=standalone &
    • Cluster Mode

      • Starting the airflow image as a server container and mounting the dags, code-artifacts & logs folders from the host machine -

        docker run -p 2222:2222 -p 6379:6379 --name=airflow-server \
        -v ~/airflow-data/code-artifacts:/code-artifacts \
        -v ~/airflow-data/logs:/usr/local/airflow/logs \
        -v ~/airflow-data/dags:/usr/local/airflow/dags \
        abhioncbr/airflow-XX.YY.ZZ \
        -m=cluster -t=server -d=mysql://user:password@host.docker.internal:3306/<airflow-db-name> &
      • Starting the airflow image as a worker container and mounting the dags, code-artifacts & logs folders from the host machine -

        docker run -p 5555:5555 -p 8793:8793 --name=airflow-worker \
        -v ~/airflow-data/code-artifacts:/code-artifacts \
        -v ~/airflow-data/logs:/usr/local/airflow/logs \
        -v ~/airflow-data/dags:/usr/local/airflow/dags \
        abhioncbr/airflow-XX.YY.ZZ \
        -m=cluster -t=worker -d=mysql://user:password@host.docker.internal:3306/<airflow-db-name> -r=redis://host.docker.internal:6379/0 &
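
The arguments that follow the image name in the commands above (-m for the mode, -t for the container type, -d for the metadata database URL, -r for the redis URL) can be assembled with a small wrapper. This is a minimal sketch: airflow_args is a hypothetical helper, and its validation rules are assumptions inferred from the example commands, not behaviour confirmed for the image's entrypoint.

```shell
#!/bin/sh
# Hypothetical helper: assemble the arguments placed after the image name.
# Usage: airflow_args MODE [CONTAINER_TYPE] [DB_URL] [REDIS_URL]
# Assumption: cluster mode always passes -d, and only workers pass -r,
# mirroring the example commands.
airflow_args() {
  mode="$1"; ctype="$2"; db_url="$3"; redis_url="$4"
  case "$mode" in
    standalone)
      echo "-m=standalone" ;;
    cluster)
      [ -n "$db_url" ] || { echo "cluster mode needs a database URL" >&2; return 1; }
      args="-m=cluster -t=$ctype -d=$db_url"
      if [ "$ctype" = "worker" ]; then
        [ -n "$redis_url" ] || { echo "a worker needs a redis URL" >&2; return 1; }
        args="$args -r=$redis_url"
      fi
      echo "$args" ;;
    *)
      echo "unsupported mode: $mode" >&2; return 1 ;;
  esac
}

airflow_args cluster server "mysql://user:password@host:3306/db-name"
# prints: -m=cluster -t=server -d=mysql://user:password@host:3306/db-name
```

The printed string can then be appended to a docker run command, for example docker run ... abhioncbr/airflow-XX.YY.ZZ $(airflow_args standalone).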

Distributed execution of airflow

  • As mentioned above, the airflow docker image can be run as a fully distributed setup:
    • a single docker-airflow container in server mode, serving the airflow UI, redis for celery tasks, and the scheduler.
    • multiple docker-airflow containers in worker mode, executing tasks with the celery executor.
    • a centralised airflow metadata database.
  • The docker-airflow distributed platform is depicted in the figure Distributed-Airflow.
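
To illustrate the multi-worker layout, the sketch below prints (without running) one worker start command per worker name. SERVER_HOST, DB_URL and the worker names are placeholder assumptions. Because every command publishes the same ports, each worker is assumed to run on its own host.

```shell
#!/bin/sh
# Dry run: print the command that would start each worker container.
SERVER_HOST="airflow-server-host"                  # placeholder: host of the server container
DB_URL="mysql://user:password@host:3306/db-name"   # placeholder: centralised metadata database
IMAGE="abhioncbr/airflow-XX.YY.ZZ"                 # image name as used in the run commands

# Hypothetical helper: print the docker run command for one named worker.
worker_command() {
  echo "docker run --net=host -p 5555:5555 -p 8793:8793 --name=$1 $IMAGE -m=cluster -t=worker -d=$DB_URL -r=redis://$SERVER_HOST:6379/0"
}

# Assumption: one worker per host, so the published ports do not clash.
for name in airflow-worker-1 airflow-worker-2; do
  worker_command "$name"
done
```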
