About this training
Apache Airflow is a powerful open source platform for programmatically authoring, scheduling and monitoring workflows. The course is aimed at people who want to use Apache Airflow to build and manage end-to-end data pipelines.
It is cloud agnostic and can be deployed natively in the cloud as well as on Kubernetes. This makes Airflow well suited for orchestrating everything from simple individual tasks to complex, scalable ETL pipelines.
Along the way, the training answers the following questions, among others:
- How is Apache Airflow structured?
- How do I install Apache Airflow?
- How do I write a workflow and define tasks and their dependencies? (A short example follows this list.)
- How do I schedule a workflow?
- How do I create complex and clear ETL workflows?
- How do I exchange data between tasks?
- How do I create and work with dynamic tasks?
- How do I create a dynamic analytical ETL pipeline?
Requirements
Required knowledge
You should have knowledge in the following areas:
- IT Basics
- Python Basics
Technical requirements
For our online training sessions, all participants need ...
- a computer with Internet access.
- a stable Internet connection.
- an up-to-date browser, preferably Chrome.
Outline of the training days
Day 1
- Apache Airflow Introduction
- Airflow Architecture
- Directed Acyclic Graph (DAG) and its components
- DAG Deep Dive Part 1
- Idempotency and Atomicity
Day 2
- DAG Deep Dive Part 2
- Skipping tasks in a workflow
- Communication with external systems
Day 3
- Data Transfer into AWS
- Testing Workflows
- Workflows in the CLI
- Apache Airflow in the Cloud
Additional modules
We support you every step of the way – from advice to implementation.