
Apache Airflow Training

Live Online & Classroom Enterprise Training

Apache Airflow is an open-source platform used to schedule, automate, and monitor workflows. It lets users define workflows as code in the form of Directed Acyclic Graphs (DAGs) and manage complex data pipelines efficiently.

Looking for a private batch?

REQUEST A CALLBACK

Need help finding the right training?


  • Enterprise Reporting

  • Lifetime Access

  • CloudLabs

  • 24x7 Support

  • Real-time code analysis and feedback

What is Apache Airflow Training about?

This course provides in-depth training on Apache Airflow, an open-source platform for programmatically authoring, scheduling, and monitoring workflows. Learners will explore DAG creation, task dependencies, scheduling strategies, and Airflow architecture. The course emphasizes real-world data pipeline orchestration use cases, integration with cloud services, and best practices for managing complex workflows in production environments.
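As a taste of what "workflows as code" looks like, here is a minimal sketch of a first DAG (assuming Airflow 2.x and the built-in PythonOperator; the extract/load tasks are illustrative placeholders, not course material):

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull data from a source system
    print("extracting data")


def load():
    # Placeholder: load data into a warehouse
    print("loading data")


with DAG(
    dag_id="example_etl",              # unique name shown in the Airflow UI
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # run once per day (Airflow 2.4+ syntax)
    catchup=False,                     # skip backfilling past runs
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task          # load runs only after extract succeeds

The >> operator is how task dependencies are expressed; the course covers this alongside scheduling, backfill, and catchup behaviour.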

What are the objectives of Apache Airflow Training?

  • Understand Apache Airflow architecture and components 
  • Create Directed Acyclic Graphs (DAGs) for workflow automation 
  • Schedule, monitor, and debug workflows 
  • Integrate Airflow with databases and cloud platforms 
  • Apply best practices for scalable pipeline orchestration 

Who is Apache Airflow Training for?

  • Data Engineers and ETL Developers 
  • Data Scientists managing automated workflows 
  • DevOps Engineers and Cloud Engineers 
  • BI Engineers and Analytics Professionals 
  • Software Developers building data-driven applications

What are the prerequisites for Apache Airflow Training?

Prerequisites:    

  • Basic Python programming knowledge 
  • Familiarity with SQL and databases 
  • Understanding of ETL processes 
  • Basic knowledge of Linux/command line 
  • Awareness of cloud services (AWS, GCP, Azure) is a plus 


Learning Path:   

  • Introduction to Apache Airflow and its use cases 
  • Airflow architecture and installation 
  • Creating and managing DAGs and tasks 
  • Scheduling, monitoring, and debugging workflows 
  • Scaling Airflow in production with best practices 


Related Courses:   

  • Python for Data Engineering 
  • ETL and Data Pipeline Fundamentals 
  • AWS Data Pipeline Services 
  • Kubernetes for Workflow Orchestration

Available Training Modes

Live Online Training

4 Days

Course Outline

  • Important Prerequisites
  • The Roadmap
  • Who am I?
  • Development Environment
  • Learning Advice [Must Read]
  • Stay up to date with Apache Airflow
  • Why data orchestration?
  • Why Airflow?
  • The Core Components
  • The Core Concepts
  • How does Airflow work?
  • Airflow limitations
  • IMPORTANT
  • The Rest API
  • Introduction
  • The Project! What will you build?
  • Project materials
  • Running the new environment
  • Important
  • Import warnings are OK
  • Create the DAG with the dag decorator
  • The new way of authoring DAGs with Taskflow
  • Playing with the Taskflow API
  • Checking API availability with the Sensor decorator
  • Fetching stock prices with the PythonOperator
  • Storing stock prices in MinIO (AWS S3 like)
  • Formatting stock prices with Spark and the DockerOperator
  • Fetching formatted prices from MinIO (AWS S3 like)
  • The best way to load files into data warehouses with Postgres and Astro SDK
  • Creating the dashboard to track Apple stock with Metabase
  • The pipeline in action!
  • Getting alerts on Slack with the new Notifiers
  • Set up the new Airflow environment
  • The best way to create your DAGs
  • The parameters your DAGs need
  • DAG scheduling: the basics
  • Backfill and Catchup
  • The most important rule to follow when creating tasks
  • Play by scheduling your DAGs
  • Dealing with timezones in Airflow
  • Scheduling DAGs based on data with Datasets
  • Conditional Dataset scheduling
  • Datasets in action!
  • Sharing data between tasks with XComs (see the Taskflow sketch after this outline)
  • Organize your DAGs folder and clean your DAGs
  • Manage task and DAG failures
  • Test your tasks and DAGs
  • The right way of grouping tasks
  • Choosing tasks with branching and conditions
  • Changing execution behaviours with Trigger Rules
  • Templating your tasks
  • The smart way of storing data with Custom XCOM backends
  • Using variables to avoid hardcoding values
  • Executing tasks sequentially with the SequentialExecutor and SQLite
  • Executing tasks in parallel with the LocalExecutor and Postgres
  • Concurrency settings to control how tasks and DAGs run in parallel
  • Start scaling Airflow with the CeleryExecutor
  • Track your tasks using Flower with the CeleryExecutor
  • Add new workers and configure queues to distribute your tasks
  • Quick introduction to Kubernetes
  • Introduction to the KubernetesExecutor
  • Installing Airflow on a Kubernetes cluster
  • How to configure Airflow on Kubernetes
  • Deploying DAGs with Airflow on Kubernetes using GitSync
  • Introduction
  • Quick overview of AWS EKS
  • How to access your applications from the outside
  • Introduction
  • How the logging system works in Airflow
  • Elasticsearch Reminder
  • Introduction to metrics
  • Airflow maintenance DAGs
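
For readers curious about the Taskflow and XCom items in the outline above, the following is a minimal sketch of the decorator-based authoring style (assuming Airflow 2.x; fetch_prices and store_prices are illustrative names, not the exact project code):

from datetime import datetime

from airflow.decorators import dag, task


@dag(
    dag_id="taskflow_example",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
)
def taskflow_example():
    @task
    def fetch_prices() -> dict:
        # In the course project, this step would call a stock-price API
        return {"AAPL": 190.0}

    @task
    def store_prices(prices: dict) -> None:
        # Receiving another task's return value pulls it from XCom automatically
        print(f"storing {prices}")

    store_prices(fetch_prices())


taskflow_example()

Passing fetch_prices() straight into store_prices() is what wires up both the task dependency and the XCom exchange that the traditional operator style handles explicitly.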

Who is the instructor for this training?

The trainer for this Apache Airflow Training has extensive hands-on experience in the domain, along with years of training and mentoring professionals.

Reviews