AIRFLOW TUTORIAL

Apache Airflow for Beginners

Users are complaining about slow file access, and disk utilization is high. We need to reduce the I/O activity of the top offenders using I/O priorities…

Apache Airflow
AIRFLOW

What Airflow is, why orchestration, install & first look

Apache Airflow is an open-source platform for orchestrating data pipelines. Learn what it is, why orchestration matters, and familiarize yourself with the UI on a real VM.

airflow, orchestration, dag, data-engineering

Write & run your first DAG on Airflow

You know what Airflow is and understand its main concepts. Now make it do something: write a real DAG with tasks, drop it into Airflow, and watch it run to completion.

airflow, dag, python

Tasks, operators & dependencies in Airflow

Go beyond a single DAG. Learn how operators work, how to set dependencies between tasks, and build a multi-task pipeline with real task ordering.

airflow, python, dag

Scheduling, intervals & catchup in Airflow

Learn how to schedule DAGs, understand data intervals, and control catchup and backfill behavior in Airflow.

airflow, dag, python, scheduling

XComs and passing data between tasks in Airflow

Tasks in a DAG run in isolation; XComs let them share data. Learn how to pass values between real tasks and inspect them in the UI.

airflow, dag, xcom, python

Sensors & external triggers in Airflow

Sometimes a pipeline has to wait for a file to arrive, an API to respond, or another DAG to finish. Sensors are how Airflow waits.

airflow, sensors, dag, python

The TaskFlow API in Airflow

TaskFlow lets you write DAGs as plain Python functions. Rewrite a real DAG with TaskFlow and see the difference.

airflow, python, dag, taskflow

Connections, hooks and providers in Airflow

DAGs need to talk to real systems such as databases, APIs, and S3. Connections store the credentials, hooks let you use them in Python, and providers ship pre-built integrations for common systems. Connect a DAG to a real Postgres database on the VM.

airflow, connections, postgres, dag

Dynamic DAGs and task mapping in Airflow

Sometimes you don't know how many tasks you'll need until runtime. Task mapping lets Airflow generate those tasks dynamically. You'll fan out real tasks over a list and watch Airflow spawn them.

airflow, dag, python, task-mapping