Sensors & external triggers in Airflow

Sometimes a pipeline has to wait: for a file to arrive, an API to respond, or another DAG to finish. Sensors are how Airflow waits.

What we're doing

You'll learn what sensors are, the difference between poke and reschedule modes, and build a DAG where a sensor waits for a file before the next task runs.

Step 1: Pipelines that depend on the outside world

A scheduled DAG fires at a fixed time. But what if your pipeline depends on something it doesn't control?

  • A daily report can only run after the upstream team drops their CSV in S3, and they don't always hit the deadline
  • A transform can only start after an external batch job finishes
  • A processing task can only run after an API endpoint becomes ready

Waiting for conditions like these is the job of a sensor.

Step 2: What sensors are

A sensor is a special kind of operator. Instead of doing work, it checks a condition repeatedly. If the condition is false, it waits. If it's true, the sensor succeeds and downstream tasks run.

Every sensor has three parameters:

  • poke_interval — how often to check, in seconds. Default is 60 seconds
  • timeout — how long to wait before giving up, in seconds. Default is 7 days
  • mode — poke or reschedule (covered in the next step)

Airflow and its provider packages include sensors for common conditions:

  • FileSensor — waits for a file or folder to appear
  • S3KeySensor — waits for a key in an S3 bucket
  • HttpSensor — waits for an HTTP endpoint to return a successful response
  • ExternalTaskSensor — waits for a task in another DAG to finish
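
The check-and-wait behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not Airflow's implementation: wait_until and SensorTimeout are hypothetical names, and the default values simply mirror the defaults listed above.

```python
import time
from typing import Callable


class SensorTimeout(Exception):
    """Raised when the condition never became true within the timeout."""


def wait_until(
    condition: Callable[[], bool],
    poke_interval: float = 60,            # seconds between checks
    timeout: float = 60 * 60 * 24 * 7,    # give up after 7 days
) -> int:
    """Poke `condition` until it returns True; return how many pokes it took."""
    started = time.monotonic()
    pokes = 0
    while True:
        pokes += 1
        if condition():
            return pokes                  # condition met: the "sensor" succeeds
        if time.monotonic() - started >= timeout:
            raise SensorTimeout(f"gave up after {pokes} pokes")
        time.sleep(poke_interval)         # wait before the next poke
```

For instance, wait_until(lambda: os.path.exists(path), poke_interval=10, timeout=300) would check for a file every 10 seconds and give up after roughly 5 minutes.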

Step 3: Poke vs reschedule mode

Poke mode (default) — the sensor holds onto its worker slot the entire time it's waiting. Even if it's just sleeping between pokes, the worker is reserved for it. Fine if the wait is short — a few seconds, maybe a minute.
Reschedule mode — between pokes the sensor releases the worker slot. Airflow puts the task back in the queue and another task can use the worker. When the next poke interval arrives, the sensor is picked back up.
The rule:

  • Short wait (under a minute) → use mode="poke"
  • Long wait (minutes, hours, days) → use mode="reschedule"

If you have 10 sensors in poke mode all waiting on slow files, they'll lock up 10 worker slots and your other tasks will starve. Reschedule mode prevents that.
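
To get a feel for the difference, here is a rough back-of-the-envelope model of how long a worker slot is held in each mode. It is a simplification under stated assumptions: slot_seconds is a hypothetical helper, poke_duration (how long one check takes) is an invented parameter, and scheduling overhead is ignored.

```python
def slot_seconds(wait_seconds: int, poke_interval: int,
                 poke_duration: int, mode: str) -> int:
    """Approximate seconds a worker slot is occupied while a sensor waits."""
    if mode == "poke":
        # The slot is held for the whole wait, including the sleeps.
        return wait_seconds
    if mode == "reschedule":
        # The slot is held only while each individual poke runs.
        pokes = wait_seconds // poke_interval + 1
        return pokes * poke_duration
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # One-hour wait, poke every 60 s, each poke assumed to take about 1 s.
    for mode in ("poke", "reschedule"):
        print(mode, slot_seconds(3600, 60, 1, mode))
```

Under these assumptions a one-hour wait occupies a slot for 3600 seconds in poke mode and for about 61 seconds in reschedule mode.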

Step 4: Create the DAG file

Click VS Code in the environment panel. Right-click the dags folder and create a new file called sensor.py.

Step 5: Add the imports and the sensor

from airflow import DAG 
from airflow.operators.python import PythonOperator 
from airflow.sensors.filesystem import FileSensor 
from datetime import datetime 

Different sensors live in different modules; check the docs when you need a new one.
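
As a quick way to see which sensor modules are importable in your environment, a small check like the one below may help. The module paths are the ones used in Airflow 2.x and should be treated as assumptions for other versions; SENSOR_MODULES and module_available are hypothetical names, and the S3 and HTTP sensors additionally need their provider packages installed.

```python
import importlib.util

# Module paths as found in Airflow 2.x (assumption: they may differ in other versions).
SENSOR_MODULES = {
    "FileSensor": "airflow.sensors.filesystem",
    "ExternalTaskSensor": "airflow.sensors.external_task",
    "S3KeySensor": "airflow.providers.amazon.aws.sensors.s3",  # amazon provider
    "HttpSensor": "airflow.providers.http.sensors.http",       # http provider
}


def module_available(module_path: str) -> bool:
    """Return True if the module can be located without raising."""
    try:
        return importlib.util.find_spec(module_path) is not None
    except ImportError:
        # A missing parent package (e.g. no airflow installed) ends up here.
        return False


if __name__ == "__main__":
    for name, path in SENSOR_MODULES.items():
        status = "available" if module_available(path) else "missing"
        print(f"{name:20} {path:45} {status}")
```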
Now the sensor task:

wait_for_file = FileSensor(
    task_id="wait_for_file",
    filepath="/opt/airflow/dags/trigger.txt",
    poke_interval=10,
    timeout=300,
    mode="reschedule",
)

  • filepath — the file the sensor is watching for. Inside the Docker container, the dags folder is mounted at /opt/airflow/dags, so we're waiting for a file called trigger.txt in that folder.
  • poke_interval=10 — check every 10 seconds
  • timeout=300 — give up after 5 minutes
  • mode="reschedule" — release the worker slot between pokes
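
What each poke checks can be approximated in plain Python. The sketch below is not the FileSensor source; file_has_arrived is a hypothetical helper, written on the assumption that a matching file counts immediately and a folder counts once it contains at least one file.

```python
import os
from glob import glob


def file_has_arrived(filepath: str) -> bool:
    """Rough stand-in for a single poke: True once the path 'exists' usefully."""
    for path in glob(filepath):
        if os.path.isfile(path):
            return True                    # a regular file matched
        # A directory matched: it only counts if some file is inside it.
        for _root, _dirs, files in os.walk(path):
            if files:
                return True
    return False


if __name__ == "__main__":
    print(file_has_arrived("/opt/airflow/dags/trigger.txt"))
```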

Step 6: Add the downstream task and the DAG

def process_file():
    print("Processing the file...")


with DAG(
    dag_id="sensor_dag",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # no schedule: the DAG runs only when triggered manually
    catchup=False,
) as dag:

    # The sensor is defined inside the DAG context so it belongs to this DAG
    wait_for_file = FileSensor(
        task_id="wait_for_file",
        filepath="/opt/airflow/dags/trigger.txt",
        poke_interval=10,
        timeout=300,
        mode="reschedule",
    )

    process = PythonOperator(
        task_id="process",
        python_callable=process_file,
    )

    # process runs only after the sensor succeeds
    wait_for_file >> process

Step 8: Trigger it and watch the sensor wait

Open the Airflow UI from the environment panel. Find sensor_dag on the DAGs page and trigger it manually.

Open the Graph view. The wait_for_file task turns yellow: the sensor is active and poking every 10 seconds, but the file doesn't exist yet. The downstream task stays white because the sensor hasn't succeeded.

Click the sensor task → Logs. You'll see lines like:

Poking for file /opt/airflow/dags/trigger.txt

Step 9: Create the file and watch the sensor succeed

In the VS Code terminal run:

touch ~/airflow/dags/trigger.txt 

Within 10 seconds the sensor pokes one more time, finds the file, and turns green. Immediately after, process turns yellow then green.
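
If the trigger file is produced by a script rather than by hand, the same step might look like this in Python. This is a minimal sketch: drop_trigger and clear_trigger are hypothetical helper names, and the path is the one used in the touch command above.

```python
from pathlib import Path


def drop_trigger(path: str) -> Path:
    """Create an empty trigger file, like `touch` (parent folders included)."""
    trigger = Path(path).expanduser()
    trigger.parent.mkdir(parents=True, exist_ok=True)
    trigger.touch()
    return trigger


def clear_trigger(path: str) -> None:
    """Remove the trigger file if it exists, so the next run has to wait again."""
    Path(path).expanduser().unlink(missing_ok=True)


if __name__ == "__main__":
    print(drop_trigger("~/airflow/dags/trigger.txt"))
```

If trigger.txt is still present when you trigger the DAG again, the sensor succeeds on its first poke, so clearing the file between runs lets you watch it wait again.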

After hibernation

If the VM hibernates, reconnect and run in the VS Code terminal:

cd ~/airflow
docker compose up -d 

What's next

Now try this out in a live environment: start Airflow, trigger sensor_dag, and create trigger.txt to watch the sensor succeed.
