Sensors & external triggers in Airflow
Sometimes a pipeline has to wait: for a file to arrive, an API to respond, or another DAG to finish. Sensors are how Airflow waits.
What we're doing
You'll learn what sensors are, the difference between poke and reschedule modes, and build a DAG where a sensor waits for a file before the next task runs.
Step 1: Pipelines that depend on the outside world
A scheduled DAG fires at a fixed time. But what if your pipeline depends on something it doesn't control?
- A daily report can only run after the upstream team drops their CSV in S3, and they don't always hit the deadline
- A transform can only start after an external batch job finishes
- A processing task can only run after an API endpoint becomes ready
That's the job of a sensor.
Step 2: What sensors are
A sensor is a special kind of operator. Instead of doing work, it checks a condition repeatedly. If the condition is false, it waits. If it's true, the sensor succeeds and downstream tasks run.
Three parameters every sensor has:
- poke_interval: how often to check, in seconds. Default is 60 seconds
- timeout: how long to wait before giving up, in seconds. Default is 7 days
- mode: poke or reschedule (covered in the next step)
Airflow ships with sensors for common conditions:
- FileSensor: waits for a file or folder to appear
- S3KeySensor: waits for a key in an S3 bucket
- HttpSensor: waits for an HTTP endpoint to return a successful response
- ExternalTaskSensor: waits for a task in another DAG to finish
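The check-wait-repeat behaviour can be sketched in plain Python. This is only an illustration of the idea, not Airflow's actual implementation; `wait_until` and its `sleep` argument are hypothetical names invented for this example.

```python
import time


def wait_until(condition, poke_interval=60, timeout=7 * 24 * 60 * 60, sleep=time.sleep):
    """Call condition() every poke_interval seconds until it returns True.

    Raises TimeoutError once more than `timeout` seconds have passed.
    Hypothetical helper: it mimics the idea of a sensor, not Airflow's code.
    """
    started = time.monotonic()
    pokes = 0
    while True:
        pokes += 1
        if condition():
            return pokes  # condition met: the "sensor" succeeds
        if time.monotonic() - started > timeout:
            raise TimeoutError(f"condition not met after {pokes} pokes")
        sleep(poke_interval)  # wait before checking again


# Usage: a condition that becomes true on the third check.
answers = iter([False, False, True])
pokes_needed = wait_until(lambda: next(answers), poke_interval=0.01, timeout=5)
print(pokes_needed)  # 3
```

The defaults in the signature mirror the sensor defaults listed above: 60 seconds between checks and a 7-day timeout.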
Step 3: Poke vs reschedule mode
Poke mode (default) — the sensor holds onto its worker slot the entire time it's waiting. Even if it's just sleeping between pokes, the worker is reserved for it. Fine if the wait is short — a few seconds, maybe a minute.
Reschedule mode — between pokes the sensor releases the worker slot. Airflow puts the task back in the queue and another task can use the worker. When the next poke interval arrives, the sensor is picked back up.
The rule:
- Short wait (under a minute) → use mode="poke"
- Long wait (minutes, hours, days) → use mode="reschedule"
If you have 10 sensors in poke mode all waiting on slow files, they'll lock up 10 worker slots and your other tasks will starve. Reschedule mode prevents that.
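To put rough numbers on the difference, here is a back-of-the-envelope model of how long one sensor occupies a worker slot in each mode. It is a simplification under stated assumptions: `slot_seconds` is a hypothetical helper, each check is assumed to take `check_seconds`, and scheduling overhead is ignored.

```python
def slot_seconds(wait_seconds, poke_interval, check_seconds, mode):
    """Estimate how long one sensor occupies a worker slot.

    Simplified model (an assumption for illustration): each poke takes
    check_seconds, and scheduling overhead is ignored.
    """
    if mode == "poke":
        # The slot is held for the whole wait, including the sleeps.
        return wait_seconds
    if mode == "reschedule":
        # The slot is held only while a poke is actually running.
        pokes = wait_seconds // poke_interval + 1
        return pokes * check_seconds
    raise ValueError(f"unknown mode: {mode}")


one_hour = 60 * 60
print(slot_seconds(one_hour, poke_interval=60, check_seconds=1, mode="poke"))        # 3600
print(slot_seconds(one_hour, poke_interval=60, check_seconds=1, mode="reschedule"))  # 61
```

In a real deployment every reschedule also costs some scheduling overhead that this model leaves out, which is why poke mode remains the reasonable choice for very short waits.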
Step 4: Create the DAG file
Click VS Code in the environment panel. Right-click the dags folder and create a new file called sensor.py.
Step 5: Add the imports and the sensor
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.sensors.filesystem import FileSensor
from datetime import datetime
Different sensors live in different modules; check the docs when you need a new one.
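As a quick reference, the sketch below lists where the four sensors mentioned earlier are commonly imported from and checks whether each module is present in your environment. The module paths are the ones used in Airflow 2.x and its provider packages; treat them as assumptions and confirm them against the docs for your installed version. `SENSOR_MODULES` and `is_importable` are names made up for this example.

```python
import importlib.util

# Module paths as used in Airflow 2.x; treat them as assumptions and
# confirm against the docs for your installed version.
SENSOR_MODULES = {
    "FileSensor": "airflow.sensors.filesystem",
    "ExternalTaskSensor": "airflow.sensors.external_task",
    "S3KeySensor": "airflow.providers.amazon.aws.sensors.s3",
    "HttpSensor": "airflow.providers.http.sensors.http",
}


def is_importable(module_path):
    """Return True if module_path can be found in this environment."""
    try:
        return importlib.util.find_spec(module_path) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. a provider) is not installed.
        return False


for sensor, module_path in SENSOR_MODULES.items():
    status = "available" if is_importable(module_path) else "not installed"
    print(f"{sensor}: from {module_path} import {sensor} ({status})")
```

The S3 and HTTP sensors live in separate provider packages, so they may show as not installed even when Airflow itself is present.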
Now the sensor task:
wait_for_file = FileSensor(
    task_id="wait_for_file",
    filepath="/opt/airflow/dags/trigger.txt",
    poke_interval=10,
    timeout=300,
    mode="reschedule",
)
- filepath: the file the sensor is watching for. Inside the Docker container, the dags folder is mounted at /opt/airflow/dags, so we're waiting for a file called trigger.txt.
- poke_interval=10: check every 10 seconds
- timeout=300: give up after 5 minutes
- mode="reschedule": release the worker slot between pokes
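A little arithmetic shows what these values mean in practice. The snippet is plain Python with no Airflow involved; the poke count is an approximation, since the exact number depends on how long each check takes.

```python
from datetime import timedelta

poke_interval = 10   # seconds between checks
timeout = 300        # seconds before the sensor gives up

# Roughly how many checks fit inside the timeout window.
approx_pokes = timeout // poke_interval
print(approx_pokes)                  # 30
print(timedelta(seconds=timeout))    # 0:05:00

# The 7-day default timeout expressed in seconds.
default_timeout = int(timedelta(days=7).total_seconds())
print(default_timeout)               # 604800
```

So with these settings you have about 30 checks, spread over 5 minutes, to create the file before the sensor fails.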
Step 6: Add the downstream task and the DAG
def process_file():
    print("Processing the file...")


with DAG(
    dag_id="sensor_dag",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    wait_for_file = FileSensor(
        task_id="wait_for_file",
        filepath="/opt/airflow/dags/trigger.txt",
        poke_interval=10,
        timeout=300,
        mode="reschedule",
    )

    process = PythonOperator(
        task_id="process",
        python_callable=process_file,
    )

    wait_for_file >> process
Step 8: Trigger it and watch the sensor wait
Open the Airflow UI from the environment panel. Find sensor_dag on the DAGs page and trigger it manually.
Open the Graph view. The wait_for_file task turns yellow: it's running and poking every 10 seconds, but the file doesn't exist yet. The downstream task stays white because the sensor hasn't succeeded.
Click the sensor task → Logs. You'll see lines like:
Poking for file /opt/airflow/dags/trigger.txt
Step 9: Create the file and watch the sensor succeed
In the VS Code terminal run:
touch ~/airflow/dags/trigger.txt
Within 10 seconds the sensor pokes one more time, finds the file, and turns green. Immediately after, process turns yellow then green.
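The same before-and-after can be reproduced outside Airflow with a few lines of standard-library Python. This is a stand-in for the idea of a file check, not FileSensor's real implementation; `file_exists` is a hypothetical helper, and a temporary directory is used in place of the dags folder.

```python
import tempfile
from pathlib import Path


def file_exists(filepath):
    """One 'poke': report whether the file is present right now."""
    return Path(filepath).exists()


with tempfile.TemporaryDirectory() as dags_dir:
    trigger = Path(dags_dir) / "trigger.txt"

    before = file_exists(trigger)  # file not created yet
    trigger.touch()                # same effect as the shell touch command
    after = file_exists(trigger)   # the next poke finds it

print(before, after)  # False True
```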
After hibernation
If the VM hibernates, reconnect and run in the VS Code terminal:
cd ~/airflow
docker compose up -d