Airflow

What is Airflow? Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows.  In the cloud, we will use Airflow to schedule tasks (i.e., Python scripts), similar to what cron does on the T7 servers.  Airflow will be able to stage and run Python 3 scripts in the CWBI environment.  In Airflow, these workflows are defined as DAGs (directed acyclic graphs): a DAG is a Python script that declares a set of tasks and the order in which they must run.
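To make the DAG idea concrete, here is a minimal sketch of "tasks plus dependencies, run in dependency order". It deliberately uses only Python's standard library (graphlib, Python 3.9+) and is not the Airflow API. The task names (extract, transform, load) and the run_workflow helper are hypothetical, chosen only for illustration. A real Airflow DAG declares the same kind of structure, and Airflow's scheduler decides when and where each task runs.

```python
from graphlib import TopologicalSorter


# Hypothetical tasks: each plain function stands in for one Python script.
def extract():
    return "extract"


def transform():
    return "transform"


def load():
    return "load"


TASKS = {"extract": extract, "transform": transform, "load": load}

# Each key maps to the set of tasks that must finish before it can start.
# There are no cycles, which is what makes this graph "acyclic".
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_workflow(tasks, dependencies):
    """Run every task once, in an order that respects the dependencies."""
    order = TopologicalSorter(dependencies).static_order()
    return [tasks[name]() for name in order]


if __name__ == "__main__":
    print(run_workflow(TASKS, DEPENDENCIES))
```

If the dependencies formed a loop (for example, extract also waiting on load), no valid run order would exist, which is why a workflow must be acyclic.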

Key Points:

  • Pure Python platform to programmatically author, schedule, and monitor workflows
  • Comes with a UI that provides full insight into the status of completed and ongoing tasks
  • Originally developed by Airbnb, now maintained by the open-source community as an Apache Software Foundation project

Screenshots and documentation for Airflow can be found at https://airflow.apache.org/

For a quick demo of Airflow, see the CWMS Data Workshop 2024 Introduction Presentation.  The Airflow demo begins at 1:13:26 in the video.