<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/b349e071-aaf1-4d38-b257-9768bba63316/989f6f0a-5e42-4134-9798-3da9f35ded82/Apache_Airflow.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/b349e071-aaf1-4d38-b257-9768bba63316/989f6f0a-5e42-4134-9798-3da9f35ded82/Apache_Airflow.png" width="40px" />

What is Apache Airflow?

</aside>

Apache Airflow is an orchestration tool for managing and automating workflows. It helps you schedule, organize, and monitor tasks such as data processing or file transfers.

<aside> <img src="https://prod-files-secure.s3.us-west-2.amazonaws.com/b349e071-aaf1-4d38-b257-9768bba63316/2855e2b2-e104-40c7-bb9d-5d20d9404950/Apache_Airflow.png" alt="https://prod-files-secure.s3.us-west-2.amazonaws.com/b349e071-aaf1-4d38-b257-9768bba63316/2855e2b2-e104-40c7-bb9d-5d20d9404950/Apache_Airflow.png" width="40px" />

What is a Workflow? (Workflow / Pipeline)

</aside>

A workflow is a series of steps or tasks that need to be completed to achieve a goal. In Apache Airflow, each workflow is defined as a DAG (Directed Acyclic Graph).

For example:

  1. Extract data from a database.
  2. Process or clean the data.
  3. Load the data into a report or another system.

Each step is a task, and the order of tasks defines the workflow. Workflows ensure tasks happen in the correct sequence.
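To make the three steps concrete, here is a minimal plain-Python sketch of the same workflow. It does not use Airflow itself. The function names (`extract`, `clean`, `load`) and the sample rows are hypothetical, chosen only to show each step as a task that runs after the previous one.

```python
# Plain-Python sketch of the extract -> clean -> load workflow.
# All names and data here are illustrative assumptions, not Airflow APIs.

def extract():
    """Step 1: stand-in for reading rows from a database."""
    return [{"name": " Alice ", "amount": "10"}, {"name": "Bob", "amount": None}]


def clean(rows):
    """Step 2: drop rows with a missing amount and normalize the rest."""
    return [
        {"name": row["name"].strip(), "amount": int(row["amount"])}
        for row in rows
        if row["amount"] is not None
    ]


def load(rows, report):
    """Step 3: stand-in for writing rows to a report or another system."""
    report.extend(rows)
    return len(rows)


def run_workflow():
    """Run the tasks in order; each task consumes the previous task's output."""
    report = []
    loaded = load(clean(extract()), report)
    return report, loaded


if __name__ == "__main__":
    report, loaded = run_workflow()
    print(f"Loaded {loaded} row(s): {report}")
```

Here the order is fixed by how the functions are nested. An orchestrator such as Airflow takes over that ordering, along with scheduling and monitoring, once the steps are declared as tasks.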

<aside> <img src="notion://custom_emoji/b349e071-aaf1-4d38-b257-9768bba63316/1666d46d-dacb-80f1-92e1-007af320a33d" alt="notion://custom_emoji/b349e071-aaf1-4d38-b257-9768bba63316/1666d46d-dacb-80f1-92e1-007af320a33d" width="40px" />

What is a DAG?

</aside>

A DAG (Directed Acyclic Graph) is the structure Airflow uses to define workflows.

basic-dag.png
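The idea of a directed acyclic graph can be sketched with Python's standard library alone. The example below is not Airflow code: it assumes Python 3.9+ (for `graphlib`), and the task names are hypothetical. It shows how dependencies determine a valid run order and why a cycle makes a workflow impossible to run.

```python
from graphlib import CycleError, TopologicalSorter

# Map each task to the set of tasks it depends on (its upstream tasks).
# Task names are illustrative assumptions.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "load": {"clean"},
}


def execution_order(deps):
    """Return one valid run order; raises CycleError if the graph has a cycle."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the dependency graph contains no cycle."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


if __name__ == "__main__":
    print(execution_order(dependencies))  # ['extract', 'clean', 'load']

    # Making "extract" depend on "load" closes a loop, so no task could start first.
    cyclic = {**dependencies, "extract": {"load"}}
    print(is_acyclic(cyclic))  # False
```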

Key points: