Apache Airflow Tutorial
Apache Airflow is a tool for authoring, scheduling, and monitoring pipelines. This makes it a good fit for ETL and MLOps pipelines.

Example use cases:
- Extracting data from many sources, aggregating and transforming it, and storing it in a data warehouse.
- Extracting insights from data and displaying them in an analytics dashboard.
- Training, validating, and deploying machine learning models.

Key components:

1. Webserver: The webserver is Airflow’s user interface (UI), which lets you interact with Airflow without using the CLI or the API. From there you can trigger and monitor pipelines, create connections to external systems, inspect their datasets, and more.
2. Scheduler: The scheduler is responsible for executing tasks at the correct time, re-running pipelines, backfilling data, ensuring task completion, and so on.
3. Executors: Executors are the mechanism by which pipelines run. There are many different types that run pipelines locally, in a sin...