Installation guide

Kedro setup

First, you need to install the base Kedro package in a version lower than 0.17.

Kedro 0.17 is supported by kedro-airflow-k8s, but not by kedro-mlflow yet, so the latest version from the 0.16 family is recommended.

$ pip install 'kedro<0.17'

Plugin installation

Install from PyPI

You can install the kedro-airflow-k8s plugin from PyPI with pip:

$ pip install --upgrade kedro-airflow-k8s

Install from sources

You may want to install the develop branch, which has unreleased features:

$ pip install git+https://github.com/getindata/kedro-airflow-k8s.git@develop

Available commands

You can check the available commands by going into the project directory and running:

$ kedro airflow-k8s

Usage: kedro airflow-k8s [OPTIONS] COMMAND [ARGS]...

Options:
  -e, --env TEXT       Environment to use.
  -p, --pipeline TEXT  Pipeline name to pick.
  -h, --help           Show this message and exit.

Commands:
  compile          Create an Airflow DAG for a project
  init             Initializes configuration for the plugin
  list-pipelines   List pipelines generated by this plugin
  run-once         Uploads pipeline to Airflow and runs once
  schedule         Uploads pipeline to Airflow with given schedule
  ui               Open Apache Airflow UI in new browser tab
  upload-pipeline  Uploads pipeline to Airflow DAG location

compile

The compile command takes one argument: the name of the directory containing the configuration (relative to the conf folder). As an outcome, the dag directory contains a Python file with the generated DAG.
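
For example, assuming the configuration lives in conf/pipelines (the environment name "pipelines" is only an illustration), the invocation could look like this:

$ # "pipelines" is a sample environment name passed via the -e option
$ kedro airflow-k8s -e pipelines compile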

init

The init command adds the default plugin configuration to the project, based on Apache Airflow details provided via CLI input. Optionally, it can also add GitHub Actions to streamline project build and upload.
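
A hedged sketch of an init run; the Airflow URL is a placeholder and the GitHub Actions flag name is an assumption, so check the command help for the exact arguments:

$ kedro airflow-k8s init --help
$ # the URL below is a placeholder and --with-github-actions is an assumed flag name
$ kedro airflow-k8s init https://airflow.example.com --with-github-actions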

list-pipelines

The list-pipelines command lists all pipelines generated by this plugin that exist on the Airflow server. All generated DAGs are tagged with generated_with_kedro_airflow_k8s:$PLUGIN_VERSION, and the prefix of this tag is used to distinguish it from the other tags.
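
For example (the environment name is illustrative):

$ kedro airflow-k8s -e pipelines list-pipelines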

run-once

The run-once command generates a DAG from the pipeline, uploads it to the Airflow DAG location and triggers the DAG run as soon as the new DAG instance is available. It can optionally wait for the DAG run to complete, checking whether a success status is returned.
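
For example (the environment name is illustrative; use kedro airflow-k8s run-once --help to check the exact option for waiting on completion):

$ # "pipelines" is a sample environment name
$ kedro airflow-k8s -e pipelines run-once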

schedule

The schedule command takes three arguments: the name of the directory containing the configuration (relative to the conf folder), the output location of the generated DAG, and a cron-like expression used as the Airflow DAG schedule_interval.
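
A hedged sketch; the environment name and output location are placeholders, and the option names are assumptions, so verify them with kedro airflow-k8s schedule --help:

$ # all values are placeholders; --output and --cron-expression are assumed option names
$ kedro airflow-k8s -e pipelines schedule --output gs://airflow-dags-bucket/dags --cron-expression '@daily'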

ui

The ui command simplifies access to the Apache Airflow console. It also allows opening the UI for a specific DAG.
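
For example, to open the Airflow UI in a new browser tab:

$ kedro airflow-k8s ui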

upload-pipeline

The upload-pipeline command takes two arguments: the name of the directory containing the configuration (relative to the conf folder) and the output location of the generated DAG.
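
A hedged example; the environment name and output location are placeholders and the option name is an assumption (verify with kedro airflow-k8s upload-pipeline --help):

$ # placeholders throughout; --output is an assumed option name
$ kedro airflow-k8s -e pipelines upload-pipeline --output gs://airflow-dags-bucket/dags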