
How to run a Jupyter notebook in Airflow

My code is written in Jupyter and saved in .ipynb format.

We want to use Airflow to schedule the execution and define the dependencies.

How can the notebooks be executed in Airflow?

I know I can convert them to Python files first, but the graphs generated on the fly will be difficult to handle.

Is there an easier solution? Thanks.

Icarus asked Jul 28 '18 17:07



1 Answer

Another alternative is to use Ploomber (disclaimer: I'm the author). It uses papermill under the hood to build multi-stage pipelines. Tasks can be notebooks, scripts, functions, or any combination of them. You can run pipelines locally, on Airflow, or on Kubernetes (using Argo Workflows).
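Since papermill does the notebook execution under the hood, here is a rough sketch of what a single notebook run looks like when you call papermill's command-line interface from Python. The helper names (`papermill_command`, `run_notebook`) and the `run_date` parameter are made up for illustration, and this is not how Ploomber itself is implemented.

```python
import shlex
import subprocess


def papermill_command(input_nb, output_nb, parameters=None):
    """Build the papermill CLI call that executes one notebook."""
    cmd = ["papermill", input_nb, output_nb]
    # Each "-p NAME VALUE" pair is injected after the cell tagged "parameters".
    for name, value in (parameters or {}).items():
        cmd += ["-p", name, str(value)]
    return cmd


def run_notebook(input_nb, output_nb, parameters=None):
    """Execute the notebook; raises CalledProcessError if execution fails."""
    subprocess.run(papermill_command(input_nb, output_nb, parameters), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this works without papermill.
    cmd = papermill_command("notebook.ipynb", "output.ipynb", {"run_date": "2022-10-10"})
    print(shlex.join(cmd))
```

The executed copy (`output.ipynb` here) keeps the cell outputs, graphs included, so there is no need to convert the notebook to a .py file first. In Airflow, a command like this could be handed to a `BashOperator`, or you could try the `PapermillOperator` from the papermill provider package instead.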

This is what a pipeline declaration looks like:

tasks:
  - source: notebook.ipynb
    product:
      nb: output.html
      data: output.csv

  - source: another.ipynb
    product:
      nb: another.html
      data: another.csv
  • Repository
  • Exporting to Airflow
  • Exporting to Kubernetes
  • Sample pipelines
Edu answered Oct 10 '22 17:10
