
Kubeflow: passing a Python dataframe across components?

I am writing a Kubeflow component which reads an input query and creates a dataframe, roughly like this:

from kfp.v2.dsl import component

@component(...)
def read_and_write():
    # read the input query
    # transform the query result into a dataframe
    df = sql.to_dataframe()

How can I pass this dataframe to the next operation in my Kubeflow pipeline? Is this possible, or do I have to save the dataframe as a CSV (or another format) and pass the output path along instead? Thank you.

asked Sep 21 '25 by FrankNrg92

1 Answer

You need to use the concept of an Artifact. Quoting the Kubeflow Pipelines documentation:

Artifacts represent large or complex data structures like datasets or models, and are passed into components as a reference to a file path.

answered Sep 22 '25 by Theofilos Papapanagiotou