Mounting an existing volume from a Kubeflow pipeline: kfp.VolumeOp creates a new volume instead of creating a PVC for the existing volume

I am new to Kubeflow and trying to port/adapt an existing solution to run in Kubeflow Pipelines. The issue I am solving now is that the existing solution shares data via a mounted volume. I know this is not best practice for components exchanging data in Kubeflow, but this is a temporary proof of concept and I have no other choice.

I am facing issues with accessing an existing volume from the pipeline. I am basically running the code from the Kubeflow documentation here, but pointing it at an existing K8s volume:

from kfp import dsl

def volume_op_dag():
    vop = dsl.VolumeOp(
        name="shared-cache",
        resource_name="shared-cache",
        size="5Gi",
        modes=dsl.VOLUME_MODE_RWO
    )

The Volume shared-cache exists:

[screenshot: the existing shared-cache volume]

However, when I run the pipeline, a new volume is created:

[screenshot: a newly created volume]

What am I doing wrong? I obviously don't want to create a new volume every time I run the pipeline; instead, I want to mount an existing one.

Edit: adding Kubeflow versions:

  • kfp (1.8.13)
  • kfp-pipeline-spec (0.1.16)
  • kfp-server-api (1.8.3)
asked Dec 18 '25 by user3197263

2 Answers

Have a look at the function kfp.onprem.mount_pvc. You can find values for the arguments pvc_name and volume_name with the console command kubectl -n <your-namespace> get pvc. The way to use it is to write the component as if the volume were already mounted, then follow the example from the docs when binding it in the pipeline:

from kfp.onprem import mount_pvc

train = train_op(...)
train.apply(mount_pvc('claim-name', 'pipeline', '/mnt/pipeline'))

Also note that both the volume and the pipeline must be in the same namespace.
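For illustration, a minimal sketch of how the pieces fit together; the image, command, mount path /mnt/shared and the claim/volume name shared-cache are placeholders, so adapt them to your own setup:

from kfp import dsl
from kfp.onprem import mount_pvc

@dsl.pipeline(name="shared-cache-pipeline")
def shared_cache_pipeline():
    # Hypothetical step that expects the shared data under /mnt/shared.
    step = dsl.ContainerOp(
        name="use-shared-cache",
        image="busybox",
        command=["sh", "-c", "ls /mnt/shared"],
    )
    # Take the real pvc_name and volume_name from `kubectl -n <your-namespace> get pvc`.
    step.apply(mount_pvc("shared-cache", "shared-cache", "/mnt/shared"))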

answered Dec 23 '25 by Alan


You can use an already existing volume as follows:

volume_name = 'already-existing-pvc-name'  # name of the existing PVC

# Instead of this (which creates a new volume on every run):
task = create_step_prepare_data().add_pvolumes({data_path: vop.volume})

# use this (passing dsl.PipelineVolume(pvc=volume_name)):
task = create_step_prepare_data().add_pvolumes({data_path: dsl.PipelineVolume(pvc=volume_name)})
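For illustration, a minimal sketch of the same idea inside a full pipeline; the image, command, data_path and PVC name are placeholders standing in for your own component and volume:

from kfp import dsl

data_path = '/mnt/shared'                    # path the component expects
existing_pvc = 'already-existing-pvc-name'   # from `kubectl -n <namespace> get pvc`

@dsl.pipeline(name="existing-volume-pipeline")
def existing_volume_pipeline():
    # Hypothetical step standing in for create_step_prepare_data().
    task = dsl.ContainerOp(
        name="prepare-data",
        image="busybox",
        command=["sh", "-c", "ls " + data_path],
    )
    task.add_pvolumes({data_path: dsl.PipelineVolume(pvc=existing_pvc)})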
answered Dec 24 '25 by muss


