
Private cloud GPU virtualization similar to Amazon Web Services Cluster GPU instances

I am searching for options that enable dynamic cloud-based NVIDIA GPU virtualization similar to the way AWS assigns GPUs for Cluster GPU Instances.

My project is standing up an internal cloud. One requirement is the ability to allocate GPUs to virtual machines/instances for server-side CUDA processing.

USC appears to be working on OpenStack enhancements to support this, but they aren't ready yet. This would be exactly what I am looking for if it were fully functional in OpenStack.

NVIDIA VGX seems to support allocating GPUs only to USMs, which is strictly remote-desktop GPU virtualization. If I am wrong and VGX does enable server-side CUDA computing from virtual machines/instances, please let me know.

Bob B asked Jan 24 '13 16:01




1 Answer

"dynamic cloud-based NVIDIA GPU virtualization similar to the way AWS assigns GPUs for Cluster GPU Instances."

AWS does not really allocate GPUs dynamically: each Cluster GPU instance has two fixed GPUs. All other instance types (including the regular Cluster Compute) have no GPUs. That is, there is no API where you can say "GPU or not"; it's fixed to the instance type, which runs on fixed hardware.

The PCI passthrough mode in Xen was made specifically for your use case: passing hardware through from the host to the guest. It's not 'dynamic' by default, but you could write some code that chooses which guest gets each card on the host.
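
As a rough illustration of the "write some code" idea, here is a minimal Python sketch of a per-host GPU pool. It only does the bookkeeping and builds the command you would run; it does not execute anything. The class name `GpuPool`, the guest names, and the PCI addresses are hypothetical, and it assumes the Xen `xl` toolstack (`xl pci-attach` / `xl pci-detach`) with the cards already made assignable on the host. Treat it as a starting point, not a tested allocator.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GpuPool:
    """Tracks which host GPUs (by PCI address) are handed to which guest.

    Hypothetical helper: it only builds `xl` command lines, it never runs them.
    """
    free: List[str]
    assigned: Dict[str, str] = field(default_factory=dict)  # PCI address -> guest

    def allocate(self, guest: str) -> Optional[List[str]]:
        """Reserve the first free GPU for `guest`.

        Returns the attach command as an argument list, or None if no card is free.
        """
        if not self.free:
            return None
        bdf = self.free.pop(0)
        self.assigned[bdf] = guest
        return ["xl", "pci-attach", guest, bdf]

    def release(self, bdf: str) -> List[str]:
        """Return a GPU to the pool and build the matching detach command."""
        guest = self.assigned.pop(bdf)
        self.free.append(bdf)
        return ["xl", "pci-detach", guest, bdf]


# Example host with two cards (addresses are made up for illustration).
pool = GpuPool(free=["0000:03:00.0", "0000:04:00.0"])
cmd_a = pool.allocate("guest-a")
cmd_b = pool.allocate("guest-b")
cmd_c = pool.allocate("guest-c")  # no card left, so this is None
print(cmd_a)
print(cmd_b)
print(cmd_c)
```

In a real setup you would pass the returned list to something like `subprocess.run` on the host and handle failures; the point here is only that the "dynamic" part is ordinary bookkeeping layered on top of static passthrough.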

BraveNewCurrency answered Sep 17 '22 17:09