New machine management layer joins vCluster to deliver the industry’s first unified AI infrastructure stack, from bare metal to AI-ready environments
(NVIDIA GTC Booth 206) -- vCluster Labs today introduced vMetal, a new bare metal machine management layer designed to help Neocloud providers and AI factories provision and operate GPU infrastructure at scale. vMetal automates the lifecycle of bare metal GPU servers, from initial provisioning and machine assignment to upgrades and repurposing, allowing infrastructure operators to manage physical compute with cloud-like automation.
Together with vCluster’s tenant orchestration and vCluster Certified Stacks for AI environments, the vCluster Platform now delivers the industry’s first unified AI infrastructure stack, allowing organizations to run AI workloads from raw hardware to production-ready environments through a single platform.
“The profitability of Neoclouds directly rides on their ability to address two conflicting challenges. They need to squeeze the most money out of their absurdly expensive GPU hardware by optimally locating these GPUs across customers while at the same time ensuring a degree of isolation that goes far beyond using standard Kubernetes namespaces,” said Torsten Volk, Principal Analyst, Application Modernization, Omdia. “Filling the current gap between the physical hardware and the point where Kubernetes takes over directly impacts the efficiency and security of GPU allocation.”
As organizations race to build GPU clouds and AI factories, they face fragmentation across every layer of infrastructure, from hardware provisioning and machine lifecycle management to cluster orchestration, GPU scheduling, and AI tooling. The vCluster Platform addresses this challenge by introducing vMetal at the infrastructure foundation while expanding upward into certified AI environments.
Organizations can deploy this platform consistently across public cloud, private data centers, GPU Neocloud providers, and vCluster Standalone deployments, enabling a common operational model regardless of where the underlying infrastructure resides.
The AI Infrastructure Stack: From Bare Metal to AI Workloads
The vCluster Platform now delivers a complete AI infrastructure stack spanning physical machines, Kubernetes environments, and AI-ready application stacks.
vMetal: Bare Metal Machine Management
At the foundation of the stack is vMetal, which provides automated lifecycle management for bare metal GPU servers.
vMetal allows infrastructure operators to discover, provision, assign, upgrade, and repurpose physical machines through a centralized control plane. Hardware connected to the network can be automatically discovered and provisioned, then assigned directly to teams as dedicated bare metal machines or attached to Kubernetes clusters to provide elastic GPU capacity.
This capability enables Neocloud providers and AI factories to deliver GPU infrastructure with cloud-like automation while operating on their own hardware.
Through automated bare metal lifecycle management built directly into the vCluster Platform, servers become programmable infrastructure, allowing teams to provision, upgrade, repurpose, and decommission capacity using the same workflows across environments.
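As a purely illustrative sketch, the lifecycle described above (discover, provision, assign, upgrade, repurpose, decommission) can be modeled as a small state machine. This announcement does not describe vMetal's actual API, so every name below (`MachineState`, `Machine`, `ALLOWED`, `repurpose`) and the transition table itself are hypothetical assumptions, not the product's interface.

```python
from enum import Enum


class MachineState(Enum):
    """Hypothetical lifecycle states for a bare metal server."""
    DISCOVERED = "discovered"
    PROVISIONED = "provisioned"
    ASSIGNED = "assigned"
    UPGRADING = "upgrading"
    DECOMMISSIONED = "decommissioned"


# Assumed transition table: which states may follow which.
ALLOWED = {
    MachineState.DISCOVERED: {MachineState.PROVISIONED, MachineState.DECOMMISSIONED},
    MachineState.PROVISIONED: {
        MachineState.ASSIGNED,
        MachineState.UPGRADING,
        MachineState.DECOMMISSIONED,
    },
    MachineState.ASSIGNED: {MachineState.PROVISIONED, MachineState.UPGRADING},
    MachineState.UPGRADING: {MachineState.PROVISIONED},
    MachineState.DECOMMISSIONED: set(),
}


class Machine:
    """Tracks one server's current state and the path it took to get there."""

    def __init__(self, name):
        self.name = name
        self.state = MachineState.DISCOVERED
        self.history = [self.state]

    def transition(self, target):
        # Reject any move the transition table does not allow.
        if target not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.name}: cannot go from {self.state.value} to {target.value}"
            )
        self.state = target
        self.history.append(target)


def repurpose(machine):
    """Release an assigned machine and assign it again (e.g. to another team)."""
    machine.transition(MachineState.PROVISIONED)
    machine.transition(MachineState.ASSIGNED)
```

In this sketch, repurposing is simply a release followed by a new assignment, so it reuses the same guarded transitions as every other lifecycle step.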
vCluster: Tenant and Cluster Orchestration
Above the physical infrastructure layer, vCluster provides tenant and cluster orchestration, enabling secure multi-tenant Kubernetes environments on shared GPU infrastructure.
Virtual clusters allow platform teams to isolate workloads, deliver self-service environments, and consolidate infrastructure while maintaining strong tenant boundaries.
vCluster Certified Stacks: AI-Ready Environments
At the application layer, vCluster Certified Stacks deliver production-ready AI environments that combine tenancy configuration, isolation policies, and AI tooling into a single deployable blueprint.
The first Certified Stacks focus on deep integration with NVIDIA Run:ai, supporting multiple tenancy models across both hard and soft isolation boundaries. Certified Stacks are delivered as maintained Terraform blueprints designed to be forked, extended, and customized to meet the needs of different platform teams.
“Running AI at scale requires more than GPUs — it requires a platform that can coordinate scheduling, isolation, and infrastructure across the entire stack,” said Omri Geller, Vice President at NVIDIA. “vCluster Certified Stacks allow organizations to deploy NVIDIA Run:ai as a production-grade AI platform using a multi-tenant architecture that provides strong tenant isolation.”
vCluster has been validated by NVIDIA as Run:ai conformant, ensuring organizations can deploy Run:ai-powered GPU orchestration within vCluster environments while maintaining the performance, fairness policies, and scheduling controls required for large-scale AI workloads.
The initial Certified Stacks also launch with support for open-source Slinky, with additional AI frameworks including Jupyter and SkyPilot planned for future releases.
Certified Stacks enable platform teams to deliver repeatable AI environments with governance and guardrails built in from day one, eliminating the need to assemble infrastructure and tooling manually for each team.
“Our vision has always been to make infrastructure as dynamic as the workloads it runs,” said Lukas Gentele, CEO of vCluster Labs. “With vMetal managing machines, vCluster orchestrating tenants, and Certified Stacks delivering AI environments, we’re giving infrastructure teams a single platform to run AI, from bare metal servers to production-ready AI environments.”
Availability
The new release is available now. For more information, visit: vmetal.ai
About vCluster
vCluster Labs provides the leading platform for operating GPU infrastructure. It enables GPU clouds to deliver a hyperscaler-like experience to their customers, and AI factories to build that same experience for their internal teams. Its technology delivers the full operational stack operators need to run their GPU data centers (managed Kubernetes, fast isolated tenant provisioning, and automated node provisioning and lifecycle management), enabling them to accelerate time to value, reduce operational burden, and maximize the ROI of every GPU. Trusted by fast-growing neoclouds and NVIDIA Cloud Partners, with an NVIDIA-validated reference architecture for DGX systems, vCluster helps operators turn GPU hardware into scalable AI factories. Outside of AI infrastructure, enterprises including FICO, GoFundMe, and Aussie Broadband use vCluster to deliver consistent, self-service Kubernetes platforms across multi-cloud and hybrid environments. Learn more at www.vcluster.com.
View source version on businesswire.com: https://www.businesswire.com/news/home/20260317651819/en/
Contacts
Media Contact:
Heather Fitzsimmons
Mindshare PR
heather@mindsharepr.com
650-279-4360
