Kubernetes v1.36 DRA: Smarter Resource Allocation with Priority, Taints, and Partitioning

Dynamic Resource Allocation (DRA) has fundamentally changed how platform administrators handle hardware accelerators and specialized resources in Kubernetes. In the v1.36 release, DRA continues to mature, bringing feature graduations, usability improvements, and new capabilities, including the extension of DRA to native resources like memory and CPU and support for ResourceClaims in PodGroups. Driver availability also continues to expand: beyond specialized compute accelerators, the ecosystem includes support for networking and other hardware types, reflecting a move toward a more robust, hardware-agnostic infrastructure. Whether you are managing massive fleets of GPUs, need better handling of failures, or are simply looking for better ways to define resource fallback options, the upgrades to DRA in 1.36 have something for you.

Feature Graduations

The community has been hard at work stabilizing core DRA concepts. In Kubernetes 1.36, several highly anticipated features have graduated to beta or stable.

Prioritized List (Stable)

Hardware heterogeneity is a reality in most clusters. With the Prioritized list feature, you can define fallback preferences when requesting devices. Instead of hardcoding a request for a specific device model, you can specify an ordered list of preferences, for example: "Give me an H100, but if none are available, fall back to an A100." The scheduler evaluates these alternatives in order and uses the first one it can satisfy, improving scheduling flexibility and cluster utilization. Now that the feature is stable, administrators can rely on it in production to maximize hardware usage while minimizing waste.
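The fallback behavior can be illustrated with a small Python sketch. This is a toy model of "try each preference in order", not the scheduler's actual implementation; the function name `pick_device`, the model names, and the inventory dictionary are all hypothetical.

```python
from typing import Optional


def pick_device(preferences: list[str], available: dict[str, int]) -> Optional[str]:
    """Return the first preferred model that still has a free device.

    `preferences` is ordered from most to least preferred. `available`
    maps a model name to the number of free devices (hypothetical data).
    """
    for model in preferences:
        if available.get(model, 0) > 0:
            available[model] -= 1  # reserve one device of this model
            return model
    return None  # no alternative could be satisfied


# Hypothetical inventory: no H100 is free, two A100s are.
inventory = {"h100": 0, "a100": 2}
print(pick_device(["h100", "a100"], inventory))  # prints: a100
```

The request does not fail just because the first choice is unavailable; it only fails when every entry in the list has been tried.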

Extended Resource Support (Beta)

As DRA becomes the standard for resource allocation, bridging the gap with legacy systems is crucial. The Extended resource feature allows users to request DRA-managed devices using the traditional extended resource syntax in a Pod's resource requests. This enables a gradual transition: cluster operators can migrate clusters to DRA while application developers adopt the ResourceClaim API on their own schedule. This beta feature provides a smooth migration path, reducing disruption for existing workflows.
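Conceptually, the bridge is a lookup from an extended resource name to a DRA device class. The Python sketch below models that idea only; the function `to_device_requests`, the resource name `example.com/gpu`, the class name `gpu-class`, and the output dictionary shape are illustrative assumptions, not the Kubernetes API.

```python
def to_device_requests(container_requests: dict, class_by_extended_resource: dict) -> list:
    """Translate extended resource requests into DRA-style device requests.

    Resources with no entry in the mapping (cpu, memory, ...) are left
    for the normal resource accounting and are skipped here.
    """
    device_requests = []
    for name, count in container_requests.items():
        device_class = class_by_extended_resource.get(name)
        if device_class is None:
            continue  # not backed by DRA in this hypothetical cluster
        device_requests.append({"deviceClass": device_class, "count": count})
    return device_requests


# Hypothetical mapping maintained by the cluster operator.
mapping = {"example.com/gpu": "gpu-class"}
# What an application developer keeps writing in the Pod spec.
requests = {"cpu": "500m", "example.com/gpu": 2}
print(to_device_requests(requests, mapping))
```

The application's Pod spec stays unchanged; only the operator-side mapping decides whether a resource name is served through DRA.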

Partitionable Devices (Beta)

Hardware accelerators are powerful, and sometimes a single workload doesn't need an entire device. The Partitionable devices feature provides native DRA support for dynamically carving physical hardware into smaller, logical instances (such as Multi-Instance GPUs) based on workload demands. This allows administrators to safely and efficiently share expensive accelerators across multiple Pods, improving overall cluster density and cost efficiency.
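One way to picture partitioning is as a set of shared counters on the physical device that each logical instance draws from. The following Python sketch assumes that model; the class `PhysicalDevice`, the counter names `memory_gb` and `slices`, and the capacities are hypothetical and do not come from any real driver.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDevice:
    """A device whose capacity is shared by its logical partitions."""
    capacity: dict[str, int]
    used: dict[str, int] = field(default_factory=dict)

    def try_allocate(self, partition: dict[str, int]) -> bool:
        """Reserve a partition if every counter it needs still fits."""
        for counter, amount in partition.items():
            if self.used.get(counter, 0) + amount > self.capacity.get(counter, 0):
                return False  # would oversubscribe the physical device
        for counter, amount in partition.items():
            self.used[counter] = self.used.get(counter, 0) + amount
        return True


# Hypothetical GPU with 40 GB of memory and 7 compute slices.
gpu = PhysicalDevice(capacity={"memory_gb": 40, "slices": 7})
print(gpu.try_allocate({"memory_gb": 20, "slices": 3}))  # prints: True
print(gpu.try_allocate({"memory_gb": 20, "slices": 3}))  # prints: True
print(gpu.try_allocate({"memory_gb": 10, "slices": 1}))  # prints: False
```

The third request is rejected because the memory counter is exhausted, even though a compute slice is still free. Checking all counters before reserving any keeps a failed request from leaving partial reservations behind.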

Device Taints (Beta)

Just as you can taint a Kubernetes Node, you can apply taints directly to specific DRA devices. Device taints and tolerations empower cluster administrators to manage hardware more effectively. You can taint faulty devices to prevent them from being allocated to standard claims, or reserve specific hardware for dedicated teams, specialized workloads, and experiments. Only Pods with matching tolerations are permitted to claim these tainted devices, giving you fine-grained control over hardware allocation.
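The matching rule can be sketched in a few lines of Python. This is a simplified model that borrows the familiar `Equal` and `Exists` operators from node taints and ignores taint effects; the classes `Taint` and `Toleration`, the function `can_allocate`, and the taint keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""

    def tolerates(self, taint: Taint) -> bool:
        if self.key != taint.key:
            return False
        # "Exists" matches any value; "Equal" requires the same value.
        return self.operator == "Exists" or self.value == taint.value


def can_allocate(taints: list[Taint], tolerations: list[Toleration]) -> bool:
    """A device is usable only if every one of its taints is tolerated."""
    return all(any(tol.tolerates(t) for tol in tolerations) for t in taints)


# Hypothetical device reserved for a research team.
device_taints = [Taint("example.com/reserved-for", "research")]
print(can_allocate(device_taints, []))  # prints: False
print(can_allocate(device_taints, [Toleration("example.com/reserved-for", value="research")]))  # prints: True
```

An untainted device has an empty taint list, so `can_allocate` returns `True` for it regardless of tolerations.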

Device Binding Conditions (Beta)

To improve scheduling reliability, the Device binding conditions feature introduces conditions that must be satisfied before a Pod that uses a device is bound to its node. This allows administrators to ensure devices are in a proper state (e.g., powered on, not in maintenance) before the workload is committed to them. The scheduler waits for these conditions to be met, reducing scheduling failures and improving workload stability. This beta feature is essential for environments where device readiness is not guaranteed.
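The wait-until-ready behavior amounts to a small decision function over reported conditions. The Python sketch below is a minimal model under that assumption; the function `binding_state`, the condition names, and the three result strings are hypothetical rather than taken from the Kubernetes API.

```python
def binding_state(required: list[str], failure: list[str], status: dict[str, bool]) -> str:
    """Decide whether binding can proceed for a device.

    `required` lists conditions that must all be true, `failure` lists
    conditions that abort binding if any is true, and `status` holds the
    conditions reported so far (missing entries count as not yet true).
    """
    if any(status.get(name, False) for name in failure):
        return "failed"   # give up on this device
    if all(status.get(name, False) for name in required):
        return "ready"    # safe to proceed
    return "waiting"      # keep waiting for the device


required = ["PoweredOn", "NotInMaintenance"]  # hypothetical condition names
failure = ["AttachFailed"]

print(binding_state(required, failure, {"PoweredOn": True}))  # prints: waiting
print(binding_state(required, failure, {"PoweredOn": True, "NotInMaintenance": True}))  # prints: ready
```

Failure conditions are checked first, so a device that reports a failure is never treated as ready even if its required conditions are also true.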

Conclusion

Kubernetes v1.36 marks a significant step forward for Dynamic Resource Allocation. With the prioritized list graduating to stable and several features reaching beta (extended resource support, partitionable devices, device taints, and device binding conditions), DRA becomes more flexible, reliable, and production-ready. As the driver ecosystem continues to expand, platform administrators have better tools to manage heterogeneous hardware at scale. Whether you are optimizing GPU utilization or integrating with legacy resource models, the advances in DRA in v1.36 pave the way for the next era of resource management in Kubernetes.