Scaling Horizontally with Multiple Arrays, One Backend Name
In Part 1, I covered the fundamentals of intent-based volume types: how to use capabilities and filters to let the Cinder scheduler match workload requirements with backend characteristics. I described the storage infrastructure, defined workload personas, and let the scheduler do the matching.
But there’s a powerful scaling pattern we haven’t explored yet: what happens when you configure multiple identical arrays with the same volume_backend_name?
This seemingly simple configuration choice transforms the scheduler from a matchmaker into an intelligent load distributor. Let’s explore how.
The Pattern: Multiple Arrays, One Logical Pool
When you have multiple arrays with identical capabilities—same model, same performance tier, same feature set—you can treat them as a single logical pool by giving them the same volume_backend_name.
The scheduler then automatically load balances across them, using real-time metrics to pick the optimal array for each request.
Configuration Example
Here are three FlashArray//X arrays configured as a single performance tier:
[flasharray_x_01]
volume_backend_name = pure_performance # Same name
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
san_ip = 10.1.0.10
pure_api_token = <secret>
use_chap_auth = False
extra_capabilities = {
"array_model": "FlashArray//X",
"avg_latency_ms": 0.35,
"avg_throughput_mbps": 5200,
"dedupe_ratio": 4.5,
"max_volume_size_gb": 8000
}
[flasharray_x_02]
volume_backend_name = pure_performance # Same name
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
san_ip = 10.1.0.11
pure_api_token = <secret>
use_chap_auth = False
extra_capabilities = {
"array_model": "FlashArray//X",
"avg_latency_ms": 0.35,
"avg_throughput_mbps": 5200,
"dedupe_ratio": 4.5,
"max_volume_size_gb": 8000
}
[flasharray_x_03]
volume_backend_name = pure_performance # Same name
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
san_ip = 10.1.0.12
pure_api_token = <secret>
use_chap_auth = False
extra_capabilities = {
"array_model": "FlashArray//X",
"avg_latency_ms": 0.35,
"avg_throughput_mbps": 5200,
"dedupe_ratio": 4.5,
"max_volume_size_gb": 8000
}
Notice they all share volume_backend_name = pure_performance.
How the Scheduler Handles It
When a volume type requests volume_backend_name=pure_performance, all three arrays become candidates. The scheduler then:
- Applies filters – All three pass because they have identical static capabilities
- Invokes weighers – Evaluates each candidate based on:
  - Free capacity (CapacityWeigher)
  - Current load via driver-reported metrics
  - Goodness scores, if configured
- Selects the best – Automatically picks the least-loaded or most suitable array
The key is that each array reports its own dynamic metrics independently.
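The filter-then-weigh flow can be sketched in a few lines. This is an illustrative model, not Cinder's actual scheduler classes: the backend names and free-capacity figures are made up, and the weigher is reduced to a CapacityWeigher-style "most free space wins" rule.

```python
# A minimal sketch of the scheduler's filter-then-weigh flow for three
# backends that share one volume_backend_name. Names and numbers are
# hypothetical; real Cinder weighers are pluggable and configurable.

def schedule(request, backends):
    # Filter: keep only backends whose backend name matches the request.
    candidates = [b for b in backends
                  if b["volume_backend_name"] == request["volume_backend_name"]]
    # Weigh: prefer the backend with the most free capacity.
    return max(candidates, key=lambda b: b["free_capacity_gb"])

backends = [
    {"name": "flasharray_x_01", "volume_backend_name": "pure_performance", "free_capacity_gb": 1200},
    {"name": "flasharray_x_02", "volume_backend_name": "pure_performance", "free_capacity_gb": 3400},
    {"name": "flasharray_x_03", "volume_backend_name": "pure_performance", "free_capacity_gb": 2100},
]
chosen = schedule({"volume_backend_name": "pure_performance"}, backends)
print(chosen["name"])  # flasharray_x_02: most free space among the candidates
```

Because every array reports its own stats, the "weigh" step naturally shifts new volumes toward whichever array currently has the most headroom.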
Driver-Reported Metrics Enable Smart Distribution
Remember from Part 1 that the Pure Storage driver reports live performance metrics with every stats update. When you have multiple arrays with the same backend name, these metrics become the scheduler’s load balancing signals:
Workload metrics:
- total_volumes – Number of volumes currently on the array
- total_snapshots – Number of snapshots currently on the array
- total_hosts – Number of hosts connected to the array
- total_pgroups – Number of protection groups configured
Performance metrics:
- writes_per_sec – Current write operations per second
- reads_per_sec – Current read operations per second
- input_per_sec – Current input bandwidth (bytes/sec)
- output_per_sec – Current output bandwidth (bytes/sec)
Latency metrics:
- usec_per_read_op – Average microseconds per read operation
- usec_per_write_op – Average microseconds per write operation
- queue_usec_per_read_op – Average queue time per read (microseconds)
- queue_usec_per_write_op – Average queue time per write (microseconds)
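To make the role of these metrics concrete, here is a sketch of the kind of per-backend stats payload the scheduler consumes, with a toy composite load score. The field values are invented and the scoring formula is hypothetical; only the metric key names mirror the driver-reported metrics listed above.

```python
# Illustrative per-backend stats carrying a subset of the metrics above.
# Values are made up for the example.
stats_01 = {
    "volume_backend_name": "pure_performance",
    "total_volumes": 420,
    "writes_per_sec": 18000,
    "reads_per_sec": 25000,
    "usec_per_write_op": 280,
    "usec_per_read_op": 310,
}
stats_02 = {**stats_01, "total_volumes": 780, "writes_per_sec": 39000}

def load_score(stats):
    # A toy composite score: higher means busier. Real schedulers use
    # configurable weighers; this only shows how metrics could combine.
    return stats["total_volumes"] / 1000 + stats["writes_per_sec"] / 50000

print(load_score(stats_01) < load_score(stats_02))  # True: array 01 is less loaded
```

A scheduler comparing these two payloads would steer the next volume toward the first array, which is carrying fewer volumes and less write traffic.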
Using Metrics for Load Distribution
You can use these metrics in your volume type to guide load distribution:
# Create a performance tier that steers away from heavily loaded arrays
openstack volume type create high-perf
openstack volume type set high-perf \
  --property volume_backend_name=pure_performance \
  --property capabilities:array_model="FlashArray//X" \
  --property capabilities:total_volumes="<= 800" \
  --property capabilities:writes_per_sec="<= 40000"
Now when users create volumes with this type:
- All three arrays match the static capabilities
- Arrays reporting more than 800 volumes are filtered out
- Arrays handling more than 40,000 writes/sec are filtered out
- Among the arrays that pass, the weighers (CapacityWeigher by default) pick the best candidate
Note that these capability constraints are hard filters, not preferences: if every array exceeds the thresholds, no candidate passes and the request fails to schedule. Set the thresholds with headroom, or rely on weighers alone if you never want a hard cutoff.
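The operator syntax in those extra specs ("<= 800") can be modeled with a small evaluator. This is a simplified, hypothetical parser, not Cinder's CapabilitiesFilter implementation; the array names and metric values are made up.

```python
# A sketch of evaluating numeric capability constraints like "<= 800"
# against each backend's reported metrics.
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt,
       ">": operator.gt, "==": operator.eq}

def passes(constraint, value):
    op_token, threshold = constraint.split(None, 1)
    return OPS[op_token](value, float(threshold))

specs = {"total_volumes": "<= 800", "writes_per_sec": "<= 40000"}
arrays = {
    "flasharray_x_01": {"total_volumes": 420, "writes_per_sec": 18000},
    "flasharray_x_02": {"total_volumes": 950, "writes_per_sec": 22000},  # too many volumes
}
eligible = [name for name, metrics in arrays.items()
            if all(passes(spec, metrics[key]) for key, spec in specs.items())]
print(eligible)  # ['flasharray_x_01']
```

The second array fails the total_volumes constraint and drops out entirely, which is exactly why hard metric thresholds deserve headroom.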
Benefits of This Pattern
Automatic Load Balancing
No manual intervention needed. The scheduler continuously evaluates real-time metrics:
- total_volumes prevents any single array from becoming a hotspot
- writes_per_sec and reads_per_sec balance I/O workload
- usec_per_read_op and usec_per_write_op avoid saturated arrays
Hot spots naturally get avoided as their metrics rise above thresholds.
Simplified Operations
Adding capacity is trivial:
- Create a new backend stanza with the same volume_backend_name
- Enable the backend
- The scheduler immediately includes it in the pool
No volume type changes. No user communication needed.
Removing arrays for maintenance:
- Disable the backend
- Existing volumes stay accessible
- New volumes go to remaining arrays
- Re-enable when maintenance completes
Transparent to Users
Users request “high-performance” storage. The scheduler picks the optimal array. Users never know or care which physical array they’re on—they just get consistent performance from the logical pool.
This is the essence of infrastructure abstraction.
When to Use the Same Backend Name
Use identical volume_backend_name values when:
✅ Arrays have the same capabilities and performance profile
- Same model (all FlashArray//X or all FlashArray//C)
- Same feature set (replication, thin provisioning, etc.)
- Same performance characteristics
✅ They’re in the same failure domain or availability zone
- Load balancing within a zone is desired
- No data sovereignty or residency constraints
✅ You want automatic load distribution
- No need for manual placement control
- Trust the scheduler to optimize
✅ Management and operational procedures are identical
- Same monitoring, same backup policies, same support tier
When to Use Different Backend Names
Use different volume_backend_name values when:
❌ Arrays have different capabilities
- FlashArray//X vs FlashArray//C (different performance tiers)
- Different maximum volume sizes
- Different feature support
❌ They’re in different failure domains
- Need explicit availability zone control
- Data residency or compliance requires isolation
❌ You need explicit control over placement
- Certain workloads must run on specific arrays
- Testing or validation requires predictable placement
❌ Compliance or data residency requires isolation
- Data must stay in specific data centers
- Regulatory requirements prevent mixing
Advanced Pattern: Combining with Availability Zones
You can combine this pattern with availability zones for zone-aware load balancing:
# Data Center 1
[flasharray_x_dc1_01]
volume_backend_name = pure_performance
backend_availability_zone = dc1
san_ip = 10.1.0.10
# ... rest of config
[flasharray_x_dc1_02]
volume_backend_name = pure_performance
backend_availability_zone = dc1
san_ip = 10.1.0.11
# ... rest of config
# Data Center 2
[flasharray_x_dc2_01]
volume_backend_name = pure_performance
backend_availability_zone = dc2
san_ip = 10.2.0.10
# ... rest of config
[flasharray_x_dc2_02]
volume_backend_name = pure_performance
backend_availability_zone = dc2
san_ip = 10.2.0.11
# ... rest of config
Now the scheduler:
- Load-balances within each availability zone
- Respects zone placement requests from users
- Maintains availability zone isolation
Users can request specific zones:
openstack volume create --size 100 \
--type high-perf \
--availability-zone dc1 \
my-volume
The scheduler picks the best array within dc1 automatically.
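Zone-aware selection is the same filter-then-weigh flow with one extra filter step. The sketch below is illustrative: pool names, zone labels, and volume counts are invented, and "least loaded" stands in for whatever weighers you have configured.

```python
# A sketch of zone-aware selection: restrict candidates to the requested
# availability zone, then pick the least-loaded array within it.
pools = [
    {"name": "flasharray_x_dc1_01", "az": "dc1", "total_volumes": 510},
    {"name": "flasharray_x_dc1_02", "az": "dc1", "total_volumes": 330},
    {"name": "flasharray_x_dc2_01", "az": "dc2", "total_volumes": 120},
]

def pick(pools, az):
    in_zone = [p for p in pools if p["az"] == az]       # zone filter
    return min(in_zone, key=lambda p: p["total_volumes"])  # load weigher

print(pick(pools, "dc1")["name"])  # flasharray_x_dc1_02: least loaded in dc1
```

Even though a dc2 array is the least loaded overall, the zone filter keeps the dc1 request inside dc1, and load balancing happens among the survivors.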
Real-World Example: Growing from 2 to 6 Arrays
Let’s walk through a realistic growth scenario.
Starting State: Two Arrays
[flasharray_x_01]
volume_backend_name = pure_performance
# ... config
[flasharray_x_02]
volume_backend_name = pure_performance
# ... config
Volume type:
openstack volume type create high-perf
openstack volume type set high-perf \
--property volume_backend_name=pure_performance \
--property capabilities:array_model="FlashArray//X"
The scheduler load-balances between the two arrays.
Growth Phase 1: Adding Capacity
Business grows. You add two more arrays:
[flasharray_x_03]
volume_backend_name = pure_performance
# ... config
[flasharray_x_04]
volume_backend_name = pure_performance
# ... config
What changed:
- Configuration: Added two stanzas
- Volume types: Nothing
- User experience: Nothing
The scheduler now distributes across four arrays.
Growth Phase 2: Multi-Site Deployment
You expand to a second data center:
# DC1 arrays (existing)
[flasharray_x_dc1_01]
volume_backend_name = pure_performance
backend_availability_zone = dc1
# ... config
[flasharray_x_dc1_02]
volume_backend_name = pure_performance
backend_availability_zone = dc1
# ... config
# DC2 arrays (new)
[flasharray_x_dc2_01]
volume_backend_name = pure_performance
backend_availability_zone = dc2
# ... config
[flasharray_x_dc2_02]
volume_backend_name = pure_performance
backend_availability_zone = dc2
# ... config
What changed:
- Configuration: Added backend_availability_zone to all backends, added two DC2 stanzas
- Volume types: Nothing (but users can now request specific zones)
- User experience: Optional zone selection now available
Users who don’t specify a zone get automatic distribution across all six arrays. Users who need zone affinity can request it explicitly.
What Didn’t Change
Through all this growth:
- Volume type definitions stayed stable
- No application code changes
- No user retraining
- No manual load balancing
The scheduler adapted automatically.
Observability: Seeing Load Distribution
To see how the scheduler is distributing load, use:
# List cinder-volume services and confirm all backends are up
openstack volume service list --service cinder-volume
# See detailed pool stats
cinder get-pools --detail
# For each pool matching "pure_performance"
cinder get-pools --detail | grep -A 30 "pure_performance"
You’ll see metrics for each array:
- total_capacity_gb
- free_capacity_gb
- allocated_capacity_gb
- total_volumes
- Custom capabilities, including driver-reported metrics
This gives you visibility into how evenly load is distributed.
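One simple way to quantify "how evenly" is to compare per-pool volume counts pulled from that output. The sketch below uses invented counts and a crude max/min imbalance ratio; it is a monitoring heuristic, not anything Cinder computes for you.

```python
# A sketch of checking how evenly volumes are spread across pools, using
# the per-pool total_volumes values read from `cinder get-pools --detail`.
# Counts are illustrative.
pool_volumes = {"flasharray_x_01": 410, "flasharray_x_02": 395, "flasharray_x_03": 430}

def imbalance(counts):
    values = list(counts.values())
    return max(values) / min(values)  # 1.0 means perfectly even

ratio = imbalance(pool_volumes)
print(f"imbalance ratio: {ratio:.2f}")
```

A ratio creeping well above 1.0 is a cue to check for hidden differences between the arrays, which leads into the pitfalls below.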
Common Pitfalls and How to Avoid Them
Pitfall 1: Hidden Capacity Differences
Problem: Arrays have different usable capacity due to different data reduction ratios.
Solution: Monitor driver-reported dedupe_ratio and compression_ratio. If they diverge significantly, investigate.
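That monitoring can be as simple as flagging arrays whose reduction ratio strays too far from the fleet average. The ratios and the 25% tolerance below are invented for illustration.

```python
# A sketch of flagging arrays whose data-reduction ratio diverges from the
# fleet mean by more than a tolerance. Ratios are illustrative.
ratios = {"flasharray_x_01": 4.6, "flasharray_x_02": 4.4, "flasharray_x_03": 2.9}

def outliers(ratios, tolerance=0.25):
    mean = sum(ratios.values()) / len(ratios)
    return [name for name, r in ratios.items() if abs(r - mean) / mean > tolerance]

print(outliers(ratios))  # ['flasharray_x_03'] diverges and deserves a look
```

An array reducing data much worse than its peers has less usable capacity than its raw numbers suggest, which quietly skews capacity-based weighing.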
Pitfall 2: Network Topology Matters
Problem: All arrays look identical in config, but some have better network paths to compute.
Solution: Either:
- Use availability zones to isolate by network topology
- Accept slightly uneven distribution as network routing does its job
- Use affinity/anti-affinity hints for latency-critical workloads
Pitfall 3: Forgetting to Update Volume Types
Problem: You add arrays but forget volume types still filter on metrics like total_volumes="< 500".
Solution: If using hard metric constraints, review them when scaling. Consider removing hard constraints and relying on weighers instead.
Why This Pattern Scales
This approach scales operationally because:
✅ Horizontal scaling is transparent
- Add arrays without touching volume types
- Remove arrays without user impact (if volumes can migrate)
- Capacity grows linearly
✅ The scheduler distributes load automatically
- No manual placement decisions
- No “which array should this go on?” questions
- Load balancing adjusts continuously as conditions change
✅ Operations stay simple
- One volume type serves many arrays
- Configuration is declarative and repeatable
- Troubleshooting focuses on capabilities, not individual arrays
✅ Users stay abstracted from infrastructure
- They select workload personas (“high-perf”, “large-capacity”)
- They never know which physical array serves them
- Infrastructure can change beneath them without disruption
Key Takeaways
The same volume_backend_name pattern enables:
- Automatic load balancing across identical arrays
- Transparent horizontal scaling
- Simplified volume type management
- Clean separation between user intent and infrastructure topology
Driver-reported metrics make it smart:
- Real-time load distribution based on actual array state
- Avoidance of hot spots and saturated arrays
- Automatic adaptation as conditions change
It works because the scheduler:
- Treats backend name as a logical pool identifier
- Evaluates all matching backends independently
- Uses metrics and weighers to pick the best candidate
- Makes deterministic, explainable, repeatable decisions
From Matchmaker to Load Balancer
In Part 1, I showed the Cinder scheduler as a quiet matchmaker, connecting workload intent with backend capabilities. In this post, I’ve shown it become something more: an intelligent load balancer that distributes work across a pool of equivalent resources.
The pattern is simple. The configuration is declarative. The results are automatic.
No manual intervention. No placement spreadsheets. No “array affinity” tickets.
Just infrastructure that scales horizontally while users keep requesting what they need, blissfully unaware of the expanding fleet of arrays serving them.
That’s the scheduler doing exactly what it was designed to do.
If you let it.
Volume Migration: The Final Frontier of Scaling
As environments grow, horizontal scaling introduces a new operational reality: eventually some volumes need to move, even when multiple arrays present themselves as a single logical tier. In Part 3, I shift from scheduling theory to hands-on operations and walk through safe, deterministic ways to migrate volumes between backends that share the same volume_backend_name, including when to use retype versus direct migration and what the Cinder feature matrix really means for storage-assisted moves.
