Multiple Arrays, One Backend Name: Horizontal Scaling in OpenStack Cinder

In Part 1, I covered the fundamentals of intent-based volume types: how to use capabilities and filters to let the Cinder scheduler match workload requirements with backend characteristics. I described the storage infrastructure, defined workload personas, and let the scheduler do the matching.

But there’s a powerful scaling pattern we haven’t explored yet: what happens when you configure multiple identical arrays with the same volume_backend_name?

This seemingly simple configuration choice transforms the scheduler from a matchmaker into an intelligent load distributor. Let’s explore how.

The Pattern: Multiple Arrays, One Logical Pool

When you have multiple arrays with identical capabilities—same model, same performance tier, same feature set—you can treat them as a single logical pool by giving them the same volume_backend_name.

The scheduler then automatically load balances across them, using real-time metrics to pick the optimal array for each request.

Configuration Example

Here are three FlashArray//X arrays configured as a single performance tier:

[flasharray_x_01]
# Same volume_backend_name on every stanza
volume_backend_name = pure_performance
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
san_ip = 10.1.0.10
pure_api_token = <secret>
use_chap_auth = False
extra_capabilities = {
  "array_model": "FlashArray//X",
  "avg_latency_ms": 0.35,
  "avg_throughput_mbps": 5200,
  "dedupe_ratio": 4.5,
  "max_volume_size_gb": 8000
}

[flasharray_x_02]
volume_backend_name = pure_performance
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
san_ip = 10.1.0.11
pure_api_token = <secret>
use_chap_auth = False
extra_capabilities = {
  "array_model": "FlashArray//X",
  "avg_latency_ms": 0.35,
  "avg_throughput_mbps": 5200,
  "dedupe_ratio": 4.5,
  "max_volume_size_gb": 8000
}

[flasharray_x_03]
volume_backend_name = pure_performance
volume_driver = cinder.volume.drivers.pure.PureISCSIDriver
san_ip = 10.1.0.12
pure_api_token = <secret>
use_chap_auth = False
extra_capabilities = {
  "array_model": "FlashArray//X",
  "avg_latency_ms": 0.35,
  "avg_throughput_mbps": 5200,
  "dedupe_ratio": 4.5,
  "max_volume_size_gb": 8000
}

Notice they all share volume_backend_name = pure_performance.
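If you want to sanity-check a configuration like this before deploying it, the stanzas parse with Python's standard configparser (a rough stand-in for how oslo.config reads cinder.conf; the trimmed stanza below mirrors the example above, and note that the multi-line extra_capabilities value only survives because the continuation lines are indented):

```python
# Minimal sketch: confirm that several backend stanzas share one
# volume_backend_name and that the multi-line extra_capabilities
# value still parses as JSON. Not Cinder code -- just a config check.
import configparser
import json

conf_text = """
[flasharray_x_01]
volume_backend_name = pure_performance
san_ip = 10.1.0.10
extra_capabilities = {
  "array_model": "FlashArray//X",
  "avg_latency_ms": 0.35}

[flasharray_x_02]
volume_backend_name = pure_performance
san_ip = 10.1.0.11
extra_capabilities = {
  "array_model": "FlashArray//X",
  "avg_latency_ms": 0.35}
"""

parser = configparser.ConfigParser()
parser.read_string(conf_text)

# Every stanza resolves to the same logical pool name.
names = {parser[s]["volume_backend_name"] for s in parser.sections()}
assert names == {"pure_performance"}

# configparser joins the indented continuation lines, so the JSON is intact.
caps = json.loads(parser["flasharray_x_01"]["extra_capabilities"])
print(caps["array_model"])  # FlashArray//X
```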

How the Scheduler Handles It

When a volume type requests volume_backend_name=pure_performance, all three arrays become candidates. The scheduler then:

  1. Applies filters – All three pass because they have identical static capabilities
  2. Invokes weighers – Evaluates each based on:
    • Free capacity (CapacityWeigher)
    • Current load via driver-reported metrics
    • Goodness scores if configured
  3. Selects the best – Automatically picks the least-loaded or most suitable array

The key is that each array reports its own dynamic metrics independently.
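The three-step flow above can be sketched roughly like this. This is a simplified stand-in for the real scheduler (which uses pluggable filter and weigher classes), and the backend stats are invented for illustration:

```python
# Simplified sketch of the scheduler's filter-then-weigh flow.
# Backend stats here are made up; in reality each cinder-volume
# backend reports its own stats to the scheduler periodically.
backends = [
    {"name": "flasharray_x_01", "volume_backend_name": "pure_performance",
     "free_capacity_gb": 4200, "total_volumes": 610},
    {"name": "flasharray_x_02", "volume_backend_name": "pure_performance",
     "free_capacity_gb": 6100, "total_volumes": 480},
    {"name": "flasharray_x_03", "volume_backend_name": "pure_performance",
     "free_capacity_gb": 5900, "total_volumes": 505},
]

def passes_filters(backend, request):
    # Step 1: hard constraints -- static capabilities must match.
    return backend["volume_backend_name"] == request["volume_backend_name"]

def weigh(backend):
    # Step 2: score each survivor. CapacityWeigher-style: more free
    # space wins. Real deployments can stack multiple weighers.
    return backend["free_capacity_gb"]

request = {"volume_backend_name": "pure_performance"}
candidates = [b for b in backends if passes_filters(b, request)]
best = max(candidates, key=weigh)  # Step 3: pick the top-weighted backend
print(best["name"])  # flasharray_x_02 (most free capacity)
```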

Driver-Reported Metrics Enable Smart Distribution

Remember from Part 1 that the Pure Storage driver reports live performance metrics with every stats update. When you have multiple arrays with the same backend name, these metrics become the scheduler’s load balancing signals:

Workload metrics:

  • total_volumes – Number of volumes currently on the array
  • total_snapshots – Number of snapshots currently on the array
  • total_hosts – Number of hosts connected to the array
  • total_pgroups – Number of protection groups configured

Performance metrics:

  • writes_per_sec – Current write operations per second
  • reads_per_sec – Current read operations per second
  • input_per_sec – Current input bandwidth (bytes/sec)
  • output_per_sec – Current output bandwidth (bytes/sec)

Latency metrics:

  • usec_per_read_op – Average microseconds per read operation
  • usec_per_write_op – Average microseconds per write operation
  • queue_usec_per_read_op – Average queue time per read (microseconds)
  • queue_usec_per_write_op – Average queue time per write (microseconds)
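To make the idea of "load balancing signals" concrete, here is one hypothetical way to fold IOPS and latency metrics into a single load score. The weighting and the "busy" ceilings are invented for illustration; they are not Cinder's actual goodness_function syntax:

```python
# Hypothetical load score built from driver-reported metrics.
# Higher score = more loaded. Each metric is normalized against a
# rough "busy" ceiling so the two terms combine on a comparable scale.
def load_score(stats):
    iops = stats["reads_per_sec"] + stats["writes_per_sec"]
    latency_us = stats["usec_per_read_op"] + stats["usec_per_write_op"]
    return 0.5 * (iops / 100_000) + 0.5 * (latency_us / 2_000)

array_a = {"reads_per_sec": 30_000, "writes_per_sec": 25_000,
           "usec_per_read_op": 300, "usec_per_write_op": 450}
array_b = {"reads_per_sec": 8_000, "writes_per_sec": 6_000,
           "usec_per_read_op": 250, "usec_per_write_op": 320}

# The quieter, lower-latency array wins the placement.
least_loaded = min([array_a, array_b], key=load_score)
assert least_loaded is array_b
```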

Using Metrics for Load Distribution

You can use these metrics in your volume type to guide load distribution:

# Create a performance tier that filters out heavily loaded arrays
openstack volume type create high-perf
openstack volume type set high-perf \
  --property volume_backend_name=pure_performance \
  --property capabilities:array_model="FlashArray//X" \
  --property "capabilities:total_volumes=<= 800" \
  --property "capabilities:writes_per_sec=<= 40000"

Now when users create volumes with this type:

  • All three arrays match the static capabilities
  • Arrays reporting more than 800 volumes are filtered out
  • Arrays handling more than 40,000 writes/sec are filtered out
  • Among the survivors, the weighers pick the best candidate automatically

Note that these are hard constraints, not preferences: arrays past the thresholds are removed from consideration, and the remaining candidates are ranked by the weighers (CapacityWeigher by default, favoring the most free space). If every array exceeds the thresholds, the request fails with "No valid backend" rather than falling back, so use hard metric constraints sparingly.
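The operator-in-the-value syntax ("<= 800") can feel odd at first. In spirit, the capabilities filter evaluates the reported stat against the operator and threshold embedded in the extra spec value, roughly like this simplified stand-in (the real CapabilitiesFilter supports more operators and nested scopes):

```python
# Simplified evaluation of an operator-valued extra spec, in the
# spirit of Cinder's CapabilitiesFilter. Not the real implementation.
def matches(spec_value, reported):
    op, _, threshold = spec_value.partition(" ")
    threshold = float(threshold)
    ops = {"<=": lambda r, t: r <= t,
           ">=": lambda r, t: r >= t,
           "<":  lambda r, t: r < t,
           ">":  lambda r, t: r > t}
    return ops[op](reported, threshold)

# An array reporting 612 volumes passes "<= 800"; one at 950 is
# filtered out entirely -- it is not merely "less preferred".
assert matches("<= 800", 612) is True
assert matches("<= 800", 950) is False
```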

Benefits of This Pattern

Automatic Load Balancing

No manual intervention needed. The scheduler continuously evaluates real-time metrics:

  • total_volumes prevents any single array from becoming a hotspot
  • writes_per_sec and reads_per_sec balance I/O workload
  • usec_per_read_op and usec_per_write_op avoid saturated arrays

Hot spots naturally get avoided as their metrics rise above thresholds.

Simplified Operations

Adding capacity is trivial:

  1. Create a new backend stanza with the same volume_backend_name
  2. Add it to enabled_backends and restart the cinder-volume service
  3. The scheduler includes it in the pool as soon as it reports stats

No volume type changes. No user communication needed.

Removing arrays for maintenance:

  1. Disable the backend
  2. Existing volumes stay accessible
  3. New volumes go to remaining arrays
  4. Re-enable when maintenance completes

Transparent to Users

Users request “high-performance” storage. The scheduler picks the optimal array. Users never know or care which physical array they’re on—they just get consistent performance from the logical pool.

This is the essence of infrastructure abstraction.

When to Use the Same Backend Name

Use identical volume_backend_name values when:

Arrays have the same capabilities and performance profile

  • Same model (all FlashArray//X or all FlashArray//C)
  • Same feature set (replication, thin provisioning, etc.)
  • Same performance characteristics

They’re in the same failure domain or availability zone

  • Load balancing within a zone is desired
  • No data sovereignty or residency constraints

You want automatic load distribution

  • No need for manual placement control
  • Trust the scheduler to optimize

Management and operational procedures are identical

  • Same monitoring, same backup policies, same support tier

When to Use Different Backend Names

Use different volume_backend_name values when:

Arrays have different capabilities

  • FlashArray//X vs FlashArray//C (different performance tiers)
  • Different maximum volume sizes
  • Different feature support

They’re in different failure domains

  • Need explicit availability zone control
  • Data residency or compliance requires isolation

You need explicit control over placement

  • Certain workloads must run on specific arrays
  • Testing or validation requires predictable placement

Compliance or data residency requires isolation

  • Data must stay in specific data centers
  • Regulatory requirements prevent mixing

Advanced Pattern: Combining with Availability Zones

You can combine this pattern with availability zones for zone-aware load balancing:

# Data Center 1
[flasharray_x_dc1_01]
volume_backend_name = pure_performance
backend_availability_zone = dc1
san_ip = 10.1.0.10
# ... rest of config

[flasharray_x_dc1_02]
volume_backend_name = pure_performance
backend_availability_zone = dc1
san_ip = 10.1.0.11
# ... rest of config

# Data Center 2
[flasharray_x_dc2_01]
volume_backend_name = pure_performance
backend_availability_zone = dc2
san_ip = 10.2.0.10
# ... rest of config

[flasharray_x_dc2_02]
volume_backend_name = pure_performance
backend_availability_zone = dc2
san_ip = 10.2.0.11
# ... rest of config

Now the scheduler:

  • Load-balances within each availability zone
  • Respects zone placement requests from users
  • Maintains availability zone isolation

Users can request specific zones:

openstack volume create --size 100 \
  --type high-perf \
  --availability-zone dc1 \
  my-volume

The scheduler picks the best array within dc1 automatically.
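Conceptually, zone-aware scheduling is just another filter layered before the weighers: restrict the candidates to the requested zone, then load-balance inside it. A rough sketch, with invented pool stats:

```python
# Sketch of zone-aware selection: filter by availability zone first
# (skipped when the user did not request one), then pick the best
# candidate within the survivors, CapacityWeigher-style.
pools = [
    {"name": "flasharray_x_dc1_01", "az": "dc1", "free_capacity_gb": 3800},
    {"name": "flasharray_x_dc1_02", "az": "dc1", "free_capacity_gb": 5200},
    {"name": "flasharray_x_dc2_01", "az": "dc2", "free_capacity_gb": 7100},
    {"name": "flasharray_x_dc2_02", "az": "dc2", "free_capacity_gb": 2900},
]

def pick(pools, requested_az=None):
    if requested_az:
        pools = [p for p in pools if p["az"] == requested_az]
    return max(pools, key=lambda p: p["free_capacity_gb"])

# A dc1 request only considers dc1 arrays; an unpinned request
# load-balances across all six (here, four) arrays.
assert pick(pools, "dc1")["name"] == "flasharray_x_dc1_02"
assert pick(pools)["name"] == "flasharray_x_dc2_01"
```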

Real-World Example: Growing from 2 to 6 Arrays

Let’s walk through a realistic growth scenario.

Starting State: Two Arrays

[flasharray_x_01]
volume_backend_name = pure_performance
# ... config

[flasharray_x_02]
volume_backend_name = pure_performance
# ... config

Volume type:

openstack volume type create high-perf
openstack volume type set high-perf \
  --property volume_backend_name=pure_performance \
  --property capabilities:array_model="FlashArray//X"

The scheduler load-balances between the two arrays.

Growth Phase 1: Adding Capacity

Business grows. You add two more arrays:

[flasharray_x_03]
volume_backend_name = pure_performance
# ... config

[flasharray_x_04]
volume_backend_name = pure_performance
# ... config

What changed:

  • Configuration: Added two stanzas
  • Volume types: Nothing
  • User experience: Nothing

The scheduler now distributes across four arrays.

Growth Phase 2: Multi-Site Deployment

You expand to a second data center:

# DC1 arrays (existing)
[flasharray_x_dc1_01]
volume_backend_name = pure_performance
backend_availability_zone = dc1
# ... config

[flasharray_x_dc1_02]
volume_backend_name = pure_performance
backend_availability_zone = dc1
# ... config

# DC2 arrays (new)
[flasharray_x_dc2_01]
volume_backend_name = pure_performance
backend_availability_zone = dc2
# ... config

[flasharray_x_dc2_02]
volume_backend_name = pure_performance
backend_availability_zone = dc2
# ... config

What changed:

  • Configuration: Added backend_availability_zone to all backends, added two DC2 stanzas
  • Volume types: Nothing (but users can now request specific zones)
  • User experience: Optional zone selection now available

Users who don’t specify a zone get automatic distribution across all six arrays. Users who need zone affinity can request it explicitly.

What Didn’t Change

Through all this growth:

  • Volume type definitions stayed stable
  • No application code changes
  • No user retraining
  • No manual load balancing

The scheduler adapted automatically.

Observability: Seeing Load Distribution

To see how the scheduler is distributing load, use:

# See all pools and their stats
openstack volume service list --service cinder-volume

# See detailed pool stats
cinder get-pools --detail

# For each pool matching "pure_performance"
cinder get-pools --detail | grep -A 30 "pure_performance"

You’ll see metrics for each array:

  • total_capacity_gb
  • free_capacity_gb
  • allocated_capacity_gb
  • total_volumes
  • Custom capabilities including driver-reported metrics

This gives you visibility into how evenly load is distributed.
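One simple way to turn those per-pool numbers into a health signal is to compute the spread of utilization across the arrays that share a backend name. A sketch, with invented stats in the shape that the pool listing reports:

```python
# Sketch: quantify how evenly a pool of same-named backends is loaded,
# from per-array capacity stats. Numbers are invented for illustration.
from statistics import mean, pstdev

pools = {
    "flasharray_x_01": {"total_capacity_gb": 10000, "free_capacity_gb": 4100},
    "flasharray_x_02": {"total_capacity_gb": 10000, "free_capacity_gb": 4600},
    "flasharray_x_03": {"total_capacity_gb": 10000, "free_capacity_gb": 3900},
}

# Percent used per array; a high standard deviation means the
# distribution (or your data reduction ratios) is uneven.
used_pct = [100 * (p["total_capacity_gb"] - p["free_capacity_gb"])
            / p["total_capacity_gb"] for p in pools.values()]
print(f"mean used: {mean(used_pct):.1f}%  spread: {pstdev(used_pct):.1f}%")
```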

Common Pitfalls and How to Avoid Them

Pitfall 1: Hidden Capacity Differences

Problem: Arrays have different usable capacity due to different data reduction ratios.

Solution: Monitor driver-reported dedupe_ratio and compression_ratio. If they diverge significantly, investigate.

Pitfall 2: Network Topology Matters

Problem: All arrays look identical in config, but some have better network paths to compute.

Solution: Either:

  • Use availability zones to isolate by network topology
  • Accept slightly uneven distribution as network routing does its job
  • Use affinity/anti-affinity hints for latency-critical workloads

Pitfall 3: Forgetting to Update Volume Types

Problem: You add arrays, but your volume types still carry hard metric filters such as capabilities:total_volumes="<= 800" that were tuned for the old fleet.

Solution: If using hard metric constraints, review them when scaling. Consider removing hard constraints and relying on weighers instead.
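A small review helper can make this audit routine. The sketch below scans a volume type's extra specs (the dict is an invented example) and flags any operator-valued entries, since those act as hard filters rather than preferences:

```python
# Sketch of a scaling-review helper: flag operator-valued (hard)
# constraints in a volume type's extra specs so they get revisited
# whenever arrays are added or removed.
extra_specs = {
    "volume_backend_name": "pure_performance",
    "capabilities:array_model": "FlashArray//X",
    "capabilities:total_volumes": "<= 800",
}

def hard_constraints(specs):
    # Values beginning with a comparison operator behave as filters,
    # not preferences -- these are the ones to re-check when scaling.
    return {k: v for k, v in specs.items()
            if isinstance(v, str) and v.startswith(("<", ">"))}

print(hard_constraints(extra_specs))  # {'capabilities:total_volumes': '<= 800'}
```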

Why This Pattern Scales

This approach scales operationally because:

Horizontal scaling is transparent

  • Add arrays without touching volume types
  • Remove arrays without user impact (if volumes can migrate)
  • Capacity grows linearly

The scheduler distributes load automatically

  • No manual placement decisions
  • No “which array should this go on?” questions
  • Load balancing adjusts continuously as conditions change

Operations stay simple

  • One volume type serves many arrays
  • Configuration is declarative and repeatable
  • Troubleshooting focuses on capabilities, not individual arrays

Users stay abstracted from infrastructure

  • They select workload personas (“high-perf”, “large-capacity”)
  • They never know which physical array serves them
  • Infrastructure can change beneath them without disruption

Key Takeaways

The same volume_backend_name pattern enables:

  • Automatic load balancing across identical arrays
  • Transparent horizontal scaling
  • Simplified volume type management
  • Clean separation between user intent and infrastructure topology

Driver-reported metrics make it smart:

  • Real-time load distribution based on actual array state
  • Avoidance of hot spots and saturated arrays
  • Automatic adaptation as conditions change

It works because the scheduler:

  • Treats backend name as a logical pool identifier
  • Evaluates all matching backends independently
  • Uses metrics and weighers to pick the best candidate
  • Makes deterministic, explainable, repeatable decisions

From Matchmaker to Load Balancer

In Part 1, I showed the Cinder scheduler as a quiet matchmaker, connecting workload intent with backend capabilities. In this post, I’ve shown it become something more: an intelligent load balancer that distributes work across a pool of equivalent resources.

The pattern is simple. The configuration is declarative. The results are automatic.

No manual intervention. No placement spreadsheets. No “array affinity” tickets.

Just infrastructure that scales horizontally while users keep requesting what they need, blissfully unaware of the expanding fleet of arrays serving them.

That’s the scheduler doing exactly what it was designed to do.

If you let it.

Volume Migration: The Final Frontier of Scaling

As environments grow, horizontal scaling introduces a new operational reality: eventually some volumes need to move, even when multiple arrays present themselves as a single logical tier. In Part 3, I shift from scheduling theory to hands-on operations and walk through safe, deterministic ways to migrate volumes between backends that share the same volume_backend_name, including when to use retype versus direct migration and what the Cinder feature matrix really means for storage-assisted moves.
