Migrate Cinder Volumes with Snapshots Between FlashArrays

Cinder refuses to migrate a volume that has snapshots — full stop. Here is a practical Ansible-based workaround for Everpure FlashArray environments that preserves your snapshot history as independently recoverable volumes, keeps the workload attached throughout, and leaves a manifest you can use months later to restore to any point in time.


The problem

You have a Cinder volume on an Everpure FlashArray backend. It has snapshots: nightly backups, pre-patching checkpoints, whatever they are. You need to move the volume to a different FlashArray backend in the same OpenStack cluster.

You try cinder migrate. Cinder refuses: the volume has snapshots. You try openstack volume set --type <target> --migration-policy on-demand. Same result. Cinder will not touch a volume that has dependent snapshots, regardless of how you ask.

The naive fix — delete the snapshots, migrate, recreate them — destroys your point-in-time history. That is unacceptable for most production workloads.

Why FlashArray changes the calculus

Snapshots are already independent copies

When you create a volume from a Cinder snapshot on an Everpure FlashArray backend (openstack volume create --snapshot <id>), Everpure performs an array-level volume copy. The result is fully independent from the parent: there is no thin-clone dependency chain and no COW metadata to worry about. That clone can be migrated, deleted, or snapshotted again without touching the original.

On LVM or Ceph this approach would carry risk because of chain dependencies. On FlashArray, it is clean.

Live migration via retype

The Everpure FlashArray Cinder driver supports live volume migration through retype --migration-policy on-demand. Cinder instructs the driver to perform an array-level copy to the target FlashArray, then remaps the host’s iSCSI/FC connections to the new volume. The workload stays attached throughout with a brief I/O redirect at cutover.

This means no maintenance window is required for the source volume migration — a significant advantage over solutions that require detaching.

RETYPE, NOT MIGRATE
The correct primitive for in-use, cross-backend moves within the same cluster is os-retype with migration_policy: on-demand — not cinder migrate, which requires the volume to be detached. The playbook calls this directly via the Cinder v3 REST API rather than shelling out to the OpenStack CLI — see the implementation section below.
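
As a rough illustration of what that call looks like on the wire, the Python sketch below builds the URL and JSON body for an os-retype action. The helper name build_retype_request and the example endpoint are assumptions made for illustration and are not taken from the playbook; the body shape follows the Cinder v3 volume actions API.

```python
import json


def build_retype_request(cinder_endpoint: str, volume_id: str, new_type: str):
    """Return (url, json_body) for a Cinder v3 os-retype action.

    cinder_endpoint is assumed to already include the project ID segment.
    """
    url = f"{cinder_endpoint.rstrip('/')}/volumes/{volume_id}/action"
    body = {
        "os-retype": {
            "new_type": new_type,
            # on-demand lets Cinder migrate the data if the new type
            # lives on a different backend
            "migration_policy": "on-demand",
        }
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder endpoint and IDs, for illustration only
    url, body = build_retype_request(
        "https://cinder.example.com/v3/PROJECT_ID", "VOLUME_ID", "flasharray-b"
    )
    print("POST", url)
    print(body)
```

The request is asynchronous: Cinder accepts it and the caller then polls GET /volumes/{id} until the volume reports the new type.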

The approach

The strategy has five phases, executed serially by an Ansible playbook:

1 Preflight validation
Assert the source volume exists, all snapshots are in available state, the target volume type exists, and estimate peak quota consumption. Supports a dry-run mode that exits here.

2 Clone each snapshot into an independent volume
For each snapshot, oldest first: create a new Cinder volume from the snapshot. The volume name encodes the source volume name and original snapshot name for easy identification.

3 Retype each clone to the target backend
Migrate each clone volume to the target FlashArray via retype on-demand. Tag each clone with Cinder metadata preserving its full lineage: parent volume, original snapshot ID, creation timestamp, and a recovery hint.

4 Delete original snapshots, retype source volume
Once all clones are on the target backend, delete the original snapshots from the source volume. This unblocks the retype. The playbook polls until zero snapshots remain before proceeding. Then retype the source volume — live, no detach required.

5 Write the recovery manifest
A JSON manifest is written to disk mapping every original snapshot UUID to its corresponding clone volume on the target backend. This is your long-term recovery map.
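
The "poll until zero snapshots remain" gate in phase 4 can be sketched in Python as a small loop with a deadline. This is a minimal sketch, not the playbook's implementation: the function name, the default timeout and interval, and the injected list_snapshots callable (standing in for GET /snapshots?volume_id=) are all assumptions.

```python
import time
from typing import Callable, Sequence


def wait_until_no_snapshots(
    list_snapshots: Callable[[], Sequence[dict]],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until list_snapshots() returns nothing; return the number of polls.

    Raises TimeoutError if snapshots are still present at the deadline.
    """
    deadline = time.monotonic() + timeout_s
    polls = 0
    while True:
        polls += 1
        remaining = list_snapshots()
        if not remaining:
            return polls
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{len(remaining)} snapshot(s) still present after {timeout_s}s"
            )
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated API responses: two snapshots, then one, then none
    responses = iter([[{"id": "a"}, {"id": "b"}], [{"id": "b"}], []])
    polls = wait_until_no_snapshots(lambda: next(responses), sleep=lambda s: None)
    print(f"snapshots gone after {polls} polls")
```

Passing the fetch and sleep functions in keeps the gate testable without a live Cinder endpoint.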

Volume naming

Every snapshot clone needs a name that makes its origin immediately obvious without consulting the manifest. The playbook uses a configurable template:

TEMPLATE    mig-{{ source_volume_name }}-snap-{{ snap_name }}
EXAMPLE     mig-my-db-vol-snap-nightly-2024-01-15
FALLBACK    mig-my-db-vol-snap-f9e8d7c6 (unnamed snapshot → first 8 chars of UUID)

Available template variables include the full or short (8-char) versions of both the source volume UUID and the snapshot UUID, so you can construct names suited to your environment’s conventions. The template is set in group_vars/all.yml or overridden at runtime with -e.
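
To make the default template and its fallback concrete, here is a small Python sketch of the same naming rule. The function name and signature are assumptions for illustration; the playbook itself renders the name with its Jinja2 template.

```python
from typing import Optional


def clone_name(source_volume_name: str, snap_name: Optional[str], snap_id: str) -> str:
    """Mirror the default template mig-<source volume>-snap-<snapshot name>.

    Unnamed snapshots fall back to the first 8 characters of the snapshot UUID.
    """
    label = snap_name if snap_name else snap_id[:8]
    return f"mig-{source_volume_name}-snap-{label}"


if __name__ == "__main__":
    # The UUID below is a made-up example value
    snap_id = "f9e8d7c6-0000-4000-8000-000000000000"
    print(clone_name("my-db-vol", "nightly-2024-01-15", snap_id))
    print(clone_name("my-db-vol", None, snap_id))
```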

Playbook structure

├── migrate_volume.yml     ← entry point 
├── requirements.yml       ← openstack.cloud collection pin
├── clouds.yaml.example    ← auth template; copy to clouds.yaml and configure
├── clouds.yaml            ← your credentials (created from example; do not commit)
├── README.md              ← usage and recovery guide
├── group_vars
│   └── all.yml            ← tuneable defaults incl. os_cloud profile name
├── inventory
│   └── hosts              ← localhost only (ansible_connection=local)
└── roles
    ├── common
    │   └── tasks
    │       ├── obtain_auth.yml             ← reads clouds.yaml, POSTs to Keystone API
    │       ├── retype_volume.yml           ← uri POST os-retype + uri GET poll
    │       ├── get_volume_by_id.yml        ← uri GET /volumes/{id} (UUID lookup)
    │       ├── get_snapshots_by_volume.yml ← uri GET /snapshots?volume_id=
    │       ├── set_volume_metadata.yml     ← uri POST /volumes/{id}/metadata
    │       └── delete_snapshot.yml         ← uri DELETE /snapshots/{id}
    ├── migrate_snapshots
    │   └── tasks
    │       ├── main.yml                  ← serial snapshot loop + deletion
    │       └── process_one_snapshot.yml  ← clone, retype, tag, manifest append
    ├── migrate_volume
    │   └── tasks
    │       └── main.yml   ← source volume retype
    ├── post_migration
    │   └── tasks
    │       └── main.yml   ← write manifest, print recovery guide
    └── preflight
        └── tasks
            └── main.yml   ← validate inputs, obtain auth, dry-run gate

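The snapshot processing is extracted into a separate process_one_snapshot.yml task file included per loop iteration, keeping each step readable and the serial execution order explicit. The common role holds the six shared task files that wrap authentication and the REST calls, including the retype_volume.yml used by both retype callers (the snapshot clone loop and the source volume retype). They are explained in detail below.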

Idempotency

The playbook is safe to re-run after a partial failure. Before creating a clone it checks whether a volume with that name already exists. Before retyping it checks whether the volume is already on the target type. Snapshot deletion is naturally idempotent. The manifest is overwritten on each successful completion.
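
The check-before-act logic described above can be sketched in Python as a pure planning function. It is an illustration under assumptions: the function name and the name-to-type mapping are hypothetical, and the playbook performs the equivalent checks with Ansible conditionals rather than Python.

```python
from typing import Dict, List


def plan_clone_steps(
    clone_name: str, target_type: str, existing_volumes: Dict[str, str]
) -> List[str]:
    """Decide which steps a re-run still needs for one snapshot clone.

    existing_volumes maps volume display name -> current volume type name.
    """
    if clone_name not in existing_volumes:
        # First run for this snapshot: the clone is created next to the
        # source snapshot, so it still has to be retyped afterwards.
        return ["create", "retype"]
    if existing_volumes[clone_name] != target_type:
        # The clone survived an earlier partial run but was never moved.
        return ["retype"]
    return []  # already on the target type, nothing to do


if __name__ == "__main__":
    name = "mig-my-db-vol-snap-nightly-2024-01-15"
    print(plan_clone_steps(name, "flasharray-b", {}))
    print(plan_clone_steps(name, "flasharray-b", {name: "flasharray-a"}))
    print(plan_clone_steps(name, "flasharray-b", {name: "flasharray-b"}))
```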

Implementation — collection modules and direct REST

The playbook has no dependency on the OpenStack CLI (python-openstackclient). It uses the openstack.cloud collection where modules genuinely fit, and calls the Cinder v3 REST API directly via ansible.builtin.uri everywhere else — which in practice turns out to be most operations. The reason is a fundamental limitation of the collection: most modules filter resources by display name, not by UUID, making them unsuitable for the majority of this playbook’s operations.

What the collection covers

The collection is used only where it genuinely fits: openstack.cloud.volume to create a clone from a snapshot (by display name), and openstack.cloud.volume_info to poll the clone’s status until it becomes available (also by display name, since the playbook controls that name). Everything else (UUID-based lookups, snapshot queries, metadata writes, deletes, and type validation) uses ansible.builtin.uri against the Cinder v3 REST API, because those operations need a volume or snapshot ID rather than a display name.

Authentication — reading clouds.yaml directly

Rather than relying on openstack.cloud.auth — whose return structure varies across collection versions — the playbook reads clouds.yaml directly and calls the Keystone tokens API using ansible.builtin.uri:

- name: Read clouds.yaml
  ansible.builtin.set_fact:
    _clouds_config: "{{ lookup('file', playbook_dir + '/clouds.yaml') | from_yaml }}"

- name: POST to Keystone v3/auth/tokens
  ansible.builtin.uri:
    url: "{{ _keystone_base }}/v3/auth/tokens"
    method: POST
    body_format: json
    body: "{{ _keystone_auth_body }}"
    status_code: 201
  no_log: true  # task-level keyword: keeps credentials and the token out of logs
  register: _keystone_response

- name: Extract token and Cinder endpoint
  ansible.builtin.set_fact:
    os_auth_token: "{{ _keystone_response.x_subject_token }}"
    cinder_endpoint: "{{ ... }}/{{ project_id }}"

A few details that matter in practice: the auth_url in clouds.yaml may or may not include a trailing /v3 — the playbook normalises it with regex_replace before appending /v3/auth/tokens. The service catalog for this environment registers Cinder under the type block-storage rather than the older volumev3. The endpoint extracted from the catalog also omits the project ID segment, so the playbook injects it from the token’s project claim. Both password and application credential auth types from clouds.yaml are supported.
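
The two normalisation steps described above are easy to get subtly wrong, so here is a hedged Python sketch of the same logic. The function names and example URLs are assumptions; the catalog layout (a list of services, each with a type and a list of endpoints carrying interface and url) follows the Keystone v3 token response.

```python
import re
from typing import Dict, List


def keystone_tokens_url(auth_url: str) -> str:
    """Strip an optional trailing /v3 from auth_url, then append /v3/auth/tokens."""
    base = re.sub(r"/v3$", "", auth_url.rstrip("/"))
    return f"{base}/v3/auth/tokens"


def cinder_endpoint(catalog: List[Dict], project_id: str, interface: str = "public") -> str:
    """Pick the Cinder endpoint, preferring block-storage over the older volumev3."""
    for service_type in ("block-storage", "volumev3"):
        for service in catalog:
            if service.get("type") != service_type:
                continue
            for endpoint in service.get("endpoints", []):
                if endpoint.get("interface") == interface:
                    url = endpoint["url"].rstrip("/")
                    # Inject the project ID only when the catalog URL omits it
                    if url.endswith(f"/{project_id}"):
                        return url
                    return f"{url}/{project_id}"
    raise LookupError("no Cinder endpoint found in the service catalog")


if __name__ == "__main__":
    print(keystone_tokens_url("https://keystone.example.com:5000/v3/"))
    catalog = [{
        "type": "block-storage",
        "endpoints": [{"interface": "public", "url": "https://cinder.example.com/volume/v3"}],
    }]
    print(cinder_endpoint(catalog, "PROJECT_ID"))
```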

The REST API surface used

Six shared task files in the common role encapsulate every REST operation, each with a clear input/output contract:

obtain_auth.yml              POST   /v3/auth/tokens          → os_auth_token, cinder_endpoint
get_volume_by_id.yml         GET    /volumes/{id}            → _fetched_volume
get_snapshots_by_volume.yml  GET    /snapshots?volume_id=    → _fetched_snapshots
retype_volume.yml            POST   /volumes/{id}/action     → _retyped_volume
                             GET    /volumes/{id}  (poll)
set_volume_metadata.yml      POST   /volumes/{id}/metadata
delete_snapshot.yml          DELETE /snapshots/{id}

Volume type validation in preflight uses GET /types?name= and pins to an exact name match with selectattr, since the Cinder name query parameter may perform a substring match on some versions.
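
The exact-match pin is the Python equivalent of the selectattr filter mentioned above. The sketch below assumes the response has already been decoded into a list of dicts with a name key; the function name and sample type names are illustrative only.

```python
from typing import Dict, List


def exact_type_match(volume_types: List[Dict], wanted: str) -> Dict:
    """Return the single volume type whose name equals wanted exactly.

    Guards against a name filter that matches substrings on the server side.
    """
    matches = [t for t in volume_types if t.get("name") == wanted]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one volume type named {wanted!r}, found {len(matches)}"
        )
    return matches[0]


if __name__ == "__main__":
    # A substring-matching server could return both of these for name=flasharray-b
    returned = [
        {"id": "1", "name": "flasharray-b"},
        {"id": "2", "name": "flasharray-b-gold"},
    ]
    print(exact_type_match(returned, "flasharray-b")["id"])
```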

NO PYTHON-OPENSTACKCLIENT REQUIRED
The only runtime dependencies are Ansible, the openstack.cloud collection (≥ 2.1.0), and openstacksdk. The OpenStack CLI is not needed on the control node. If it is already installed, it will not be used.

The migration manifest

After a successful run a JSON file is written alongside the playbook:

{
  "source_volume_id": "a1b2c3d4-...",
  "source_volume_name": "my-db-vol",
  "target_volume_type": "flasharray-b",
  "source_volume_migrated": true,
  "snapshot_clones": [
    {
      "sequence": 1,
      "original_snapshot_name": "nightly-2024-01-15",
      "original_snapshot_created_at": "2024-01-15T02:00:00Z",
      "clone_volume_id": "11223344-...",
      "clone_volume_name": "mig-my-db-vol-snap-nightly-2024-01-15",
      "clone_size_gb": 200
    }
  ]
}

Each clone also carries its lineage directly as Cinder volume metadata, including a recovery_hint field with the exact command to create a restore volume — so the information survives even if you lose the manifest file.
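
As a sketch of what such a metadata payload might look like, the Python below builds a body for POST /volumes/{id}/metadata. Only the recovery_hint key is named in this article; the other key names, the function name, and the hint's exact wording are assumptions for illustration. The 255-character check reflects Cinder's limit on metadata values.

```python
from typing import Dict


def lineage_metadata_body(
    parent_volume_id: str,
    snapshot_id: str,
    snapshot_created_at: str,
    clone_volume_id: str,
    target_type: str,
) -> Dict[str, Dict[str, str]]:
    """Build the JSON body that tags a clone with its lineage."""
    hint = (
        f"openstack volume create --source {clone_volume_id} "
        f"--type {target_type} --name <restore-name>"
    )
    metadata = {
        "migrated_from_volume": parent_volume_id,             # hypothetical key
        "original_snapshot_id": snapshot_id,                  # hypothetical key
        "original_snapshot_created_at": snapshot_created_at,  # hypothetical key
        "recovery_hint": hint,                                # key named in the article
    }
    too_long = [key for key, value in metadata.items() if len(value) > 255]
    if too_long:
        raise ValueError(f"metadata values exceed 255 characters: {too_long}")
    return {"metadata": metadata}


if __name__ == "__main__":
    # Placeholder IDs, for illustration only
    body = lineage_metadata_body(
        "SOURCE_VOLUME_ID", "SNAPSHOT_ID", "2024-01-15T02:00:00Z",
        "CLONE_VOLUME_ID", "flasharray-b",
    )
    print(body["metadata"]["recovery_hint"])
```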

Recovery procedure

To restore the source volume to the state captured in a specific snapshot, find the corresponding clone volume in the manifest and:

Create a restore volume from the clone:
openstack volume create \
  --source mig-my-db-vol-snap-nightly-2024-01-15 \
  --type flasharray-b \
  --name my-db-vol-restored-jan15

Live-swap on a running instance:
openstack server remove volume <server_id> <current_volume_id>
openstack server add volume    <server_id> <restored_volume_id>

Clone volumes are independent on the FlashArray — they can be kept indefinitely as long-term restore points or deleted once the retention window has passed. The source volume is untouched throughout.
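
Finding the right clone for a given point in time can be automated against the manifest. The Python sketch below uses only the fields shown in the sample manifest above; the function name and the second sample entry are assumptions for illustration.

```python
from datetime import datetime
from typing import Dict, Optional


def _parse(timestamp: str) -> datetime:
    # Older Python versions do not accept a trailing "Z" in fromisoformat
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def latest_clone_at_or_before(manifest: Dict, point_in_time: str) -> Optional[Dict]:
    """Return the newest clone whose snapshot was taken at or before point_in_time.

    Both timestamps must carry a timezone (for example a trailing Z).
    """
    cutoff = _parse(point_in_time)
    candidates = [
        clone for clone in manifest["snapshot_clones"]
        if _parse(clone["original_snapshot_created_at"]) <= cutoff
    ]
    return max(
        candidates,
        key=lambda clone: _parse(clone["original_snapshot_created_at"]),
        default=None,
    )


if __name__ == "__main__":
    # In practice this dict would come from json.load() on the manifest file
    manifest = {"snapshot_clones": [
        {"original_snapshot_created_at": "2024-01-15T02:00:00Z",
         "clone_volume_name": "mig-my-db-vol-snap-nightly-2024-01-15"},
        {"original_snapshot_created_at": "2024-01-16T02:00:00Z",
         "clone_volume_name": "mig-my-db-vol-snap-nightly-2024-01-16"},
    ]}
    clone = latest_clone_at_or_before(manifest, "2024-01-15T12:00:00Z")
    print(clone["clone_volume_name"] if clone else "no restore point that early")
```

The returned clone_volume_name is what goes into the --source argument of the restore command.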

Practical considerations

Consideration | Detail | Status
Volume stays attached | Pure driver handles live retype. No detach, no maintenance window. | handled
Peak quota consumption | At peak: source vol (source array) + all clones + source vol copy (target array). Plan for ~2× source size + sum of snapshot sizes. | plan for it
Snapshot UUID references | Original snapshot UUIDs are gone after deletion. External systems that reference them by UUID (backup catalogues, compliance tools) need updating. The manifest provides the mapping. | plan for it
Chained snapshot ordering | Playbook processes snapshots oldest-first. On FlashArray this is safe since clones are independent, but ordering ensures the sequence in the manifest is chronologically meaningful. | handled
Retype timeout for large volumes | Default timeout is 30 minutes per retype. For volumes over ~2 TB on a loaded array, increase retype_timeout in group_vars/all.yml. | configurable
clouds.yaml project scope | The credentials in clouds.yaml must be scoped to the same project that owns the source volume. A token scoped to a different project will return 404 on every volume lookup even if the credentials are valid. | verify first
Cinder service catalog type | Modern OpenStack deployments register Cinder as block-storage in the service catalog, not volumev3. The playbook checks for both, but if endpoint resolution fails, inspect your catalog with openstack catalog list. | handled
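
The peak quota rule of thumb from the list above works out as simple arithmetic. The Python sketch below encodes it; the function names are assumptions, and sizes are whole gigabytes as Cinder reports them.

```python
from typing import Iterable, List


def peak_quota_gb(source_size_gb: int, clone_sizes_gb: Iterable[int]) -> int:
    """Peak gigabytes in use: source + source copy on the target + every clone."""
    return 2 * source_size_gb + sum(clone_sizes_gb)


def additional_quota_gb(source_size_gb: int, clone_sizes_gb: Iterable[int]) -> int:
    """Headroom needed beyond the source volume that already counts against quota."""
    return source_size_gb + sum(clone_sizes_gb)


if __name__ == "__main__":
    # A 200 GB volume with two snapshots, each cloned at 200 GB
    clones: List[int] = [200, 200]
    print(peak_quota_gb(200, clones))        # 2 * 200 + 400 = 800
    print(additional_quota_gb(200, clones))  # 200 + 400 = 600
```

Compare the second figure against the free gigabytes reported by openstack quota show before starting.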

Running the playbook

1. Copy and configure clouds.yaml
cp clouds.yaml.example clouds.yaml
# edit clouds.yaml — set auth_url, project_name, username, password
# the profile name under 'clouds:' must match os_cloud in group_vars/all.yml
2. Install dependencies (no OpenStack CLI required)
ansible-galaxy collection install -r requirements.yml
pip install openstacksdk
# alternatively, install your distribution's python3-openstacksdk package instead of using pip
3. Dry run — validate without making changes
ansible-playbook migrate_volume.yml \
  -i inventory/hosts \
  -e "source_volume_id=<uuid>" \
  -e "target_volume_type=flasharray-b" \
  -e "dry_run=true"
4. Execute migration
ansible-playbook migrate_volume.yml \
  -i inventory/hosts \
  -e "source_volume_id=<uuid>" \
  -e "target_volume_type=flasharray-b"
5. Custom clone naming (optional)
ansible-playbook migrate_volume.yml \
  -i inventory/hosts \
  -e "source_volume_id=<uuid>" \
  -e "target_volume_type=flasharray-b" \
  -e "clone_name_template=restore-{{ source_volume_name }}-{{ snap_id_short }}"

The full playbook is available on GitHub — clone it, drop in your clouds.yaml, and you are ready to run.

See it in action

The following walkthrough runs the playbook end-to-end against a live OpenStack environment: preflight validation, snapshot cloning, source volume retype, and the resulting recovery manifest.

Full walkthrough: migrating a 1 GB volume with two snapshots from puredriver-1 to puredriver-2 on a DevStack environment.

Limitations and what this is not

This approach does not restore the original snapshot UUIDs or the parent-snapshot relationship visible in the Cinder API. After the migration openstack volume snapshot list --volume <id> returns empty — there are no Cinder snapshot objects on the migrated volume. The history is preserved as data in independent volumes, not as Cinder snapshot metadata.

If your tooling depends on snapshot UUIDs being stable (backup agents, compliance systems, snapshot-based replication), you will need to update those references using the manifest. This is a trade-off inherent in the approach; Cinder currently provides no mechanism to re-attach an existing volume as a snapshot of another volume.

For environments where true snapshot-chain preservation is a hard requirement, the correct path is a storage-level migration coordinated directly with Everpure’s array replication features, outside the OpenStack control plane.

Portability to other Cinder backends

The playbook’s orchestration logic is pure Cinder API — nothing calls any Everpure-specific endpoint. In principle it should work against any backend that supports volume retype, but three backend-specific behaviours determine whether it will work safely in practice.

Retype support. os-retype with migration_policy: on-demand requires the backend driver to support retype with migration. Most drivers do (LVM, Ceph RBD, NetApp, Dell, HPE), but support is not universal. If the driver does not support it, Cinder will return a 400 and the playbook will fail cleanly at the assertion. Check your driver’s documentation before assuming support.

In-use retype. The playbook assumes volumes can be retyped while attached. Everpure FlashArray supports this. LVM does not — it requires the volume to be in available status. Ceph RBD and most enterprise SAN drivers support it, but behaviour varies by configuration. If your backend requires detach, you will need to quiesce the workload before running the playbook and accept a brief outage window.

Snapshot clone independence. This is the most critical portability risk. On Everpure FlashArray, create volume from snapshot produces a fully independent copy at the array level: the clone has no ongoing dependency on the source snapshot, so deleting the snapshot afterwards is safe. On LVM and Ceph RBD the clone retains a parent dependency: LVM thin clones share the snapshot’s copy-on-write chain, and Ceph RBD clones reference the parent snapshot until explicitly flattened. Deleting the source snapshot before flattening the clone on these backends is unsafe and can leave the clone unusable. The playbook does not currently perform a flatten step, and the Cinder volume actions API does not offer one (os-extend, for example, only resizes a volume). The flatten therefore has to happen on the backend before the snapshot deletion phase: for Ceph, run rbd flatten on the clone image, or set rbd_flatten_volume_from_snapshot = true in the RBD backend section of cinder.conf so that volumes created from snapshots are flattened at creation time.

In short: the playbook is safe to use as-is against Everpure FlashArray and other backends where snapshot clones are immediately independent. For LVM or Ceph, treat it as a starting point that requires a flatten step before it can be run safely.
