PureStorage-OpenConnect/lvm-xcopy

lvm-xcopy

Clone an LVM logical volume by offloading the data movement to the storage array via SCSI EXTENDED COPY (XCOPY / LID1) or NVMe Copy (opcode 0x19), so the bytes never traverse the host. Built and tested against Pure FlashArray LUNs presented to Linux via device-mapper-multipath (SCSI multipath) and nvme-tcp namespaces.

How it works

  1. Inspect the source LV with lvs --segments and resolve each segment to a (PV, PE start, PE count) tuple.
  2. Pick a destination extent range in the target VG (defaults to the tail of the chosen PV) and lvcreate the destination LV.
  3. For each source segment, translate the PE range into an LBA range on the underlying device and issue the appropriate offload CDB:
    • EXTENDED COPY (0x83) via the Linux SG_IO ioctl when the PV is a SCSI multipath device (/dev/mapper/<wwid>).
    • NVMe Copy (0x19) via NVME_IOCTL_IO_CMD when the PV is an NVMe namespace (/dev/nvme*n*). Same-namespace copies use Format 0h; cross-namespace copies (different NSIDs in the same NVM subsystem) use Format 2h with the source NSID carried in the descriptor.
  4. On failure the destination LV is removed (unless --keep-on-failure is passed).
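The PE-to-LBA translation in step 3 is plain arithmetic once the PV's data-area offset and the VG extent size are known. The sketch below is illustrative only and is not taken from the project: `pe_range_to_lba` and its parameters are hypothetical names, and it assumes that `pe_start_bytes` (the byte offset of the first physical extent, as reported by `pvs -o pe_start`) and the extent size are both multiples of the logical block size.

```python
def pe_range_to_lba(pe_index, pe_count, extent_bytes, pe_start_bytes, block_size=512):
    """Map a run of physical extents to (first LBA, block count) on the PV device.

    Hypothetical helper: every name here is an assumption made for illustration.
    """
    if extent_bytes % block_size or pe_start_bytes % block_size:
        raise ValueError("extent size and pe_start must be block-aligned")
    blocks_per_extent = extent_bytes // block_size
    # The first extent begins pe_start_bytes into the PV, after the LVM metadata area.
    first_lba = pe_start_bytes // block_size + pe_index * blocks_per_extent
    return first_lba, pe_count * blocks_per_extent


# 256 extents of 4 MiB starting at PE 100, with the data area 1 MiB into the PV.
first_lba, blocks = pe_range_to_lba(100, 256, 4 * 1024**2, 1024**2)
print(first_lba, blocks)
```

With these inputs the range is 1 GiB long (2097152 blocks of 512 B) and starts at LBA 821248.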

Three back-end drivers are available:

| Driver | Transport | Submission path | Segments / CDB | Notes |
| --- | --- | --- | --- | --- |
| sgio | SCSI | native ctypes + SG_IO ioctl, single long-lived process | 2 (Pure enforces this despite advertising 32) | fastest SCSI path; recommended |
| nvme | NVMe | native ctypes + NVME_IOCTL_IO_CMD, single long-lived process | 1 source range / CDB on Pure (MSRC=0); F0h same-ns and F2h cross-ns | required for NVMe namespaces |
| ddpt | SCSI | spawns ddpt --xcopy once per 16-MiB chunk | 1 | fallback, ~0.5 ms/spawn + 15 ms/spawn overhead adds up on large clones |
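
The per-CDB limits in the table mean that a clone with many segments has to be split into several commands. As a rough sketch of that batching (not the project's actual code; `batch_segments` and the tuple layout are hypothetical), a driver could group an ordered segment list like this:

```python
MAX_SEGS_PER_CDB = 2  # SCSI cap from the table above; the NVMe path would use 1


def batch_segments(segments, max_per_cdb=MAX_SEGS_PER_CDB):
    """Yield consecutive groups of at most max_per_cdb segments, preserving order."""
    if max_per_cdb < 1:
        raise ValueError("max_per_cdb must be at least 1")
    for i in range(0, len(segments), max_per_cdb):
        yield segments[i:i + max_per_cdb]


# Hypothetical (src_lba, dst_lba, block_count) tuples for a three-segment copy.
segments = [(2048, 900000, 32768), (34816, 932768, 32768), (67584, 965536, 16384)]
for n, group in enumerate(batch_segments(segments), start=1):
    print(f"CDB {n}: {len(group)} segment(s)")
```

Three segments therefore need two CDBs on the SCSI path and three commands on the NVMe path.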

The default --driver=auto classifies each PV by device path and dispatches sgio for SCSI (/dev/mapper, /dev/sd*) or nvme for NVMe namespaces.
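
A minimal sketch of path-based classification, assuming only the device-name patterns mentioned above; `pick_driver` is a hypothetical name, and the real tool may inspect sysfs or handle more cases than this:

```python
import re

_NVME_NAMESPACE = re.compile(r"^/dev/nvme\d+n\d+$")   # e.g. /dev/nvme0n1
_SCSI_DISK = re.compile(r"^/dev/sd[a-z]+\d*$")        # e.g. /dev/sdb or /dev/sdb1


def pick_driver(pv_path):
    """Return 'nvme' or 'sgio' for a PV device path (illustrative only)."""
    if _NVME_NAMESPACE.match(pv_path):
        return "nvme"
    if pv_path.startswith("/dev/mapper/") or _SCSI_DISK.match(pv_path):
        return "sgio"
    raise ValueError(f"cannot classify PV device: {pv_path}")


for path in ("/dev/mapper/mpatha", "/dev/sdb", "/dev/nvme1n2"):
    print(path, "->", pick_driver(path))
```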

Cross-namespace NVMe Copy (Format 2h)

NVMe 2.1 gates each Copy descriptor format behind the Copy Descriptor Formats Enable (CDFE) field of the controller's Host Behavior Support feature (FID 0x16). The Linux NVMe driver does not program CDFE on attach, so a fresh boot leaves F2h disabled and the controller answers F2h CDBs with status 0x4002 (Invalid Field in Command). When lvm-xcopy detects a cross-namespace copy it issues the required Set Features automatically against the controller char device (/dev/nvmeN), so cross-VG clones over NVMe-TCP work without any external setup.
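
To make the CDFE step more concrete, here is a hedged sketch of the buffer manipulation only, with no ioctl involved. It assumes the Host Behavior Support data structure is 512 bytes, that CDFE is a little-endian 16-bit field at bytes 4-5, and that bit n enables Copy Descriptor Format n; check these offsets against the NVMe base specification before relying on them. `with_f2h_enabled` is a hypothetical name, and the source does not say whether lvm-xcopy does a read-modify-write like this.

```python
import struct

HBS_LEN = 512        # assumed size of the Host Behavior Support data structure
CDFE_OFFSET = 4      # assumed: CDFE is a little-endian u16 at bytes 4-5
CDFE_F2H = 1 << 2    # assumed: bit n enables Copy Descriptor Format n


def with_f2h_enabled(hbs):
    """Return a copy of a Get Features (FID 0x16) buffer with the F2h bit set.

    Every other byte is preserved so that host-behavior settings which are
    already programmed are not cleared by the following Set Features.
    """
    if len(hbs) != HBS_LEN:
        raise ValueError(f"expected {HBS_LEN} bytes, got {len(hbs)}")
    (cdfe,) = struct.unpack_from("<H", hbs, CDFE_OFFSET)
    out = bytearray(hbs)
    struct.pack_into("<H", out, CDFE_OFFSET, cdfe | CDFE_F2H)
    return bytes(out)


fresh = bytes(HBS_LEN)                     # all-zero buffer, as after a fresh boot
print(with_f2h_enabled(fresh)[4:6].hex())  # CDFE bytes after the update
```

The returned buffer would be the data payload of the Set Features command sent to the controller char device.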

A 1 TiB intra-array clone that would take ~15 minutes through ddpt completes in roughly 16 seconds with the native SG_IO driver (measured ~55 GiB/s effective at 64 GiB in our test environment).

Requirements

  • Linux with device-mapper-multipath (SCSI) and/or nvme-tcp (NVMe); tested on Proxmox VE / Debian.
  • LVM2 tools: vgs, lvs, pvs, lvcreate, lvremove.
  • Python 3.9+.
  • Root privileges (EXTENDED COPY requires O_RDWR on the multipath device; NVMe Copy and the CDFE handshake require O_RDWR on the namespace and controller char devices).
  • For --driver=ddpt: the ddpt package (apt install ddpt).

No third-party Python dependencies — sgio.py uses only the standard library.

Installation

git clone https://github.com/<you>/lvm-xcopy.git
cd lvm-xcopy
python3 -m pip install .

Or run directly from a checkout without installing:

PYTHONPATH=src python3 -m lvm_xcopy --help

Usage

lvm-xcopy clone <source> <dest> [options]
  • <source> — source LV as VG/LV or /dev/VG/LV.
  • <dest> — destination LV as LV (same VG as source) or VG/LV.

Common options

| Flag | Default | Purpose |
| --- | --- | --- |
| --size SIZE | source size | Destination size. Accepts B, K/KiB, M/MiB, G/GiB, T/TiB. Rounded up to the VG extent size. |
| --alloc {tail,normal} | tail | Where to place the new LV. tail picks the last free range on the source PV (intra-VG) or the first PV of the destination VG (inter-VG); normal lets LVM choose. |
| --mode {pv,lv} | pv | Issue the offload CDB against the underlying PV device with extent-based LBA offsets, or against /dev/VG/LV directly. pv is required for real array offload. |
| --driver {auto,sgio,nvme,ddpt} | auto | Offload back-end (see table above). auto picks sgio for SCSI multipath PVs and nvme for NVMe-namespace PVs. |
| --bs BYTES | 512 | Logical block size. |
| --bpt BLOCKS | 32768 | Blocks per transfer (ddpt only; 32768 × 512 B = 16 MiB). |
| --id-usage {0,1,2,3} | 3 | LIST ID USAGE field. Pure FlashArray requires 3; other values fail with ASC 26h. |
| --force | off | Copy even if the source LV is active. Freeze the filesystem first. |
| --keep-on-failure | off | Do not lvremove the destination if the copy fails. |
| --dry-run | off | Print the lvcreate and XCOPY commands without running them. |
| -v, --verbose | off | Repeatable; increases driver verbosity (CDB count, per-segment progress, etc.). |
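
As an illustration of the --size handling described in the table, the sketch below parses a size string and rounds it up to a whole number of extents. It is not the project's parser: `parse_size` and `round_up_to_extent` are hypothetical names, and it assumes that every suffix is a binary multiple (K = 1024 bytes, as in LVM) and that a bare number means bytes.

```python
import re

_SIZE_RE = re.compile(r"(\d+)(B|[KMGT](?:iB)?)?")
_MULTIPLIER = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert strings such as '500G' or '16MiB' to bytes (binary units assumed)."""
    match = _SIZE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _MULTIPLIER[(unit or "B")[0]]


def round_up_to_extent(size_bytes, extent_bytes):
    """Round size_bytes up to the next multiple of the VG extent size."""
    return -(-size_bytes // extent_bytes) * extent_bytes


size = round_up_to_extent(parse_size("10M") + 1, 4 * 1024**2)
print(size // 1024**2, "MiB")
```

One byte over 10 MiB needs three 4 MiB extents, so the destination becomes 12 MiB.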

Examples

Intra-array clone inside the same VG (same LUN, same array):

sudo lvm-xcopy clone vg_data/src vg_data/src_clone -v

Inter-VG clone across two LUNs on the same array:

sudo lvm-xcopy clone vg_data/src vg_backup/src_clone -v

Cross-VG clone across two NVMe namespaces in the same NVM subsystem (offloads as NVMe Copy Format 2h; CDFE is enabled automatically):

sudo lvm-xcopy clone nvme_vg_a/src nvme_vg_b/src_clone -v

Resize the destination while cloning (rounded up to VG extent size):

sudo lvm-xcopy clone vg_data/src vg_data/src_bigger --size 500G

Dry-run to preview the plan without modifying anything:

lvm-xcopy clone vg_data/src vg_backup/src_copy --dry-run -v

Fall back to the ddpt-based driver (e.g. for cross-vendor compatibility testing):

sudo lvm-xcopy clone vg_data/src vg_data/src_copy --driver ddpt --bpt 32768

Testing

Unit tests run on any platform (the sgio module imports Linux-only modules lazily so it loads on Windows / macOS too):

PYTHONPATH=src python3 -m unittest discover -s tests -v

End-to-end and performance scripts live in scripts/ and expect two Pure FlashArray LUNs exposed via /dev/mapper/<wwid> (SCSI) or two namespaces of the same NVM subsystem exposed as /dev/nvme*n* (NVMe):

| Script | Purpose |
| --- | --- |
| scripts/sgio_smoke_test.sh | 1 MiB single-segment CDB sanity check |
| scripts/sgio_segcount_probe.sh | Probe the array's actual segment-descriptor cap |
| scripts/e2e_two_luns.sh | 1 GiB intra-LUN + inter-LUN clone with SHA256 verification |
| scripts/e2e_multisegment.sh | Fragmented source LV across two PE ranges, inter-LUN clone |
| scripts/sgio_perf_scale.sh | Wall-clock timing at 1 / 4 / 16 / 64 GiB |
| scripts/nvme_xns_e2e.sh | Cross-namespace lvm-xcopy clone with CDFE reset + auto-enable proof |
| scripts/nvme_hbs_cdfe.sh | Diagnostic: toggle the controller's CDFE bits and replay F2h/F3h CDBs |

Edit the LUN_A / LUN_B WWIDs (or NVMe device paths) at the top of each script before running.

Troubleshooting

SCSI XCOPY failures surface as an SgIoError (sgio driver) or a non-zero ddpt exit with sense bytes. The sense data is printed in hex; the bytes that matter are the Sense Key (byte 2 low nibble), ASC (byte 12), and ASCQ (byte 13). NVMe Copy failures surface as an NvmeError carrying the 16-bit NVMe completion status (e.g. 0x4002 = Invalid Field in Command). Common failures seen against Pure FlashArray:

| Sense | Meaning | Likely cause / fix |
| --- | --- | --- |
| KEY=05 ASC=26 ASCQ=00 | Invalid field in parameter list | Usually a malformed header. Pure requires the SPC-3 header layout (16-bit target-descriptor length at bytes 2-3, reserved at 4-7), not SPC-4. sgio.py already builds it correctly; any local edit to build_param_list() must preserve this layout. |
| KEY=05 ASC=26 ASCQ=06 | Too many target descriptors | More than two target descriptors in the parameter list. The driver only ever emits two (src + dst); this would indicate a code change. |
| KEY=05 ASC=26 ASCQ=08 | Too many segment descriptors | Pure empirically caps at 2 segments / CDB even though RECEIVE COPY OPERATING PARAMETERS advertises 32. XcopyDriver.MAX_SEGS_PER_CDB is pinned at 2. If you bump it and this returns, the firmware still enforces 2. |
| KEY=05 ASC=26 ASCQ=09 | Invalid LU identifier | The NAA-6 designator in a target descriptor doesn't match any LU the destination array can see. Re-check that the WWIDs on both sides are visible to the same FlashArray and that /dev/mapper/<wwid> resolves on the host. |
| KEY=05 ASC=26 ASCQ=0A | Unexpected inconsistent parameter value | Usually LIST_ID_USAGE != 3. Pure holds no LIST_ID state; pass --id-usage 3 (the default). |
| KEY=05 ASC=24 ASCQ=00 | Invalid field in CDB | The 16-byte EXTENDED COPY(LID1) CDB is malformed (wrong opcode/service action or a non-zero reserved field). Only happens after manual CDB edits. |
| NVMe Copy returned status 0x4002 (cross-namespace) | Invalid Field in Command | CDFE bit for Format 2h is not set on the controller. The driver normally programs it automatically; if the message persists, check that the destination NSID is in the same NVM subsystem as the source (nvme list-subsys) and that the controller advertises F2h in OCFS (nvme id-ctrl /dev/nvmeN). |
| NVMe Copy returned status 0x4183 | Command Size Limit Exceeded | More than MSRC+1 source ranges packed into a single CDB. Pure exposes MSRC=0 (one range/CDB); the driver respects this. |
| ddpt: bpt too large (max 32768 blocks) | n/a | ddpt refuses --bpt > 32768. Either lower --bpt or switch to --driver=sgio. |
| lvm-xcopy: must be run as root | n/a | XCOPY needs O_RDWR on the multipath device. Re-run under sudo. |
| destination <vg/lv> already exists | n/a | The destination LV must not exist; lvm-xcopy creates it. Remove it first or pick a new name. |
| refusing to copy active LV | n/a | Source LV is active. Deactivate it, freeze its filesystem, or pass --force. |
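
The two NVMe status values in the table can be broken into their fields by hand. The sketch below assumes the 16-bit value uses the layout Linux reports to user space (phase tag already stripped, so bit 0 is the low bit of the Status Code); `decode_nvme_status` is a hypothetical helper, not part of lvm-xcopy.

```python
def decode_nvme_status(status):
    """Split a 16-bit NVMe completion status into its fields (illustrative)."""
    return {
        "sc": status & 0xFF,           # Status Code
        "sct": (status >> 8) & 0x7,    # Status Code Type (0 = generic, 1 = command specific)
        "crd": (status >> 11) & 0x3,   # Command Retry Delay
        "more": bool(status & 0x2000), # More information available
        "dnr": bool(status & 0x4000),  # Do Not Retry
    }


for value in (0x4002, 0x4183):
    fields = decode_nvme_status(value)
    print(f"{value:#06x}: sct={fields['sct']} sc={fields['sc']:#04x} dnr={fields['dnr']}")
```

Under that assumption 0x4002 is generic status 0x02 with Do Not Retry set, and 0x4183 is command-specific status 0x83 with Do Not Retry set, which matches the meanings listed in the table.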

To capture the raw sense bytes from a failing run, re-run with -vv and the driver will log the CDB plus the full 64-byte sense buffer. For deeper inspection the scripts/sgio_segcount_probe.sh script is a minimal reproducer that builds the parameter list, issues a single CDB, and prints the error.
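
When reading a logged sense buffer, the three bytes named above can be pulled out with a few lines of Python. This is a sketch that assumes fixed-format sense data (response code 70h or 71h); `decode_fixed_sense` is a hypothetical helper and the sample buffer is made up for illustration, not captured from an array.

```python
def decode_fixed_sense(sense):
    """Return (sense_key, asc, ascq) from fixed-format SCSI sense data."""
    if len(sense) < 14:
        raise ValueError("sense buffer too short for ASC/ASCQ")
    if sense[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return sense[2] & 0x0F, sense[12], sense[13]


# Made-up 18-byte example: ILLEGAL REQUEST with ASC 26h / ASCQ 08h.
sample = bytes.fromhex("70 00 05 00 00 00 00 0a 00 00 00 00 26 08 00 00 00 00")
key, asc, ascq = decode_fixed_sense(sample)
print(f"KEY={key:02X} ASC={asc:02X} ASCQ={ascq:02X}")
```

The printed triple can be looked up directly in the first column of the table above.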

Hypervisor integrations

  • Proxmox VE: hypervisors/proxmox-vm-clone.sh wraps lvm-xcopy to clone a full VM (disks offloaded on the array, VM definition rebuilt with qm). See hypervisors/README.md for usage, requirements, and the snapshot-as-volume-chain guard.

Safety notes

  • lvm-xcopy refuses to copy an active source LV unless --force is given. If the source is in use, freeze the filesystem (fsfreeze -f <mnt>) or snapshot it before cloning.
  • The destination LV must not already exist; lvm-xcopy creates it.
  • The offload CDB (SCSI XCOPY or NVMe Copy) is issued on the destination device. Make sure both source and destination are visible on the same array (same FlashArray for SCSI; same NVM subsystem for NVMe) and that the host has O_RDWR on the destination namespace / multipath device.

License

MIT (see pyproject.toml).
