Migrating VM instances fails using arm64 hosts #11956

@yadvr

problem

Migrating system VMs (CPVM/SSVM) away from an arm64 host that uses local storage fails with this error on 4.22 RC3:

2025-11-02 00:24:01,001 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:[]) (logid:) Seq 233-8103101629546364942: Processing:  { Ans: , MgmtId: 104339268210986, via: 233, Ver: v1, Flags: 110, [{"com.cloud.agent.api.Answer":{"result":"false","details":"com.cloud.utils.exception.CloudRuntimeException: Could not fetch storage pool 63229d40-f348-4636-8707-74ab46ab28d0 from libvirt due to org.libvirt.LibvirtException: Storage pool not found: no storage pool with matching uuid '63229d40-f348-4636-8707-74ab46ab28d0'
        at com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.getStoragePool(KVMStoragePoolManager.java:286)
        at com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.getStoragePool(KVMStoragePoolManager.java:272)
        at com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.disconnectPhysicalDisksViaVmSpec(KVMStoragePoolManager.java:247)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtPrepareForMigrationCommandWrapper.handleRollback(LibvirtPrepareForMigrationCommandWrapper.java:160)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtPrepareForMigrationCommandWrapper.execute(LibvirtPrepareForMigrationCommandWrapper.java:59)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtPrepareForMigrationCommandWrapper.execute(LibvirtPrepareForMigrationCommandWrapper.java:50)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
        at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:2280)
        at com.cloud.agent.Agent.processRequest(Agent.java:813)
        at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1295)
        at com.cloud.utils.nio.Task.call(Task.java:83)
        at com.cloud.utils.nio.Task.call(Task.java:29)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)
","wait":"0","bypassHostMaintenance":"false"}}] }

Later, this also causes an NPE:

2025-11-02 00:24:03,278 INFO  [o.a.c.s.m.KvmNonManagedStorageDataMotionStrategy] (Work-Job-Executor-19:[ctx-b7d81f9a, job-15822/job-15823, ctx-a65e4c89]) (logid:95b78c81) Expunging dest volume [id: 1547, state: Ready] as part of failed VM migration with volumes command for VM [1161].
2025-11-02 00:24:03,279 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Work-Job-Executor-19:[ctx-b7d81f9a, job-15822/job-15823, ctx-a65e4c89]) (logid:95b78c81) Failed to copy volume java.lang.NullPointerException: Cannot invoke "com.cloud.storage.VolumeVO.getId()" because "this.volumeVO" is null
        at org.apache.cloudstack.storage.volume.VolumeObject.stateTransit(VolumeObject.java:243)
        at org.apache.cloudstack.storage.volume.VolumeObject.processEvent(VolumeObject.java:413)
        at org.apache.cloudstack.storage.motion.StorageSystemDataMotionStrategy.copyAsync(StorageSystemDataMotionStrategy.java:2199)
        at org.apache.cloudstack.storage.motion.DataMotionServiceImpl.copyAsync(DataMotionServiceImpl.java:136)
        at org.apache.cloudstack.storage.volume.VolumeServiceImpl.migrateVolumes(VolumeServiceImpl.java:2356)
        at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.migrateVolumes(VolumeOrchestrator.java:1514)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
2025-11-02 00:24:03,364 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-19:[ctx-b7d81f9a, job-15822/job-15823]) (logid:95b78c81) Unable to complete AsyncJob {"accountId":2,"cmd":"com.cloud.vm.VmWorkMigrateWithStorage","cmdInfo":"rO0ABXNyACVjb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZVdpdGhTdG9yYWdlsew9z6UxtXMCAANKAApkZXN0SG9zdElkSgAJc3JjSG9zdElkTAAMdm9sdW1lVG9Qb29sdAAPTGphdmEvdXRpbC9NYXA7eHIAE2NvbS5jbG91ZC52bS5WbVdvcmufmbZW8CVnawIABEoACWFjY291bnRJZEoABnVzZXJJZEoABHZtSWRMAAtoYW5kbGVyTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwAAAAAAAAAAIAAAAAAAAAAgAAAAAAAASJdAAZVmlydHVhbE1hY2hpbmVNYW5hZ2VySW1wbAAAAAAAAADpAAAAAAAAALNzcgARamF2YS51dGlsLkhhc2hNYXAFB9rBwxZg0QMAAkYACmxvYWRGYWN0b3JJAAl0aHJlc2hvbGR4cD9AAAAAAAAAdwgAAAAQAAAAAHg","cmdVersion":0,"completeMsid":null,"created":"Sun Nov 02 00:23:46 IST 2025","id":15823,"initMsid":104339268210986,"instanceId":null,"instanceType":null,"lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"f4950b59-4f53-47dc-856a-5abda1a15964"}, job origin: 15822 com.cloud.utils.exception.CloudRuntimeException: Failed to migrate VM [VM instance {"id":1161,"instanceName":"s-1161-VM","state":"Migrating","type":"SecondaryStorageVm","uuid":"ff0368d4-e71c-4e49-ad70-c7dab8ef7ea9"}] along with its volumes due to [java.lang.NullPointerException: Cannot invoke "com.cloud.storage.VolumeVO.getId()" because "this.volumeVO" is null].
        at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.migrateVolumes(VolumeOrchestrator.java:1520)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)

On the target host, this storage pool exists:

root@cloudpi:/home/rohit# virsh pool-dumpxml cc155507-f2ba-4a96-a9b9-2dac9dc50a24
<pool type='dir'>
  <name>cc155507-f2ba-4a96-a9b9-2dac9dc50a24</name>
  <uuid>cc155507-f2ba-4a96-a9b9-2dac9dc50a24</uuid>
  <capacity unit='bytes'>1969186340864</capacity>
  <allocation unit='bytes'>1526301487104</allocation>
  <available unit='bytes'>442884853760</available>
  <source>
  </source>
  <target>
    <path>/var/lib/libvirt/images</path>
    <permissions>
      <mode>0711</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>

On the source host, this storage pool exists:

root@kvmpi:/home/rohit# virsh pool-dumpxml 63229d40-f348-4636-8707-74ab46ab28d0
<pool type='dir'>
  <name>63229d40-f348-4636-8707-74ab46ab28d0</name>
  <uuid>63229d40-f348-4636-8707-74ab46ab28d0</uuid>
  <capacity unit='bytes'>1969186340864</capacity>
  <allocation unit='bytes'>939565473792</allocation>
  <available unit='bytes'>1029620867072</available>
  <source>
  </source>
  <target>
    <path>/var/lib/libvirt/images</path>
    <permissions>
      <mode>0711</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>
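
The two dumps above differ in UUID but share the same target path. A minimal sketch of that comparison, assuming the `virsh pool-dumpxml` output of each host has been captured as a string (the XML below is abbreviated from the dumps above; the helper names `pool_identity` and `pools_on_host` are hypothetical and not part of CloudStack or libvirt):

```python
import xml.etree.ElementTree as ET

# Abbreviated copies of the `virsh pool-dumpxml` output from each host.
SOURCE_POOL_XML = """<pool type='dir'>
  <name>63229d40-f348-4636-8707-74ab46ab28d0</name>
  <uuid>63229d40-f348-4636-8707-74ab46ab28d0</uuid>
  <target><path>/var/lib/libvirt/images</path></target>
</pool>"""

TARGET_POOL_XML = """<pool type='dir'>
  <name>cc155507-f2ba-4a96-a9b9-2dac9dc50a24</name>
  <uuid>cc155507-f2ba-4a96-a9b9-2dac9dc50a24</uuid>
  <target><path>/var/lib/libvirt/images</path></target>
</pool>"""


def pool_identity(pool_xml):
    """Return (uuid, target path) parsed from one pool XML dump."""
    root = ET.fromstring(pool_xml)
    return root.findtext("uuid"), root.findtext("target/path")


def pools_on_host(dumps):
    """Map pool UUID -> target path for all dumps taken on one host."""
    return dict(pool_identity(dump) for dump in dumps)


target_pools = pools_on_host([TARGET_POOL_XML])
source_uuid, source_path = pool_identity(SOURCE_POOL_XML)

# The source pool UUID is unknown on the target host ...
print(source_uuid in target_pools)            # False
# ... even though both pools point at the same local directory path.
print(source_path in target_pools.values())   # True
```

This matches the agent error: a lookup by the source pool UUID on a host that only defines its own local pool cannot succeed.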

In the ACS DB, both storage pools are defined (source is pikvm, destination is cloudpi):

*************************** 2. row ***************************
                   id: 44
                 name: local-pikvm-arm64
                 uuid: 63229d40-f348-4636-8707-74ab46ab28d0
            pool_type: Filesystem
                 port: 0
       data_center_id: 1
               pod_id: 1
           cluster_id: 13
           used_bytes: 939565735936
       capacity_bytes: 1969186340864
         host_address: 10.10.1.10
            user_info: NULL
                 path: /var/lib/libvirt/images
              created: 2024-07-29 15:40:15
              removed: NULL
          update_time: 2025-11-01 19:00:44
               status: Up
storage_provider_name: DefaultPrimary
                scope: HOST
           hypervisor: KVM
              managed: 0
        capacity_iops: NULL
               parent: 0
            used_iops: 24
*************************** 3. row ***************************
                   id: 47
                 name: local-cloudpi-arm64
                 uuid: cc155507-f2ba-4a96-a9b9-2dac9dc50a24
            pool_type: Filesystem
                 port: 0
       data_center_id: 1
               pod_id: 1
           cluster_id: 13
           used_bytes: 1527497064448
       capacity_bytes: 1969186340864
         host_address: 10.10.1.5
            user_info: NULL
                 path: /var/lib/libvirt/images
              created: 2025-08-10 11:13:27
              removed: NULL
          update_time: 2025-11-01 19:06:01
               status: Up
storage_provider_name: DefaultPrimary
                scope: HOST
           hypervisor: KVM
              managed: 0
        capacity_iops: NULL
               parent: 0
            used_iops: 49

versions

ACS 4.22 RC3, advanced zone, mixed arm64 + x86 host environment. Preferred architecture is set to arm64.

The steps to reproduce the bug

See description above

What to do about it?

Ideally, the local storage pool of the target host should be used in the domain XML.
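
A rough sketch of that idea, not CloudStack's actual implementation: before the VM spec is handed to the destination host, any disk that references the source host's HOST-scoped pool is re-pointed at the destination host's local pool. The `Pool` and `Disk` types, the `retarget_disks` helper, and the disk path are hypothetical; the UUIDs, addresses, and scope come from the DB rows above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Pool:
    uuid: str
    host_address: str
    scope: str  # "HOST" means local storage, as in the DB rows above


@dataclass(frozen=True)
class Disk:
    path: str       # hypothetical placeholder, not taken from the logs
    pool_uuid: str


def retarget_disks(disks, source_pool, target_pool):
    """Return disks with the source host's local pool swapped for the target's.

    Disks on any other pool (e.g. shared storage) are returned unchanged.
    """
    if source_pool.scope != "HOST":
        return list(disks)
    return [
        replace(d, pool_uuid=target_pool.uuid) if d.pool_uuid == source_pool.uuid else d
        for d in disks
    ]


SOURCE = Pool("63229d40-f348-4636-8707-74ab46ab28d0", "10.10.1.10", "HOST")
TARGET = Pool("cc155507-f2ba-4a96-a9b9-2dac9dc50a24", "10.10.1.5", "HOST")

disks = [
    Disk("root-disk-placeholder", SOURCE.uuid),
    Disk("shared-disk-placeholder", "some-shared-pool-uuid"),  # hypothetical
]

for disk in retarget_disks(disks, SOURCE, TARGET):
    print(disk.path, disk.pool_uuid)
```

With such a mapping in place, a lookup on the destination host would use a pool UUID that libvirt there actually knows about.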
