
[vmware] Cache images as VM templates#206

Closed
leust wants to merge 1 commit into stable/wallaby-m3 from imagecache

Conversation

@leust

@leust leust commented Jul 11, 2024

Upon user request, the driver can cache the image as a VM Template and reuse that to create the volume(s). This feature is useful when creating many volumes in parallel from the same image.

Users can request the image cache feature when creating the volume by passing use_image_cache='true' as a volume property (metadata).

The feature must be enabled per backend, for example:

```
[vmware]
enable_image_cache = true
```

This will enable the image cache feature for the vmware backend.

The image templates will then be stored in a folder similar to the volumes folder: OpenStack/Project (vmware_image_cache)/Volumes, where {backend}_image_cache is used as the project name.

The driver will periodically delete the cached images that are expired. The expiry time can be controlled via the property image_cache_age_seconds set on the backend configuration.

Only images smaller than the configured image_cache_max_size_gb will be cached.
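The two gating checks described above can be sketched as follows. This is a minimal illustration, not the driver code: the option names `image_cache_max_size_gb` and `image_cache_age_seconds` come from this PR, while the helper names and exact comparison details are assumptions.

```python
from datetime import datetime, timedelta

GiB = 1024 ** 3


def can_cache_image(image_size_bytes, image_cache_max_size_gb):
    # Only images smaller than the configured limit get cached
    # as VM templates (hypothetical helper, mirrors the text above).
    return image_size_bytes < image_cache_max_size_gb * GiB


def is_cache_entry_expired(created_at, image_cache_age_seconds, now=None):
    # A cached template expires image_cache_age_seconds after creation;
    # the periodic task would then delete it.
    now = now or datetime.utcnow()
    return now - created_at > timedelta(seconds=image_cache_age_seconds)
```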

Change-Id: I6f5e481f6997a180a455b47abe525b93bcf9aa4e

@leust
Author

leust commented Jul 11, 2024

Needs sapcc/oslo.vmware#46

@leust leust requested review from hemna and joker-at-work July 11, 2024 15:42
@hemna

hemna commented Jul 11, 2024

Why not use cinder's built in cache mechanism?

@joker-at-work
Collaborator

Why not use cinder's built in cache mechanism?

I agree that the reasoning for this should be in the commit message.

@leust
Author

leust commented Jul 15, 2024

I added the reason to the commit message:

We're not using the cinder built-in cache functionality because we
need a few extra features:

  • the built-in cache doesn't account for shards. The cache entry
    will be placed on any backend/shard and could trigger a lot of
    slower cross-vc migrations when creating volumes from it.
  • the built-in cache doesn't have a periodic task for deleting the
    expired cache entries
  • we want to cache the images only when the customer requests it

@hemna

hemna commented Jul 15, 2024

How do we account for the consumed space of the cached volume in the specific datastore so that the scheduler knows how much we are using on that datastore?

@leust
Author

leust commented Jul 15, 2024

How do we account for the consumed space

The driver reports the free_capacity_gb directly from the datastore.summary.freeSpace so I believe we get this information straight from the VMware API response.

@hemna

hemna commented Jul 15, 2024

The scheduler still needs to account for the capacity allocated against the datastore though. These cached images will be hidden from what is actually allocated against the datastore.

@leust leust force-pushed the imagecache branch 2 times, most recently from da700b1 to c763bce Compare July 17, 2024 08:20
@leust
Author

leust commented Jul 17, 2024

The scheduler still needs to account for the capacity allocated against the datastore though

OK, thanks for the hint.

Now the "image cache" capacity is being added to the pool's provisioned_capacity_gb which seems to be used while weighing as well as in the calculate_virtual_free_capacity.

The volume backend reports a new extra_provisioned_capacity_gb that's being recognised by the host_manager and added to the final provisioned_capacity_gb.
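The accounting described here could look roughly like the sketch below. The stat names `provisioned_capacity_gb` and `extra_provisioned_capacity_gb` are from this PR; the arithmetic is a simplified illustration of how a host_manager-style weigher might fold them together, not the actual Cinder code.

```python
def effective_provisioned_capacity_gb(pool_stats):
    # Fold the backend-reported image-cache usage into the provisioned
    # capacity the scheduler weighs with (simplified illustration).
    return (pool_stats.get('provisioned_capacity_gb', 0)
            + pool_stats.get('extra_provisioned_capacity_gb', 0))


def virtual_free_capacity_gb(total_gb, provisioned_gb,
                             max_over_subscription_ratio=1.0):
    # Thin-provisioning-style virtual free space: the oversubscribed
    # total minus everything already provisioned, cached images included.
    return total_gb * max_over_subscription_ratio - provisioned_gb
```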

Could you please check the new code @hemna ?

hemna previously approved these changes Jul 25, 2024

@joker-at-work joker-at-work left a comment


If we disable enable_image_cache we might still have images around. Would it make sense to run some cleanup, or at least count the existing cached images as usage, or do we have to make sure that we clean up manually?

Comment thread cinder/volume/drivers/vmware/vmdk.py
Comment on lines +3516 to +3545
return list(itertools.chain(
    *[self._get_cached_images_in_folder(folder_ref)
      for folder_ref in folder_refs]))
Collaborator


This looks like you could use itertools.chain.from_iterable() instead of itertools.chain(*[…]).
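The two spellings are equivalent; `chain.from_iterable()` just avoids materializing the intermediate list and unpacking it as arguments. A minimal illustration (with plain lists standing in for the folder lookups):

```python
import itertools

nested = [[1, 2], [3], [4, 5]]

# Shape used in the hunk above: build a list, then star-unpack it.
flat_star = list(itertools.chain(*[sub for sub in nested]))

# Suggested shape: consumes the outer iterable lazily, no unpacking.
flat_from_iter = list(itertools.chain.from_iterable(nested))

assert flat_star == flat_from_iter == [1, 2, 3, 4, 5]
```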

Comment thread cinder/volume/drivers/vmware/vmdk.py Outdated
"created_at": date}]

Where
- name: the name of the template VM (set to the image_id)
Collaborator


Should we maybe add a prefix or postfix or something that makes it possible to distinguish image-cache template-vms from shadow-vms? Especially when things are orphaned and only left as directories on the datastore, it's helpful to distinguish them by name and know what can definitely be deleted.

Comment thread cinder/volume/drivers/vmware/vmdk.py Outdated
max_objects = self.configuration.vmware_max_objects_retrieval
options.maxObjects = max_objects
try:
    result = self.session.vim.RetrievePropertiesEx(
Collaborator


Why can't we use WithRetrieval as in _get_image_cache_folder_ref()? This looks like a lot of code and I would have expected it to already exist.

Author


Do you mean using oslo.vmware's get_objects() ?
We couldn't use it because that one looks in the rootFolder and we only want to look into the cache folder here.

Collaborator


`with vutil.WithRetrieval(self._session.vim, retr_res) as retr_objects` is what I mean. We only get result here, and I would assume all the exception handling and such is already handled in WithRetrieval. We later only use result.objects.

How do we do it in Nova? Did we also copy the contents of get_objects() there? (That's what I understood from you we basically have to do here)

Author


Nova has its own get_objects()-like helper called get_inner_objects().
Additionally, WithRetrieval doesn't handle the NOT_AUTHENTICATED exception that's thrown when there are no objects in the folder (see Nova's _get_image_template_vms).

Collaborator


Ah, ok. Thank you.

What's the use of WithRetrieval then? I thought we added it everywhere in Nova, because code missed to handle cases at times.

Comment thread cinder/volume/drivers/vmware/vmdk.py Outdated

cached_images = []
for obj in result.objects:
    props = vim_util.propset_dict(obj.propSet)
Collaborator


Could it be that propSet doesn't exist? We sometimes have exceptions like that in Nova.
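One defensive pattern for this would be to fetch the attribute with a default and tolerate an empty set. The sketch below is a standalone stand-in for oslo.vmware's propset_dict plus the guard; the objects are simulated with SimpleNamespace, and the guard itself is a suggestion, not the PR's code.

```python
from types import SimpleNamespace


def propset_dict(prop_set):
    # Turn a vSphere DynamicProperty list into {name: val}.
    # Tolerate the property set being None/empty, which the review
    # notes has been observed in Nova.
    if not prop_set:
        return {}
    return {prop.name: prop.val for prop in prop_set}


# A managed object can come back without a propSet attribute at all;
# getattr with a default avoids the AttributeError.
obj = SimpleNamespace()  # stand-in for a RetrieveResult object
props = propset_dict(getattr(obj, 'propSet', None))
```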

Comment thread cinder/volume/drivers/vmware/vmdk.py Outdated
Comment on lines +2085 to +2088
img_volume = copy.deepcopy(volume)
img_volume['id'] = image_id
img_volume['project_id'] = self._cache_project_name()
img_volume['size'] = image_size_in_bytes / units.Gi
Collaborator


Why do we keep the other data and what happens with it? Do we really need whatever else is in the volume dict? Would it maybe make sense to be explicit about what we expect to be used going forward? Can there be private information (metadata) that somehow gets copied to another project's cloned root-disk metadata?
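Being explicit could look like the sketch below: build the template volume dict from an allow-list of fields instead of deep-copying the user's volume. The field names `id`, `project_id`, and `size` follow the hunk above; the allow-list itself and the helper name are hypothetical.

```python
GiB = 1024 ** 3


def make_cache_volume_dict(volume, image_id, cache_project,
                           image_size_bytes):
    # Instead of copy.deepcopy(volume), which may drag along private
    # user metadata into the cache project, copy only what backing
    # creation needs (hypothetical allow-list).
    return {
        'id': image_id,
        'project_id': cache_project,
        'size': image_size_bytes // GiB,
        # Example of an explicitly carried-over field:
        'volume_type_id': volume.get('volume_type_id'),
    }
```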

Comment thread cinder/volume/drivers/vmware/vmdk.py Outdated

def _can_use_image_cache(self, volume, image_size):
    requested = (volume['metadata']
                 .get(CREATE_PARAM_USE_IMAGE_CACHE) == "true")
Collaborator


Do we maybe want to add a .lower() or does that come in automatically?

Comment on lines +2070 to +2074
LOG.debug("The requested image cannot be cached because it's "
          "too big (%(image_size)s > %(max_size)s)",
          {'image_size': image_size,
           'max_size': max_size})
Collaborator


Should we error out here? The customer requested to use the image-cache and thus expect speedy creations, but with the image they can't get it.
Same question towards requested but not enabled, I guess.

Author


From the user experience point of view, I'd expect such an error to be returned in the API response.
Otherwise they will ask why their volume is in error state.

Collaborator


Fair point. We're too late in the process, because scheduling needs to have happened already. On the other hand, nobody will know ever ...

Author


Any conclusion for this?

Collaborator


afaik, there's a way to add error messages and retrieve them with `cinder message list`. I think we can go with "ignore what the customer requested" for now, but we should investigate this as a follow-up and maybe bring it up in a bigger round for a decision.

Comment thread cinder/volume/drivers/vmware/vmdk.py Outdated

img_backing = None
if self._can_use_image_cache(volume, metadata['size']):
    img_backing = self._get_cached_image_backing(
Collaborator


I would rename _get_cached_image_backing() to _get_or_create_cached_image_backing(). Mainly because I was wondering where we would create the cached image backing if we only call a get.

            LOG.exception("Failed to delete the expired image %s",
                          cached_image['name'])

    def _get_cached_images(self):
Collaborator


We run this function every minute with the stats generation (I think). How long does it take to run in a real-world environment? Do we have to optimize somewhere?

@leust leust force-pushed the imagecache branch 2 times, most recently from 3f6d754 to 8333237 Compare August 26, 2024 06:47
@hemna hemna force-pushed the stable/wallaby-m3 branch from 2b59588 to 17e2b86 Compare August 27, 2024 15:13
@hemna

hemna commented Nov 14, 2025

Also, we can enable cinder's image cache on a per-backend basis; the backend is effectively the shard. Yes, there is no background job to clear out old images, but the great thing is that the cached images are just volumes under a tenant. They can easily be managed (deleted via aging out) by the nanny: just fetch all the tenant's cached image volumes, look at the date when each was created, and delete it via the standard volume API. I think we should at least test the cinder built-in image cache in qa-de-1 first before going to this length to hack up the driver.
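The nanny-style cleanup described here could be sketched as below. This is a hypothetical illustration: `client` stands in for a cinderclient-like object with `list()`/`delete()` methods (the real client API is not shown here), and a fake client makes the sketch runnable.

```python
from datetime import datetime, timedelta


def delete_expired_cache_volumes(client, project_id, max_age_seconds,
                                 now=None):
    # List the cache tenant's volumes and delete the ones older than
    # max_age_seconds through the standard volume API, as suggested.
    now = now or datetime.utcnow()
    cutoff = now - timedelta(seconds=max_age_seconds)
    deleted = []
    for vol in client.list(project_id=project_id):
        if vol['created_at'] < cutoff:
            client.delete(vol['id'])
            deleted.append(vol['id'])
    return deleted


class FakeClient:
    # Minimal stand-in so the sketch is self-contained and runnable.
    def __init__(self, volumes):
        self.volumes = volumes
        self.deleted = []

    def list(self, project_id=None):
        return self.volumes

    def delete(self, volume_id):
        self.deleted.append(volume_id)


_now = datetime(2024, 7, 15, 12, 0)
client = FakeClient([
    {'id': 'old', 'created_at': _now - timedelta(days=8)},
    {'id': 'new', 'created_at': _now - timedelta(hours=1)},
])
removed = delete_expired_cache_volumes(client, 'cache-tenant',
                                       7 * 24 * 3600, now=_now)
```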

@joker-at-work
Collaborator

The first comments on the PR provide the reasoning why we think the Cinder-based approach is not helping. But it's your call as service-owner in the end.

@Scsabiii

Scsabiii commented Nov 17, 2025

My opinion is that the driver-level image cache is a must with VMware and sharding.
The problem is that we are not fast enough with the migrations to survive another gardener update cycle.
Even if sharding is removed, there is still a problem with VMware: even though cinder has the image as a source volume, the VMware driver does not treat the source as "read-only", so it will attempt to create implicit snapshots, which causes another locking issue/conflict during the clone op. The image template is a good solution at the driver level, because VMware knows that template VMs are read-only and does not create these implicit objects, so there is no problem with parallel cloning.
Also, the cross-shard migration can be slower than the direct glance fetch, as VMware also copies zeros.
