Proposed version 0.2.0 #2

Open
matyasselmeci wants to merge 34 commits into PelicanPlatform:main from matyasselmeci:pr/v0.2.0

@matyasselmeci
Contributor

This PR contains many changes in order to reflect both how we (OSDF ops) want to set up our caches and community conventions for how Helm charts should look.

In addition, the PR adds support for host networking and various other features, plus safety checks to avoid inconsistent deployments.

Notable changes:

Features

  • Host networking is available as an option; Service and NetworkPolicy won't be created if host networking is enabled.
  • Admin users are now specified as a YAML list instead of a string.
  • Reusing the cache volume for logging is now possible (this is something we do in NRP).
  • Cache, lotman, and logging can now use existing PVCs or cause creation of new ones;
    in the latter case, "storageClass" must be specified.
  • All created PVCs are now annotated such that they will be kept even if the chart is deleted.
  • Setting sleep: true causes the cache to sleep instead of running, for debugging purposes.
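
As a rough illustration of the features listed above, a values override might look like the following. This is a sketch, not the chart's documented interface: the top-level placement of `adminUsers` and `sleep`, the user names, the storage class name, and the size are assumptions; the nested `cache.pvc` and `logging.persistence` key names are the ones mentioned elsewhere in this PR.

```yaml
# Hypothetical values override; check the chart's values.yaml for the real layout.
adminUsers:              # admin users are a YAML list rather than a single string
  - admin1               # placeholder user names
  - admin2

cache:
  type: pvc              # storage selector described in the commit messages
  pvc:
    existingClaim: ""    # leave empty to have the chart create a new PVC
    storageClass: example-storage-class   # placeholder; must be set when creating a PVC
    size: 1Ti            # placeholder size

logging:
  persistence:
    separateVolume: false   # reuse the cache volume for logs

sleep: false             # set to true to make the cache sleep for debugging
```

With `existingClaim` left empty, the chart is expected to create the PVC and annotate it so that it survives deletion of the chart.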

Validation

  • federation.label must match federation.discoveryUrl for the "osdf" and "osdf-itb" federations:
    ("osdf" must have a discoveryUrl of "https://osg-htc.org"; "osdf-itb" must have a discoveryUrl of "https://osdf-itb.osg-htc.org")
  • If creating new PVC(s) (instead of reusing existing ones), the corresponding storageClass attributes must be specified for each.
  • If using a host path, the path must be specified.
  • sitename (used for Xrootd.Sitename in the Pelican config) must be specified.
  • tls.certManager.enabled and tls.existingSecret cannot both be set.
  • If using CertManager, serverHostname must be specified; it's automatically added to the DNS names list.
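
A minimal set of values intended to satisfy the checks above could look roughly like this. The site name and hostname are placeholders, and the top-level position of `serverHostname` is an assumption. A later commit in this thread suggests the separate federation label value was dropped, so only `federation.discoveryUrl` is shown.

```yaml
# Sketch only; values are placeholders, key placement partly assumed.
sitename: EXAMPLE_SITE                  # required; used for Xrootd.Sitename

serverHostname: cache.example.org       # required when CertManager is used

federation:
  discoveryUrl: https://osg-htc.org     # the discovery URL listed above for "osdf"

tls:
  certManager:
    enabled: true
  # existingSecret must stay unset while certManager.enabled is true;
  # setting both is rejected by the chart's validation.
```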

Various renames

  • "Namespace key" has been renamed to "issuer key" to match what Pelican calls it. In addition, the default key in the secret for the issuer key is private-key.pem, which is what pelican key generate creates.
  • "storageClassName" has been renamed to "storageClass" per community conventions.
  • Various settings have been collected and nested:
    • Logrotate config is no longer split between logrotateImage, resources.logrotate, and logrotate; these have been merged into the logrotate mapping.
    • hostPath, storageClassName, pvcSize are no longer mixed together under cache; now you have a cache.hostPath mapping with path, and a cache.pvc mapping with existingClaim, storageClass, and size
    • Cache resources have been moved under the cache mapping.
    • tls.existingSecret and certificate (for CertManager config) are no longer separate: now there are tls.certManager and tls.existingSecret.
    • webPasswordSecret and webPasswordSecretKey have been renamed to webPassword.existingSecret and webPassword.key.
    • xrootd.sitename has been pulled out and moved to the top as just sitename; as mentioned above, it's required.
    • logging.storageClassName and logging.pvcSize have been moved to logging.persistence.storageClass and logging.persistence.size; there are also options to not create a separate volume (logging.persistence.separateVolume: false) or to reuse an existing PVC (logging.persistence.existingClaim).
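
To make the new nesting concrete, here is a sketch of how the renamed keys might appear together in a values file. The key names are taken from the list above; the host path, Secret name, Secret key, and storage class are placeholders and not values from the chart.

```yaml
# Illustrative layout after the renames; all values are placeholders.
cache:
  type: hostPath
  hostPath:
    path: /mnt/cache                      # placeholder path on the node

logging:
  persistence:
    separateVolume: true                  # use a dedicated logging volume
    existingClaim: ""                     # set this to reuse an existing PVC instead
    storageClass: example-storage-class   # placeholder; needed when a new PVC is created
    size: 50Gi                            # the raised default mentioned in this PR

webPassword:
  existingSecret: example-web-password    # placeholder Secret name
  key: example-key                        # placeholder key within that Secret
```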

Minor changes

  • The default image pull policy for the cache image is now IfNotPresent since the cache uses an image with a version tag.
    The logrotate image still uses the Always policy.
    As per community convention, the chart's AppVersion is used as the default image tag instead of being specified in the default values file.
  • The default size of the logging PVC has been raised from 5 Gi to 50 Gi.
  • XRD_CURLDISABLEX509=1 is now always set in the environment.

matyasselmeci and others added 22 commits April 1, 2026 17:51
…gmap-pelican.yaml

Since both of these files contain Pelican configuration, splitting them up
hurts the organization because the information is not in one place.

Co-authored-by: Copilot <copilot@github.com>
* Use chart's appVersion as the default Pelican image tag.
* Remove `app: pelican-cache` label - Helm's own labels are sufficient.
* Remove deprecated `Cache.DataLocation` var.
* Remove the config-dir-placeholder; in current images, it is unnecessary.
* Avoid having an empty Cache block if none of the cache options are customized.
* Rename the "cache-data" volume to just "data" to avoid having
  a "pelican-cache-cache-data" name when the prefixes are applied.
* Add a note explaining why we can't mount the TLS cert/key at Pelican's
  default location.
* Change UIAdminUsers to a YAML list instead of space-separated values.
* Remove 'federation' from selector labels because selector labels are
  immutable and prevent chart upgrades in case the federation changes.

Co-authored-by: Copilot <copilot@github.com>
…ificate.

certificate.dnsNames will be a list of additional SANs instead.
This avoids an admission webhook error with the default values when only
serverHostname is specified.

Co-authored-by: Copilot <copilot@github.com>
Do not specify IssuerKey in configmap-pelican.yaml. Instead, if the issuer key
is of type "existingSecret", mount the key at
"/etc/pelican/issuer-keys/issuer.pem". If the type is "pvc", mount it at
"/etc/pelican/issuer-keys".

Co-authored-by: Copilot <copilot@github.com>
…delete PVCs on uninstall

Co-authored-by: Copilot <copilot@github.com>
The two federations we want this check for are osdf and osdf-itb, with
https://osg-htc.org and https://osdf-itb.osg-htc.org as their discoveryUrls,
respectively.

Co-authored-by: Copilot <copilot@github.com>
*   Specify the type of storage for a cache by `cache.type`;
    if `cache.type==pvc`, then the specifics go under `cache.pvc`;
    if `cache.type==hostPath`, then the specifics go under `cache.hostPath`.

*   Rename storageClassName to storageClass since that's more common.

*   Make sure volumes are specified consistently across the `cache`,
    `logging`, and `lotman` sections.

*   Rename "namespaceKey" to "issuerKey" and the default secret key name to
    "private-key.pem"; "issuer key" is the terminology Pelican uses and
    "namespace key" doesn't make much sense for a cache; private-key.pem is the
    name of the file that `pelican key create` creates.

*   Add to NOTES.txt a list of all the PVCs we created; also mention that we
    do not delete them on uninstall.

*   Add validation for PVCs and volumes.
    *   If `cache.type==pvc`, require `cache.pvc.existingClaim` or
        `cache.pvc.storageClass`.
    *   If `cache.type==hostPath`, require `cache.hostPath.path`.
    *   If `issuerKey.type==pvc`, require `issuerKey.pvc.storageClass`.
    *   If `issuerKey.type==existingSecret`, require
        `issuerKey.existingSecret`.
    *   If lotman is enabled, require `lotman.pvc.existingClaim` or
        `lotman.pvc.storageClass`.
    *   If oidc is enabled, require `oidc.existingSecret`.
    *   Require `webPasswordSecret`.

Co-authored-by: Copilot <copilot@github.com>
*   There were multiple top-level blocks regarding logging and log rotation
    (for example, resources were in one place, the images were in another);
    these have been consolidated.

*   Renamed logging.persist to logging.persistence and put the various
    options under it.

    Instead of logging.persistence.enabled, we have
    logging.persistence.separateVolume because we always persist logs,
    it's just that on the NRP caches we put the logs on the cache volume.

*   If we're not logging to a separate volume, mount the cache data volume at
    /var/log (expecting the logs to live in pelican/*.log under the data
    volume), following what the Houston I2 cache does.

    NOTE: This is different than what some of the OSStore Origins do (they
    mount a subPath of the data volume as /var/log/pelican) so we will have
    to see if those two patterns can be consolidated.

*   Fix a brittle hasKey check in the deployment template.

Co-authored-by: Copilot <copilot@github.com>
that is the default filename that `pelican generate password` creates.
Since it's a required value, bring it up to the top instead of mixing it in
with the rest of the XRootD config.

Co-authored-by: Copilot <copilot@github.com>
Do not change the imagePullPolicy for the logrotate image; we want that one
to be up to date.

Co-authored-by: Copilot <copilot@github.com>
…unning, for debugging

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Nest the certManager config under tls so certificate-related knobs are
specified together.

Co-authored-by: Copilot <copilot@github.com>
Put the resource requests/limits for the cache container itself under the
cache block, for consistency with the way we set resources/limits for the
logrotate container.

Co-authored-by: Copilot <copilot@github.com>
Rename webPasswordSecret and webPasswordSecretKey to webPassword.existingSecret
and webPassword.key, for consistency with the issuerKey and tls config.

Co-authored-by: Copilot <copilot@github.com>
Use of client X.509 certs breaks on many of our caches since Let's Encrypt
certificates cannot be used for client auth anymore.  Disable them.

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Also drop information about tiger-osg-config - since that's a private repo,
the information is of no use to others.

Co-authored-by: Copilot <copilot@github.com>
…esemble the existing cache

Co-authored-by: Copilot <copilot@github.com>
Comment thread values.yaml Outdated
Contributor

@brianhlin brianhlin left a comment

I find the organization of values.yaml odd, almost as if a machine wrote it :)

Comment thread templates/_helpers.tpl Outdated
Comment thread templates/configmap-pelican.yaml Outdated
Comment thread templates/configmap-pelican.yaml Outdated
Comment thread templates/deployment.yaml Outdated
Comment thread templates/deployment.yaml Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md
matyasselmeci and others added 5 commits April 23, 2026 15:47
The relevant bug was fixed in 7.23.1 and 7.24:
PelicanPlatform/pelican#3159

Co-authored-by: Claude <claude@anthropic.com>
…eparate label value that must match it

This simplifies configuration and processing since we no longer need a separate
validation for it.  Note that the label needs to be sanitized because `:` and
`/` are not allowed in labels.

While we're at it, add the `pelicanplatform.org/` prefix to the label,
because unprefixed labels should be reserved for the local cluster admin.

Co-authored-by: Copilot <copilot@github.com>
matyasselmeci and others added 7 commits April 28, 2026 17:06
Co-authored-by: Copilot <copilot@github.com>
Also move the definition out of `_helpers.tpl` since it's only used in one place.

Co-authored-by: Copilot <copilot@github.com>
… logrotate image settings

Also use the `pelican_platform/cache` image instead of the `pelican_platform/osdf-cache` image to be more generic.

Co-authored-by: Copilot <copilot@github.com>
…adminUsers is nonempty

Co-authored-by: Copilot <copilot@github.com>
…tes settings

The `cache` block now only controls the Kubernetes resources for the cache, and
the `logging` and `logrotate` blocks now only control the Kubernetes resources
for the logging PVC and logrotate image.

Application settings (cache tuning like `highWaterMark`, logging levels,
logrotate size and count) have been moved into new `cacheConfig` and
`loggingConfig` blocks.

Co-authored-by: Copilot <copilot@github.com>
@matyasselmeci
Contributor Author

OK, this is ready for another look. Other than your comments, one thing I changed was to split out the Pelican cache configuration (high water mark, low water mark, etc.) from the Kubernetes configuration (cache image, pvc, etc.). Same for logging/logrotate. I think it's cleaner if application and Kubernetes parameters aren't mixed together.
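
For readers skimming the thread, the split described here might look something like the following in a values file. This is only a sketch: `cacheConfig`, `loggingConfig`, and `highWaterMark` are named in the commit message, while `lowWaterMark`, the shape of `cache.resources`, and all of the values are assumptions.

```yaml
# Kubernetes-side settings: what the cache pod and its volumes look like.
cache:
  resources:               # assumed to follow the usual Kubernetes resources shape
    requests:
      cpu: "2"
      memory: 4Gi
  pvc:
    size: 1Ti              # placeholder size

# Application-side settings: how the Pelican cache itself behaves.
cacheConfig:
  highWaterMark: 95        # illustrative value
  lowWaterMark: 90         # key name assumed by analogy with highWaterMark

# loggingConfig holds logging levels and logrotate size/count in the same way;
# its key names are not shown in this thread, so none are guessed here.
```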

3 participants