feat: download kubelet with oras in network isolated windows cluster#8042
feat: download kubelet with oras in network isolated windows cluster#8042jiashun0011 wants to merge 3 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds support for downloading Windows kubelet binaries via ORAS when running in a network-isolated cluster context (using the bootstrap profile container registry), and extends test coverage to exercise the new path.
Changes:
- Update
Get-KubePackageto download kubelet binaries via ORAS whenBootstrapProfileContainerRegistryServeris set; otherwise keep HTTP download behavior. - Add/extend Pester tests for ORAS vs HTTP selection behavior in
Get-KubePackage. - Add a new Windows network-isolated e2e scenario intended to validate ORAS-based kubelet download behavior.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| staging/cse/windows/kubeletfunc.ps1 | Adds ORAS-based kubelet package download branch when bootstrap profile registry is configured. |
| staging/cse/windows/kubeletfunc.tests.ps1 | Adds Pester contexts to cover ORAS vs HTTP download paths for kubelet package retrieval. |
| e2e/scenario_win_test.go | Adds a network-isolated Windows e2e test and alters Windows 2025 bootstrap mutator to use a custom CSE package URL. |
You can also share your feedback on Copilot code review. Take the survey.
| # The ORAS invocation line in Get-KubePackage evaluates $global:OrasPath | ||
| # and then invokes 'pull' as a separate command with the remaining arguments. | ||
| # Define and mock 'pull' so the invocation succeeds silently in tests. | ||
| function pull { } | ||
| Mock pull -MockWith {} |
There was a problem hiding this comment.
These tests are currently structured around the broken ORAS invocation semantics (mocking a pull function so $global:OrasPath pull ... doesn't error). That means the test suite would still pass even if the script never invokes oras.exe. After fixing Get-KubePackage to correctly run ORAS (e.g., via & $global:OrasPath pull ...), update the tests to validate the ORAS invocation through a mockable wrapper function (e.g., Invoke-OrasPull) instead of mocking pull.
|
forget to delete? |
e2e/scenario_win_test.go
Outdated
| func EmptyBootstrapConfigMutator(configuration *datamodel.NodeBootstrappingConfiguration) {} | ||
| func EmptyVMConfigMutator(vmss *armcompute.VirtualMachineScaleSet) {} | ||
|
|
||
| const jiashunaburl = `https://jiashunabtest.blob.core.windows.net/abtest/staging-cse-windows.zip?se=2026-03-15T00%3A00Z&sp=r&spr=https&sv=2022-11-02&sr=b&skoid=6a0c601f-38a9-49f1-ad79-7ede58edd6ac&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2026-03-09T05%3A10%3A00Z&ske=2026-03-15T00%3A00%3A00Z&sks=b&skv=2022-11-02&sig=5zE0VzebesOPZKJGy4ULksiGuGIF1mxOaZzNTO0IAvo%3D` |
There was a problem hiding this comment.
forget to delete the staging-cse-windows.zip
| Logs-To-Event -TaskName "AKS.WindowsCSE.DownloadKubletBinariesWithOras" -TaskMessage "Start to download kubelet binaries with oras. KubeBinariesVersion: $global:KubeBinariesVersion, BootstrapProfileContainerRegistryServer: $global:BootstrapProfileContainerRegistryServer" | ||
| $global:OrasPath pull $global:BootstrapProfileContainerRegistryServer/aks/packages/kubernetes/windowszip:$global:KubeBinariesVersion --platform="windows/amd64" --registry-config=$global:OrasRegistryConfigFile --output $zipfile | ||
| } else { | ||
| Logs-To-Event -TaskName "AKS.WindowsCSE.DownloadKubletBinaries" -TaskMessage "Start to download kubelet binaries and unzip. KubeBinariesPackageSASURL: $KubeBinariesSASURL" |
There was a problem hiding this comment.
it is not only kubelet. use windowszip or kubernetes package?
There was a problem hiding this comment.
keep same with previous message?
staging/cse/windows/kubeletfunc.ps1
Outdated
| # download kubelet binraries based on whether BootstrapProfileContainerRegistryServer is set. | ||
| if ($global:BootstrapProfileContainerRegistryServer) { | ||
| Logs-To-Event -TaskName "AKS.WindowsCSE.DownloadKubletBinariesWithOras" -TaskMessage "Start to download kubelet binaries with oras. KubeBinariesVersion: $global:KubeBinariesVersion, BootstrapProfileContainerRegistryServer: $global:BootstrapProfileContainerRegistryServer" | ||
| $global:OrasPath pull $global:BootstrapProfileContainerRegistryServer/aks/packages/kubernetes/windowszip:$global:KubeBinariesVersion --platform="windows/amd64" --registry-config=$global:OrasRegistryConfigFile --output $zipfile |
staging/cse/windows/kubeletfunc.ps1
Outdated
| # download kubelet binraries based on whether BootstrapProfileContainerRegistryServer is set. | ||
| if ($global:BootstrapProfileContainerRegistryServer) { | ||
| Logs-To-Event -TaskName "AKS.WindowsCSE.DownloadKubletBinariesWithOras" -TaskMessage "Start to download kubelet binaries with oras. KubeBinariesVersion: $global:KubeBinariesVersion, BootstrapProfileContainerRegistryServer: $global:BootstrapProfileContainerRegistryServer" | ||
| $global:OrasPath pull $global:BootstrapProfileContainerRegistryServer/aks/packages/kubernetes/windowszip:$global:KubeBinariesVersion --platform="windows/amd64" --registry-config=$global:OrasRegistryConfigFile --output $zipfile |
There was a problem hiding this comment.
is it possible that windowszip version is different from k8sversion?
for example, k8s version is v1.30.100, package version might be v1.30.100-lts (with suffix)
76afb46 to
4450dfb
Compare
4450dfb to
bac8e29
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
You can also share your feedback on Copilot code review. Take the survey.
| Mock Set-ExitCode -MockWith { throw "Set-ExitCode:$($ExitCode):$ErrorMessage" } | ||
| Mock Get-Item -MockWith { return New-Object -TypeName PSObject -Property @{ FullName = $DestinationPath } } |
There was a problem hiding this comment.
The Set-ExitCode mock here doesn't declare parameters, so $ExitCode/$ErrorMessage will be $null and the thrown message won't include the actual values. Add a param($ExitCode, $ErrorMessage) (or use $args) and also update the Get-Item mock to use the mocked cmdlet's input parameter (e.g., $Path) instead of an undefined $DestinationPath variable.
| Mock Set-ExitCode -MockWith { throw "Set-ExitCode:$($ExitCode):$ErrorMessage" } | |
| Mock Get-Item -MockWith { return New-Object -TypeName PSObject -Property @{ FullName = $DestinationPath } } | |
| Mock Set-ExitCode -MockWith { | |
| Param( | |
| [Parameter(Mandatory = $true)][int]$ExitCode, | |
| [Parameter(Mandatory = $true)][string]$ErrorMessage | |
| ) | |
| throw "Set-ExitCode:$ExitCode:$ErrorMessage" | |
| } | |
| Mock Get-Item -MockWith { | |
| Param( | |
| [Parameter(Position = 0, Mandatory = $true)][string]$Path | |
| ) | |
| return New-Object -TypeName PSObject -Property @{ FullName = $Path } | |
| } |
| { DownloadFileWithOras -Reference $reference -DestinationPath $destPath -ExitCode 80 } | Should -Not -Throw | ||
|
|
||
| Assert-MockCalled -CommandName 'Write-Log' -ParameterFilter { | ||
| $message -like "*platform=windows/amd64*" | ||
| } |
There was a problem hiding this comment.
Assert-MockCalled -CommandName 'Write-Log' will fail unless Write-Log is mocked in this test scope. Add a Mock Write-Log (and capture parameters if you want to assert on $message).
| Set-ExitCode -ExitCode $global:WINDOWS_CSE_ERROR_ORAS_PULL_WINDOWSZIP_FAIL -ErrorMessage "DownloadFileWithOras function is not available. networkisolatedclusterfunc.ps1 may not be sourced." | ||
| } | ||
| Logs-To-Event -TaskName "AKS.WindowsCSE.DownloadKubletBinariesWithOras" -TaskMessage "Start to download kubelet binaries with oras. KubeBinariesVersion: $global:KubeBinariesVersion, BootstrapProfileContainerRegistryServer: $global:BootstrapProfileContainerRegistryServer" | ||
| $orasReference = "$($global:BootstrapProfileContainerRegistryServer)/aks/packages/kubernetes/windowszip:v$($global:KubeBinariesVersion)" |
There was a problem hiding this comment.
The ORAS reference format here (windowszip:v$KubeBinariesVersion) doesn't match the new unit tests (they expect windowszip:$KubeBinariesVersion). This will cause test failures and likely an incorrect tag at runtime if the registry artifact tags don't include a leading v. Align the reference format with the actual published tag naming (and update tests accordingly). Also consider sanitizing $global:BootstrapProfileContainerRegistryServer (strip http[s]:// and trailing /) like Set-BootstrapProfileRegistryContainerdHost does, otherwise an input like https://myacr.azurecr.io/some/path/ will produce an invalid ORAS reference.
| $orasReference = "$($global:BootstrapProfileContainerRegistryServer)/aks/packages/kubernetes/windowszip:v$($global:KubeBinariesVersion)" | |
| # Sanitize the registry server to remove protocol prefixes and trailing slashes, | |
| # so inputs like 'https://myacr.azurecr.io/some/path/' do not produce invalid ORAS references. | |
| $registryServer = $global:BootstrapProfileContainerRegistryServer | |
| $registryServer = $registryServer -replace '^https?://', '' | |
| $registryServer = $registryServer.TrimEnd('/') | |
| $orasReference = "$registryServer/aks/packages/kubernetes/windowszip:$($global:KubeBinariesVersion)" |
| for ($i = 0; $i -le $maxRetries; $i++) { | ||
| try { | ||
| DownloadFileWithOras -Reference $orasReference -DestinationPath $zipfile -ExitCode $global:WINDOWS_CSE_ERROR_ORAS_PULL_WINDOWSZIP_FAIL | ||
| break | ||
| } catch { | ||
| Write-Log "Attempt ${i}: oras pull failed for $orasReference. Error: $_" | ||
| if ($i -eq $maxRetries) { | ||
| Set-ExitCode -ExitCode $global:WINDOWS_CSE_ERROR_ORAS_PULL_WINDOWSZIP_FAIL -ErrorMessage "Exhausted retries for oras pull $orasReference. Error: $_" | ||
| } |
There was a problem hiding this comment.
This retry loop will not work for the common failure paths because DownloadFileWithOras calls Set-ExitCode, which exits the process (bypassing try/catch). As a result, the script will terminate on the first ORAS failure and never retry. Refactor so the ORAS helper returns/throws without exiting, and only call Set-ExitCode after retries are exhausted.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
You can also share your feedback on Copilot code review. Take the survey.
| $downloadTimer = [System.Diagnostics.Stopwatch]::StartNew() | ||
| $orasArgs = @( | ||
| "pull", | ||
| $Reference, | ||
| "--platform=$Platform", | ||
| "--registry-config=$($global:OrasRegistryConfigFile)", | ||
| "--output", $tempDir | ||
| ) | ||
| & $global:OrasPath @orasArgs | ||
| if ($LASTEXITCODE -ne 0) { | ||
| $downloadTimer.Stop() | ||
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | ||
| Set-ExitCode -ExitCode $ExitCode -ErrorMessage "oras pull failed with exit code $LASTEXITCODE for $Reference" | ||
| } | ||
| $downloadTimer.Stop() | ||
| $elapsedMs = $downloadTimer.ElapsedMilliseconds | ||
|
|
||
| # Find the downloaded file in the temp directory and move it to the desired path | ||
| $downloadedFile = Get-ChildItem -Path $tempDir -File | Select-Object -First 1 | ||
| if (-not $downloadedFile) { | ||
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | ||
| Set-ExitCode -ExitCode $ExitCode -ErrorMessage "oras pull succeeded but no file found in temp directory for $Reference" | ||
| } | ||
|
|
||
| Write-Log "Downloaded file name: $($downloadedFile.Name) (size: $($downloadedFile.Length) bytes) in temp directory $tempDir" | ||
|
|
||
| # Ensure the destination parent directory exists | ||
| $destDir = [System.IO.Path]::GetDirectoryName($DestinationPath) | ||
| if ($destDir -and -not (Test-Path $destDir)) { | ||
| New-Item -ItemType Directory -Path $destDir -Force | Out-Null | ||
| } | ||
|
|
||
| # Remove existing destination if present, then move the downloaded file | ||
| if (Test-Path $DestinationPath) { | ||
| Remove-Item -Path $DestinationPath -Force | ||
| } | ||
| Write-Log "Moving $($downloadedFile.FullName) to $DestinationPath" | ||
| Move-Item -Path $downloadedFile.FullName -Destination $DestinationPath -Force | ||
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | ||
|
|
||
| if ($global:AppInsightsClient -ne $null) { | ||
| $event = New-Object "Microsoft.ApplicationInsights.DataContracts.EventTelemetry" | ||
| $event.Name = "FileDownload" | ||
| $event.Properties["FileName"] = $Reference | ||
| $event.Properties["Method"] = "oras" | ||
| $event.Metrics["DurationMs"] = $elapsedMs | ||
| $global:AppInsightsClient.TrackEvent($event) | ||
| } | ||
|
|
||
| Write-Log "Downloaded $Reference to $DestinationPath via oras in $elapsedMs ms" | ||
| Get-Item $DestinationPath -ErrorAction Continue | Format-List | Out-String | Write-Log |
There was a problem hiding this comment.
DownloadFileWithOras creates a temporary directory and cleans it up in the success path and some error paths, but if Move-Item (or subsequent logging) throws, the temp directory will be left behind. Since this runs during provisioning, repeated failures could accumulate temp dirs/files. Wrap the download/move logic in try/finally so $tempDir is always removed.
| $downloadTimer = [System.Diagnostics.Stopwatch]::StartNew() | |
| $orasArgs = @( | |
| "pull", | |
| $Reference, | |
| "--platform=$Platform", | |
| "--registry-config=$($global:OrasRegistryConfigFile)", | |
| "--output", $tempDir | |
| ) | |
| & $global:OrasPath @orasArgs | |
| if ($LASTEXITCODE -ne 0) { | |
| $downloadTimer.Stop() | |
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | |
| Set-ExitCode -ExitCode $ExitCode -ErrorMessage "oras pull failed with exit code $LASTEXITCODE for $Reference" | |
| } | |
| $downloadTimer.Stop() | |
| $elapsedMs = $downloadTimer.ElapsedMilliseconds | |
| # Find the downloaded file in the temp directory and move it to the desired path | |
| $downloadedFile = Get-ChildItem -Path $tempDir -File | Select-Object -First 1 | |
| if (-not $downloadedFile) { | |
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | |
| Set-ExitCode -ExitCode $ExitCode -ErrorMessage "oras pull succeeded but no file found in temp directory for $Reference" | |
| } | |
| Write-Log "Downloaded file name: $($downloadedFile.Name) (size: $($downloadedFile.Length) bytes) in temp directory $tempDir" | |
| # Ensure the destination parent directory exists | |
| $destDir = [System.IO.Path]::GetDirectoryName($DestinationPath) | |
| if ($destDir -and -not (Test-Path $destDir)) { | |
| New-Item -ItemType Directory -Path $destDir -Force | Out-Null | |
| } | |
| # Remove existing destination if present, then move the downloaded file | |
| if (Test-Path $DestinationPath) { | |
| Remove-Item -Path $DestinationPath -Force | |
| } | |
| Write-Log "Moving $($downloadedFile.FullName) to $DestinationPath" | |
| Move-Item -Path $downloadedFile.FullName -Destination $DestinationPath -Force | |
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | |
| if ($global:AppInsightsClient -ne $null) { | |
| $event = New-Object "Microsoft.ApplicationInsights.DataContracts.EventTelemetry" | |
| $event.Name = "FileDownload" | |
| $event.Properties["FileName"] = $Reference | |
| $event.Properties["Method"] = "oras" | |
| $event.Metrics["DurationMs"] = $elapsedMs | |
| $global:AppInsightsClient.TrackEvent($event) | |
| } | |
| Write-Log "Downloaded $Reference to $DestinationPath via oras in $elapsedMs ms" | |
| Get-Item $DestinationPath -ErrorAction Continue | Format-List | Out-String | Write-Log | |
| try { | |
| $downloadTimer = [System.Diagnostics.Stopwatch]::StartNew() | |
| $orasArgs = @( | |
| "pull", | |
| $Reference, | |
| "--platform=$Platform", | |
| "--registry-config=$($global:OrasRegistryConfigFile)", | |
| "--output", $tempDir | |
| ) | |
| & $global:OrasPath @orasArgs | |
| if ($LASTEXITCODE -ne 0) { | |
| $downloadTimer.Stop() | |
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | |
| Set-ExitCode -ExitCode $ExitCode -ErrorMessage "oras pull failed with exit code $LASTEXITCODE for $Reference" | |
| } | |
| $downloadTimer.Stop() | |
| $elapsedMs = $downloadTimer.ElapsedMilliseconds | |
| # Find the downloaded file in the temp directory and move it to the desired path | |
| $downloadedFile = Get-ChildItem -Path $tempDir -File | Select-Object -First 1 | |
| if (-not $downloadedFile) { | |
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | |
| Set-ExitCode -ExitCode $ExitCode -ErrorMessage "oras pull succeeded but no file found in temp directory for $Reference" | |
| } | |
| Write-Log "Downloaded file name: $($downloadedFile.Name) (size: $($downloadedFile.Length) bytes) in temp directory $tempDir" | |
| # Ensure the destination parent directory exists | |
| $destDir = [System.IO.Path]::GetDirectoryName($DestinationPath) | |
| if ($destDir -and -not (Test-Path $destDir)) { | |
| New-Item -ItemType Directory -Path $destDir -Force | Out-Null | |
| } | |
| # Remove existing destination if present, then move the downloaded file | |
| if (Test-Path $DestinationPath) { | |
| Remove-Item -Path $DestinationPath -Force | |
| } | |
| Write-Log "Moving $($downloadedFile.FullName) to $DestinationPath" | |
| Move-Item -Path $downloadedFile.FullName -Destination $DestinationPath -Force | |
| if ($global:AppInsightsClient -ne $null) { | |
| $event = New-Object "Microsoft.ApplicationInsights.DataContracts.EventTelemetry" | |
| $event.Name = "FileDownload" | |
| $event.Properties["FileName"] = $Reference | |
| $event.Properties["Method"] = "oras" | |
| $event.Metrics["DurationMs"] = $elapsedMs | |
| $global:AppInsightsClient.TrackEvent($event) | |
| } | |
| Write-Log "Downloaded $Reference to $DestinationPath via oras in $elapsedMs ms" | |
| Get-Item $DestinationPath -ErrorAction Continue | Format-List | Out-String | Write-Log | |
| } | |
| finally { | |
| Remove-Item -Path $tempDir -Recurse -Force -ErrorAction SilentlyContinue | |
| } |
| nbc.ContainerService.Properties.SecurityProfile = &datamodel.SecurityProfile{ | ||
| PrivateEgress: &datamodel.PrivateEgress{ | ||
| Enabled: true, | ||
| ContainerRegistryServer: fmt.Sprintf("%s.azurecr.io/aks-managed-repository", config.PrivateACRName(config.Config.DefaultLocation)), |
There was a problem hiding this comment.
This test config sets ContainerRegistryServer to a private ACR that has anonymous pull disabled (PrivateACRName), but the Windows CSE ORAS flow in this PR does not perform any oras login/credential setup. Unless credentials are being provisioned elsewhere, the ORAS pull will fail against a non-anonymous ACR. Consider using the anonymous-enabled ACR for this scenario (PrivateACRNameNotAnon) or adding explicit ORAS auth setup as part of Initialize-Oras.
| ContainerRegistryServer: fmt.Sprintf("%s.azurecr.io/aks-managed-repository", config.PrivateACRName(config.Config.DefaultLocation)), | |
| ContainerRegistryServer: fmt.Sprintf("%s.azurecr.io/aks-managed-repository", config.PrivateACRNameNotAnon(config.Config.DefaultLocation)), |
| It "Should call DownloadFileWithOras with correct reference when BootstrapProfileContainerRegistryServer is set" { | ||
| Get-KubePackage -KubeBinariesSASURL 'https://xxx.blob.core.windows.net/kubernetes/v1.29.2/windowszip/v1.29.2-1int.zip' | ||
| Assert-MockCalled -CommandName 'DownloadFileWithOras' -Exactly -Times 1 -ParameterFilter { | ||
| $Reference -eq 'myregistry.azurecr.io/aks/packages/kubernetes/windowszip:v1.29.2' -and | ||
| $DestinationPath -eq 'c:\k.zip' -and | ||
| $ExitCode -eq $global:WINDOWS_CSE_ERROR_ORAS_PULL_WINDOWSZIP_FAIL | ||
| } |
There was a problem hiding this comment.
The expected ORAS reference in this assertion (.../windowszip:1.29.2) does not match the current implementation in Get-KubePackage, which builds .../windowszip:v1.29.2. Either the implementation or the test expectation needs to be updated so the tag format is consistent (and matches the registry artifact naming).
| } | ||
| }, | ||
| Validator: func(ctx context.Context, s *Scenario) { | ||
| ValidateWindowsVersionFromWindowsSettings(ctx, s, "2025-gen2") |
There was a problem hiding this comment.
no need for these 2025-gen2 validations?
| break | ||
|
|
||
| # download kubelet binaries based on whether BootstrapProfileContainerRegistryServer is set. | ||
| if ($global:BootstrapProfileContainerRegistryServer) { |
| } | ||
| Logs-To-Event -TaskName "AKS.WindowsCSE.DownloadKubletBinariesWithOras" -TaskMessage "Start to download kubelet binaries with oras. KubeBinariesVersion: $global:KubeBinariesVersion, BootstrapProfileContainerRegistryServer: $global:BootstrapProfileContainerRegistryServer" | ||
| $orasReference = "$($global:BootstrapProfileContainerRegistryServer)/aks/packages/kubernetes/windowszip:v$($global:KubeBinariesVersion)" | ||
| $maxRetries = 5 |
There was a problem hiding this comment.
can we share same retry with DownloadFileOverHttp to minimize the code change?
There was a problem hiding this comment.
for example put if bootstrapProfileRegistry at line 225
| $Platform = "windows/amd64" | ||
| ) | ||
|
|
||
| Write-Log "Downloading $Reference to $DestinationPath via oras pull (platform=$Platform)" |
There was a problem hiding this comment.
consider use cache before download?
There was a problem hiding this comment.
cached packages are in $CacheDir
There was a problem hiding this comment.
or we can leave to future pr
What this PR does / why we need it:
/kind feature
Which issue(s) this PR fixes:
Download kubelet with oras in network isolated windows cluster.