What were you trying to accomplish?
When creating a new cluster with version 1.36 and a managed nodegroup, the nodes fail to register because of missing IAM permissions.
What happened?
The EKS node was NotReady because the aws-node (VPC CNI) pod was crash-looping due to missing IAM permissions on the IRSA role eksctl-my-cluster-addon-vpc-cni-Role1-q8lpDXP2stGw.
The role's inline policy was missing two EC2 actions:
- ec2:DescribeSubnets: needed because ENABLE_SUBNET_DISCOVERY=true (default in VPC CNI v1.22.1)
- ec2:DescribeSecurityGroups: needed for the CNI's discoverCustomSecurityGroups() call during initialization, which treated the 403 as a fatal error
Without the CNI running, the node reported NetworkPluginNotReady: cni plugin not initialized and stayed in NotReady state.
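To check a role policy for these two actions before creating nodes, something like the following could be used. It is a minimal sketch: the function name `missing_actions` and the sample policy are hypothetical (the sample is not the policy eksctl actually generated), and it only looks at `Effect: Allow` statements, ignoring `Resource`, `Condition`, `NotAction`, and explicit `Deny`.

```python
from fnmatch import fnmatchcase

# The two actions this report found missing from the IRSA role's inline policy.
REQUIRED_ACTIONS = ["ec2:DescribeSubnets", "ec2:DescribeSecurityGroups"]


def _as_list(value):
    """IAM allows a single string or a list for Action and Statement."""
    if value is None:
        return []
    return [value] if isinstance(value, (str, dict)) else list(value)


def missing_actions(policy, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow statement in `policy` covers.

    Action names are compared case-insensitively and IAM wildcards
    (`*`, `?`) in the policy are honoured.
    """
    allowed = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        allowed.extend(a.lower() for a in _as_list(stmt.get("Action")))
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), pattern) for pattern in allowed)
    ]


# Hypothetical incomplete policy, for illustration only.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:AssignIpv6Addresses", "ec2:DescribeInstances"],
            "Resource": "*",
        }
    ],
}

print(missing_actions(sample_policy))
```

The policy document itself would have to be fetched separately, for example from the IAM console or the AWS CLI, and loaded as a dict.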
Error messages and where they appeared:
- Pod events (kubectl describe pod aws-node-k4qf2):
  Warning MissingIAMPermissions Unauthorized operation: failed to call ec2:DescribeSubnets due to missing permissions.
- IPAMD log file on the node (/var/log/aws-routed-eni/ipamd.log):
  "Initialization failure: discoverCustomSecurityGroups: unable to describe security groups: operation error EC2: DescribeSecurityGroups, https response error StatusCode: 403, api error UnauthorizedOperation: You are not authorized to perform this operation... no identity-based policy allows the ec2:DescribeSecurityGroups action"
The first error (DescribeSubnets) appeared as a Kubernetes event visible in kubectl describe pod. The second, fatal error (DescribeSecurityGroups) appeared only in the IPAMD log file. The container stdout just showed "Checking for IPAM connectivity..." and appeared to hang, because the actual crash error wasn't printed to stdout/stderr.
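Because the two errors surface in different places, it can help to collect the pod events and the IPAMD log into one text and pull out the denied actions. The sketch below assumes the message wording quoted in this report; the function name `denied_actions` is hypothetical, and other CNI versions may phrase these messages differently.

```python
import re

# Patterns based only on the two messages quoted in this report.
_PATTERNS = [
    # IPAMD log: "... no identity-based policy allows the <action> action"
    re.compile(r"no identity-based policy allows the (\S+) action"),
    # Pod event: "... failed to call <action> due to missing permissions."
    re.compile(r"failed to call (\S+) due to missing permissions"),
]


def denied_actions(text):
    """Return the sorted, de-duplicated IAM actions reported as denied in `text`."""
    found = set()
    for pattern in _PATTERNS:
        found.update(pattern.findall(text))
    return sorted(found)


sample = (
    "Warning MissingIAMPermissions Unauthorized operation: "
    "failed to call ec2:DescribeSubnets due to missing permissions.\n"
    "Initialization failure: discoverCustomSecurityGroups: unable to describe "
    "security groups: operation error EC2: DescribeSecurityGroups, https response "
    "error StatusCode: 403, api error UnauthorizedOperation: You are not authorized "
    "to perform this operation... no identity-based policy allows the "
    "ec2:DescribeSecurityGroups action"
)

print(denied_actions(sample))
```

In practice the input would be the concatenated output of kubectl describe pod and the contents of /var/log/aws-routed-eni/ipamd.log read from the node.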
How to reproduce it?
Create the file below.
File: ipv6-cluster.yaml
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: sa-east-1
  version: "1.36"
  tags:
    GuardDutyManaged: "false"
kubernetesNetworkConfig:
  ipFamily: IPv6
vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: true
addons:
  - name: vpc-cni
    version: latest
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest
iam:
  withOIDC: true
managedNodeGroups:
  - name: ipv6-mng
    instanceType: t3.medium
    desiredCapacity: 1
    privateNetworking: true
    tags:
      nodegroup-name: my-nodegroup
      GuardDutyManaged: "false"
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]
    logRetentionInDays: 7
Run the following command; it fails:
eksctl create cluster -f ipv6-cluster.yaml --cfn-disable-rollback
Update the config file with the content below, which adds an attachPolicy to the vpc-cni addon:
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: sa-east-1
  version: "1.36"
  tags:
    GuardDutyManaged: "false"
kubernetesNetworkConfig:
  ipFamily: IPv6
vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: true
addons:
  - name: vpc-cni
    version: latest
    attachPolicy:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Action:
            - ec2:AssignIpv6Addresses
            - ec2:DescribeInstances
            - ec2:DescribeTags
            - ec2:DescribeNetworkInterfaces
            - ec2:DescribeInstanceTypes
            - ec2:DescribeSubnets
            - ec2:DescribeSecurityGroups
            - ec2:CreateNetworkInterface
            - ec2:AttachNetworkInterface
            - ec2:DeleteNetworkInterface
            - ec2:DetachNetworkInterface
            - ec2:ModifyNetworkInterfaceAttribute
            - ec2:UnassignPrivateIpAddresses
            - ec2:UnassignIpv6Addresses
            - ec2:AssignPrivateIpAddresses
          Resource: "*"
        - Effect: Allow
          Action:
            - ec2:CreateTags
          Resource: "arn:aws:ec2:*:*:network-interface/*"
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest
iam:
  withOIDC: true
managedNodeGroups:
  - name: ipv6-mng
    instanceType: t3.medium
    desiredCapacity: 1
    privateNetworking: true
    tags:
      nodegroup-name: my-nodegroup
      GuardDutyManaged: "false"
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]
    logRetentionInDays: 7
Run the command again; it now succeeds:
eksctl create cluster -f ipv6-cluster.yaml --cfn-disable-rollback
Logs
Anything else we need to know?
Versions
$ eksctl info
eksctl version: 0.227.0
kubectl version: v1.36.1
OS: linux