From 099c1baa1eecbea6b7c7057cbc3212f0d095c1d9 Mon Sep 17 00:00:00 2001
From: Eric Hibbs
Date: Mon, 8 Jun 2026 13:45:34 -0700
Subject: [PATCH 1/2] Add draft EKS CloudFormation templates

Greenfield wrapper (eks-cluster.yaml): VPC + EKS + node group + OIDC.
Base (firewall-eks.yaml): ElastiCache + Socket token in Secrets Manager +
IRSA role, emitting the helm install command. Plus an example DNS-override
Helm values file. Statically validated (cfn-lint, helm template,
kubeconform); not deploy-tested.
---
 cloudformation/README.md                      |  41 +++-
 cloudformation/eks-cluster.yaml               | 208 ++++++++++++++++++
 cloudformation/firewall-eks.yaml              | 173 +++++++++++++++
 .../values/dns-override.values.yaml           |  84 +++++++
 4 files changed, 502 insertions(+), 4 deletions(-)
 create mode 100644 cloudformation/eks-cluster.yaml
 create mode 100644 cloudformation/firewall-eks.yaml
 create mode 100644 cloudformation/values/dns-override.values.yaml

diff --git a/cloudformation/README.md b/cloudformation/README.md
index 8743f96..61cd3cc 100644
--- a/cloudformation/README.md
+++ b/cloudformation/README.md
@@ -1,6 +1,39 @@
-# CloudFormation templates
+# CloudFormation — Socket Registry Firewall on AWS (EKS)
 
-AWS CloudFormation deployment templates for the Socket Registry Firewall.
+> **Status: DRAFT, untested.** These templates are a work in progress and have not been deployed yet. Validate in a non-production account before relying on them.
 
-Status: in progress. An EKS template for the Vercel internal deployment is being scoped
-(firewall + Redis, DNS-override routing). Pending requirements confirmation before authoring.
+## Background
+
+Reference CloudFormation for deploying the Socket Registry Firewall on **AWS EKS** (firewall + Redis, DNS-override routing, self-signed certs, fail-open). This is the EKS companion to the existing per-platform deployment templates (ECS Fargate, GCP Cloud Run, Azure Container Apps).
+
+On EKS the firewall is the **Helm chart** (`../helm`), which already supports DNS-override, self-signed CA generation, and Redis/TLS — so most of the work is standing up the AWS infrastructure and installing the chart, not re-implementing the workload.
+
+## Layout
+
+| File | Purpose |
+|------|---------|
+| `eks-cluster.yaml` | **Greenfield wrapper** — VPC + EKS cluster + node group + OIDC provider. Skip if you already run a cluster. |
+| `firewall-eks.yaml` | **Shared base** — ElastiCache Redis + Socket token (Secrets Manager) + IRSA role; emits the `helm upgrade --install` command. |
+| `values/dns-override.values.yaml` | Example Helm values (DNS-override + Redis + self-signed certs). |
+
+## Cases
+
+- **Greenfield (no cluster):** deploy `eks-cluster.yaml`, then `firewall-eks.yaml` (wire its outputs in), then run the emitted Helm command.
+- **Existing cluster:** deploy `firewall-eks.yaml` only (pass your existing `ClusterName` / `VpcId` / subnets / cluster security group / OIDC), then Helm.
+
+The base is identical in both cases — the wrapper only adds the cluster/VPC layer.
+
+## Helm install: handoff vs one-shot
+
+The base template currently takes the **handoff** approach: CloudFormation provisions the infrastructure and outputs the exact `helm upgrade --install` command to run. The alternative is **one-shot**, where the stack runs Helm itself via a CodeBuild-backed custom resource (CodeBuild installs helm/kubectl, runs `aws eks update-kubeconfig`, then `helm upgrade --install`; a small Lambda signals CloudFormation). One-shot is a single-deploy experience but adds a moving part and requires the CodeBuild role be added to the cluster's EKS access entries.
+
+## Config model
+
+On EKS the Helm chart renders the firewall config into a **ConfigMap**. The stack only injects two install-time values: the ElastiCache endpoint (`redis.host`) and the Socket token (`socket.apiToken`, read from Secrets Manager).
+
+## Known DRAFT caveats
+
+- **Untested** — no template here has been deployed yet.
+- `eks-cluster.yaml`'s OIDC `ThumbprintList` is a placeholder — confirm it for your issuer, or associate the provider with `eksctl utils associate-iam-oidc-provider`.
+- ElastiCache transit encryption assumes the firewall image trusts Amazon's CA for `redis.ssl`; if not, mount the CA or set `redis.sslVerify: false`.
+- The example values file enables npm + pypi only — adjust to your ecosystems.
diff --git a/cloudformation/eks-cluster.yaml b/cloudformation/eks-cluster.yaml
new file mode 100644
index 0000000..a8d915f
--- /dev/null
+++ b/cloudformation/eks-cluster.yaml
@@ -0,0 +1,208 @@
+AWSTemplateFormatVersion: "2010-09-09"
+Description: >
+  DRAFT — Greenfield EKS wrapper for the Socket Registry Firewall. Stands up a
+  VPC (2 public + 2 private subnets, NAT), an EKS cluster, a managed node group,
+  and the IAM OIDC provider for IRSA. Outputs feed firewall-eks.yaml.
+  Skip this template entirely if you already run an EKS cluster.
+  UNTESTED DRAFT — validate in a non-production account (see README.md).
+
+Parameters:
+  ClusterName:
+    Type: String
+    Default: socket-firewall
+  KubernetesVersion:
+    Type: String
+    Default: "1.30"
+  NodeInstanceType:
+    Type: String
+    Default: t3.large
+  DesiredNodes:
+    Type: Number
+    Default: 2
+  MaxNodes:
+    Type: Number
+    Default: 6
+  VpcCidr:
+    Type: String
+    Default: 10.0.0.0/16
+
+Mappings:
+  Subnets:
+    Public0: { Cidr: 10.0.0.0/20 }
+    Public1: { Cidr: 10.0.16.0/20 }
+    Private0: { Cidr: 10.0.32.0/20 }
+    Private1: { Cidr: 10.0.48.0/20 }
+
+Resources:
+  # --- VPC ------------------------------------------------------------------
+  Vpc:
+    Type: AWS::EC2::VPC
+    Properties:
+      CidrBlock: !Ref VpcCidr
+      EnableDnsHostnames: true
+      EnableDnsSupport: true
+      Tags: [{ Key: Name, Value: !Sub "${ClusterName}-vpc" }]
+
+  InternetGateway:
+    Type: AWS::EC2::InternetGateway
+  VpcGatewayAttachment:
+    Type: AWS::EC2::VPCGatewayAttachment
+    Properties: { VpcId: !Ref Vpc, InternetGatewayId: !Ref InternetGateway }
+
+  PublicSubnet0:
+    Type: AWS::EC2::Subnet
+    Properties:
+      VpcId: !Ref Vpc
+      CidrBlock: !FindInMap [Subnets, Public0, Cidr]
+      AvailabilityZone: !Select [0, !GetAZs ""]
+      MapPublicIpOnLaunch: true
+      Tags:
+        - { Key: Name, Value: !Sub "${ClusterName}-public-0" }
+        - { Key: kubernetes.io/role/elb, Value: "1" }
+  PublicSubnet1:
+    Type: AWS::EC2::Subnet
+    Properties:
+      VpcId: !Ref Vpc
+      CidrBlock: !FindInMap [Subnets, Public1, Cidr]
+      AvailabilityZone: !Select [1, !GetAZs ""]
+      MapPublicIpOnLaunch: true
+      Tags:
+        - { Key: Name, Value: !Sub "${ClusterName}-public-1" }
+        - { Key: kubernetes.io/role/elb, Value: "1" }
+  PrivateSubnet0:
+    Type: AWS::EC2::Subnet
+    Properties:
+      VpcId: !Ref Vpc
+      CidrBlock: !FindInMap [Subnets, Private0, Cidr]
+      AvailabilityZone: !Select [0, !GetAZs ""]
+      Tags:
+        - { Key: Name, Value: !Sub "${ClusterName}-private-0" }
+        - { Key: kubernetes.io/role/internal-elb, Value: "1" }
+  PrivateSubnet1:
+    Type: AWS::EC2::Subnet
+    Properties:
+      VpcId: !Ref Vpc
+      CidrBlock: !FindInMap [Subnets, Private1, Cidr]
+      AvailabilityZone: !Select [1, !GetAZs ""]
+      Tags:
+        - { Key: Name, Value: !Sub "${ClusterName}-private-1" }
+        - { Key: kubernetes.io/role/internal-elb, Value: "1" }
+
+  NatEip:
+    Type: AWS::EC2::EIP
+    Properties: { Domain: vpc }
+  NatGateway:
+    Type: AWS::EC2::NatGateway
+    Properties:
+      AllocationId: !GetAtt NatEip.AllocationId
+      SubnetId: !Ref PublicSubnet0
+
+  PublicRouteTable:
+    Type: AWS::EC2::RouteTable
+    Properties: { VpcId: !Ref Vpc }
+  PublicRoute:
+    Type: AWS::EC2::Route
+    DependsOn: VpcGatewayAttachment
+    Properties:
+      RouteTableId: !Ref PublicRouteTable
+      DestinationCidrBlock: 0.0.0.0/0
+      GatewayId: !Ref InternetGateway
+  PublicAssoc0:
+    Type: AWS::EC2::SubnetRouteTableAssociation
+    Properties: { RouteTableId: !Ref PublicRouteTable, SubnetId: !Ref PublicSubnet0 }
+  PublicAssoc1:
+    Type: AWS::EC2::SubnetRouteTableAssociation
+    Properties: { RouteTableId: !Ref PublicRouteTable, SubnetId: !Ref PublicSubnet1 }
+
+  PrivateRouteTable:
+    Type: AWS::EC2::RouteTable
+    Properties: { VpcId: !Ref Vpc }
+  PrivateRoute:
+    Type: AWS::EC2::Route
+    Properties:
+      RouteTableId: !Ref PrivateRouteTable
+      DestinationCidrBlock: 0.0.0.0/0
+      NatGatewayId: !Ref NatGateway
+  PrivateAssoc0:
+    Type: AWS::EC2::SubnetRouteTableAssociation
+    Properties: { RouteTableId: !Ref PrivateRouteTable, SubnetId: !Ref PrivateSubnet0 }
+  PrivateAssoc1:
+    Type: AWS::EC2::SubnetRouteTableAssociation
+    Properties: { RouteTableId: !Ref PrivateRouteTable, SubnetId: !Ref PrivateSubnet1 }
+
+  # --- EKS cluster ----------------------------------------------------------
+  ClusterRole:
+    Type: AWS::IAM::Role
+    Properties:
+      AssumeRolePolicyDocument:
+        Version: "2012-10-17"
+        Statement:
+          - Effect: Allow
+            Principal: { Service: eks.amazonaws.com }
+            Action: sts:AssumeRole
+      ManagedPolicyArns:
+        - arn:aws:iam::aws:policy/AmazonEKSClusterPolicy
+
+  Cluster:
+    Type: AWS::EKS::Cluster
+    Properties:
+      Name: !Ref ClusterName
+      Version: !Ref KubernetesVersion
+      RoleArn: !GetAtt ClusterRole.Arn
+      ResourcesVpcConfig:
+        SubnetIds: [!Ref PrivateSubnet0, !Ref PrivateSubnet1, !Ref PublicSubnet0, !Ref PublicSubnet1]
+        EndpointPublicAccess: true
+        EndpointPrivateAccess: true
+
+  # IAM OIDC provider for IRSA. NOTE: verify the thumbprint for the region/issuer
+  # (or associate via `eksctl utils associate-iam-oidc-provider`). The value below
+  # is the commonly-used Amazon root CA thumbprint and should be confirmed.
+  OidcProvider:
+    Type: AWS::IAM::OIDCProvider
+    Properties:
+      Url: !GetAtt Cluster.OpenIdConnectIssuerUrl
+      ClientIdList: ["sts.amazonaws.com"]
+      ThumbprintList: ["9e99a48a9960b14926bb7f3b02e22da2b0ab7280"] # CONFIRM
+
+  # --- Managed node group ---------------------------------------------------
+  NodeRole:
+    Type: AWS::IAM::Role
+    Properties:
+      AssumeRolePolicyDocument:
+        Version: "2012-10-17"
+        Statement:
+          - Effect: Allow
+            Principal: { Service: ec2.amazonaws.com }
+            Action: sts:AssumeRole
+      ManagedPolicyArns:
+        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
+        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
+        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
+
+  NodeGroup:
+    Type: AWS::EKS::Nodegroup
+    Properties:
+      ClusterName: !Ref Cluster
+      NodeRole: !GetAtt NodeRole.Arn
+      Subnets: [!Ref PrivateSubnet0, !Ref PrivateSubnet1]
+      InstanceTypes: [!Ref NodeInstanceType]
+      ScalingConfig:
+        MinSize: !Ref DesiredNodes
+        DesiredSize: !Ref DesiredNodes
+        MaxSize: !Ref MaxNodes
+
+Outputs:
+  ClusterName:
+    Value: !Ref Cluster
+  VpcId:
+    Value: !Ref Vpc
+  PrivateSubnetIds:
+    Value: !Join [",", [!Ref PrivateSubnet0, !Ref PrivateSubnet1]]
+  ClusterSecurityGroupId:
+    Description: Cluster security group (use as ClusterSecurityGroupId in firewall-eks.yaml).
+    Value: !GetAtt Cluster.ClusterSecurityGroupId
+  OIDCProviderArn:
+    Value: !Ref OidcProvider
+  OIDCProviderUrl:
+    Description: Issuer URL without the https:// prefix (strip it before passing to firewall-eks.yaml).
+    Value: !GetAtt Cluster.OpenIdConnectIssuerUrl
diff --git a/cloudformation/firewall-eks.yaml b/cloudformation/firewall-eks.yaml
new file mode 100644
index 0000000..03cc930
--- /dev/null
+++ b/cloudformation/firewall-eks.yaml
@@ -0,0 +1,173 @@
+AWSTemplateFormatVersion: "2010-09-09"
+Description: >
+  DRAFT — Socket Registry Firewall on EKS: shared base. Provisions the AWS
+  infrastructure the firewall needs (ElastiCache Redis, Socket API token in
+  Secrets Manager, IRSA role) and emits the exact `helm upgrade --install`
+  command to deploy the chart. Assumes an existing EKS cluster; pair with
+  eks-cluster.yaml for a greenfield cluster.
+  UNTESTED DRAFT — validate in a non-production account (see cloudformation/README.md).
+
+Parameters:
+  ClusterName:
+    Type: String
+    Description: Name of the existing EKS cluster to deploy into.
+
+  VpcId:
+    Type: AWS::EC2::VPC::Id
+    Description: VPC of the EKS cluster.
+
+  PrivateSubnetIds:
+    Type: List
+    Description: Two or more private subnets for ElastiCache (same VPC as the cluster).
+
+  ClusterSecurityGroupId:
+    Type: AWS::EC2::SecurityGroup::Id
+    Description: The EKS cluster security group that node/pod traffic originates from (allowed into Redis).
+
+  OIDCProviderArn:
+    Type: String
+    Description: IAM OIDC provider ARN of the cluster (for IRSA). From eks-cluster.yaml or `aws eks describe-cluster`.
+
+  OIDCProviderUrl:
+    Type: String
+    Description: IAM OIDC provider URL WITHOUT https:// (e.g. oidc.eks.us-east-1.amazonaws.com/id/ABCDEF...).
+
+  Namespace:
+    Type: String
+    Default: socket-firewall
+    Description: Kubernetes namespace the firewall is installed into.
+
+  ServiceAccountName:
+    Type: String
+    Default: socket-firewall
+    Description: Service account name the chart uses (for the IRSA trust policy).
+
+  SocketApiToken:
+    Type: String
+    NoEcho: true
+    Description: Socket.dev API token (scopes packages, entitlements:list).
+
+  RedisNodeType:
+    Type: String
+    Default: cache.t4g.small
+    Description: ElastiCache node type.
+
+  ChartVersion:
+    Type: String
+    Default: "0.2.4"
+    Description: Socket Firewall Helm chart version to install.
+
+Resources:
+
+  # --- Socket API token -----------------------------------------------------
+  SocketApiTokenSecret:
+    Type: AWS::SecretsManager::Secret
+    Properties:
+      Name: !Sub "${AWS::StackName}/socket-api-token"
+      SecretString: !Ref SocketApiToken
+
+  # --- Redis (ElastiCache) --------------------------------------------------
+  RedisSecurityGroup:
+    Type: AWS::EC2::SecurityGroup
+    Properties:
+      GroupDescription: Socket Firewall Redis
+      VpcId: !Ref VpcId
+      SecurityGroupIngress:
+        - IpProtocol: tcp
+          FromPort: 6379
+          ToPort: 6379
+          SourceSecurityGroupId: !Ref ClusterSecurityGroupId
+          Description: Redis from EKS pods (cluster security group)
+      Tags:
+        - Key: Name
+          Value: !Sub "${AWS::StackName}-redis"
+
+  RedisSubnetGroup:
+    Type: AWS::ElastiCache::SubnetGroup
+    Properties:
+      Description: !Sub "${AWS::StackName} Redis subnets"
+      SubnetIds: !Ref PrivateSubnetIds
+
+  RedisReplicationGroup:
+    Type: AWS::ElastiCache::ReplicationGroup
+    Properties:
+      ReplicationGroupDescription: Socket Firewall cache
+      Engine: redis
+      CacheNodeType: !Ref RedisNodeType
+      NumCacheClusters: 2
+      AutomaticFailoverEnabled: true
+      MultiAZEnabled: true
+      CacheSubnetGroupName: !Ref RedisSubnetGroup
+      SecurityGroupIds:
+        - !Ref RedisSecurityGroup
+      AtRestEncryptionEnabled: true
+      TransitEncryptionEnabled: true # firewall connects with redis.ssl=true
+      Port: 6379
+      Tags:
+        - Key: Name
+          Value: !Sub "${AWS::StackName}-redis"
+
+  # --- IRSA role for the firewall service account ---------------------------
+  # Lets the firewall pod read its Socket token from Secrets Manager (so the
+  # token is not baked into a K8s manifest). The chart can also take the token
+  # directly via --set; this role is here for the Secrets-Manager-backed path.
+  FirewallIrsaRole:
+    Type: AWS::IAM::Role
+    Properties:
+      RoleName: !Sub "${AWS::StackName}-firewall-irsa"
+      # Expressed as a !Sub JSON string: the condition keys are dynamic
+      # (`:sub`/`:aud`) and CloudFormation can't template a YAML map key.
+      AssumeRolePolicyDocument: !Sub |
+        {
+          "Version": "2012-10-17",
+          "Statement": [
+            {
+              "Effect": "Allow",
+              "Principal": { "Federated": "${OIDCProviderArn}" },
+              "Action": "sts:AssumeRoleWithWebIdentity",
+              "Condition": {
+                "StringEquals": {
+                  "${OIDCProviderUrl}:sub": "system:serviceaccount:${Namespace}:${ServiceAccountName}",
+                  "${OIDCProviderUrl}:aud": "sts.amazonaws.com"
+                }
+              }
+            }
+          ]
+        }
+      Policies:
+        - PolicyName: read-socket-token
+          PolicyDocument:
+            Version: "2012-10-17"
+            Statement:
+              - Effect: Allow
+                Action: ["secretsmanager:GetSecretValue"]
+                Resource: !Ref SocketApiTokenSecret
+
+Outputs:
+  RedisPrimaryEndpoint:
+    Description: ElastiCache primary endpoint (feed into helm redis.host).
+    Value: !GetAtt RedisReplicationGroup.PrimaryEndPoint.Address
+
+  SocketApiTokenSecretArn:
+    Description: Secrets Manager ARN holding the Socket API token.
+    Value: !Ref SocketApiTokenSecret
+
+  FirewallIrsaRoleArn:
+    Description: IRSA role ARN to annotate on the firewall service account.
+    Value: !GetAtt FirewallIrsaRole.Arn
+
+  HelmInstallCommand:
+    Description: >
+      Run this after the stack completes to deploy the firewall. (If we instead
+      want the stack itself to run Helm, see the one-shot CodeBuild custom-resource
+      option in README.md.)
+    Value: !Sub |
+      aws eks update-kubeconfig --name ${ClusterName} --region ${AWS::Region} &&
+      helm repo add socket-firewall https://socketdev-demo.github.io/socket-firewall-helm &&
+      helm repo update &&
+      helm upgrade --install socket-firewall socket-firewall/socket-firewall \
+        --version ${ChartVersion} \
+        --namespace ${Namespace} --create-namespace \
+        -f cloudformation/values/dns-override.values.yaml \
+        --set redis.host=${RedisReplicationGroup.PrimaryEndPoint.Address} \
+        --set socket.apiToken="$(aws secretsmanager get-secret-value --secret-id ${SocketApiTokenSecret} --query SecretString --output text)"
diff --git a/cloudformation/values/dns-override.values.yaml b/cloudformation/values/dns-override.values.yaml
new file mode 100644
index 0000000..e6f8930
--- /dev/null
+++ b/cloudformation/values/dns-override.values.yaml
@@ -0,0 +1,84 @@
+# Socket Firewall — Helm values: DNS-override deployment (example)
+#
+# DNS-override routing: point internal DNS for the registry hostnames at the
+# firewall, and the firewall presents self-signed certs for those hostnames.
+#   - DNS-override routing
+#   - Self-signed certs at the endpoints (distribute the CA to clients)
+#   - Redis caching enabled (backed by the ElastiCache stack)
+#   - Fail-open if Socket's decision service is unavailable
+#
+# Adjust the ecosystem list and Redis TLS settings to your environment.
+# Chart: ../helm
+
+replicaCount: 2
+
+image:
+  repository: socketdev/socket-registry-firewall
+  tag: "1.1.159" # pin; do not use :latest for a security product
+
+socket:
+  # Injected at install time from AWS Secrets Manager by the CloudFormation
+  # stack (helm --set socket.apiToken=...). Do not commit the token.
+  apiToken: ""
+  failOpen: true # fail open if the Socket API is unavailable
+  failOpenUnscanned: true
+  cacheTtl: 600
+
+# DNS-override mode: the firewall presents certs for the real registry hostnames,
+# and internal DNS for those hostnames points at the firewall's load balancer.
+dnsRouting:
+  enabled: true
+  registries:
+    - npm
+    - pypi
+    # add as needed: maven, cargo, rubygems, openvsx, nuget, go, conda
+
+# Self-signed CA generated by the chart's init container; SANs cover the
+# dnsRouting hostnames automatically. Distribute the CA to client endpoints.
+tls:
+  generateSelfSigned: true
+
+# Redis caching backed by the ElastiCache replication group provisioned by
+# the CloudFormation stack. Host is injected at install time (--set redis.host=...).
+redis:
+  enabled: true
+  host: "" # set to the ElastiCache primary endpoint by the stack
+  port: 6379
+  ssl: true # ElastiCache transit encryption
+  sslVerify: true # if the image doesn't trust Amazon's CA, mount it or set false
+  ttl: 86400
+
+# Expose via an internal load balancer so internal DNS can target it.
+service:
+  type: LoadBalancer
+  annotations:
+    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
+    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
+
+autoscaling:
+  enabled: true
+  minReplicas: 2
+  maxReplicas: 10
+  targetCPUUtilizationPercentage: 70
+
+podDisruptionBudget:
+  enabled: true
+  minAvailable: 1
+
+resources:
+  limits:
+    cpu: "2"
+    memory: 1Gi
+  requests:
+    cpu: "1"
+    memory: 512Mi
+
+affinity:
+  podAntiAffinity:
+    preferredDuringSchedulingIgnoredDuringExecution:
+      - weight: 100
+        podAffinityTerm:
+          labelSelector:
+            matchLabels:
+              app.kubernetes.io/name: socket-firewall
+          topologyKey: kubernetes.io/hostname

From f7b9e9e29f65be698d625b0995e2a1e8da14a6b3 Mon Sep 17 00:00:00 2001
From: Eric Hibbs
Date: Tue, 9 Jun 2026 09:53:04 -0700
Subject: [PATCH 2/2] Associate the IAM OIDC provider via eksctl instead of a
 placeholder thumbprint

Hardcoding an OIDC thumbprint is brittle and would fail a real deploy.
Remove the in-template AWS::IAM::OIDCProvider and document associating it
with eksctl utils associate-iam-oidc-provider (which fetches the correct
thumbprint) after the cluster is created.
---
 cloudformation/eks-cluster.yaml | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/cloudformation/eks-cluster.yaml b/cloudformation/eks-cluster.yaml
index a8d915f..6e62b01 100644
--- a/cloudformation/eks-cluster.yaml
+++ b/cloudformation/eks-cluster.yaml
@@ -154,15 +154,11 @@ Resources:
         EndpointPublicAccess: true
         EndpointPrivateAccess: true
 
-  # IAM OIDC provider for IRSA. NOTE: verify the thumbprint for the region/issuer
-  # (or associate via `eksctl utils associate-iam-oidc-provider`). The value below
-  # is the commonly-used Amazon root CA thumbprint and should be confirmed.
-  OidcProvider:
-    Type: AWS::IAM::OIDCProvider
-    Properties:
-      Url: !GetAtt Cluster.OpenIdConnectIssuerUrl
-      ClientIdList: ["sts.amazonaws.com"]
-      ThumbprintList: ["9e99a48a9960b14926bb7f3b02e22da2b0ab7280"] # CONFIRM
+  # IAM OIDC provider for IRSA is intentionally NOT created here — hardcoding an
+  # OIDC thumbprint is brittle. After the cluster is up, associate the provider
+  # (which fetches the correct thumbprint automatically) with:
+  #   eksctl utils associate-iam-oidc-provider --cluster --approve
+  # Then pass that provider's ARN and the OIDCProviderUrl output into firewall-eks.yaml.
 
   # --- Managed node group ---------------------------------------------------
   NodeRole:
@@ -201,8 +197,9 @@ Outputs:
   ClusterSecurityGroupId:
     Description: Cluster security group (use as ClusterSecurityGroupId in firewall-eks.yaml).
     Value: !GetAtt Cluster.ClusterSecurityGroupId
-  OIDCProviderArn:
-    Value: !Ref OidcProvider
   OIDCProviderUrl:
-    Description: Issuer URL without the https:// prefix (strip it before passing to firewall-eks.yaml).
+    Description: >
+      Cluster OIDC issuer URL. Associate the IAM OIDC provider via
+      `eksctl utils associate-iam-oidc-provider`, then pass that provider's ARN
+      and this URL (without the https:// prefix) into firewall-eks.yaml.
     Value: !GetAtt Cluster.OpenIdConnectIssuerUrl
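
Note on the manual OIDC step introduced by PATCH 2/2: the wiring between the two stacks could be scripted roughly as below. This is a minimal sketch, not part of the patch. The helper names (`strip_https`, `oidc_provider_arn`, `print_oidc_params`) are hypothetical, the ARN layout assumes the standard `aws` partition, and the cluster name and region in the usage line are placeholders. It relies only on `eksctl utils associate-iam-oidc-provider`, `aws eks describe-cluster`, and `aws sts get-caller-identity`.

```shell
#!/bin/sh
set -eu

# Drop the https:// scheme; firewall-eks.yaml expects the bare issuer host/path.
strip_https() {
  printf '%s\n' "${1#https://}"
}

# Assemble the IAM OIDC provider ARN. Assumes the standard "aws" partition.
# $1 = AWS account ID, $2 = issuer URL without https://
oidc_provider_arn() {
  printf 'arn:aws:iam::%s:oidc-provider/%s\n' "$1" "$2"
}

# Hypothetical helper (not part of the patch): associate the provider, then
# print the two OIDC parameter values that firewall-eks.yaml asks for.
# Needs eksctl and the AWS CLI with credentials for the target account.
# The body runs in a subshell so its variables do not leak into the caller.
print_oidc_params() (
  cluster="$1"
  region="$2"
  eksctl utils associate-iam-oidc-provider \
    --cluster "$cluster" --region "$region" --approve
  issuer="$(aws eks describe-cluster --name "$cluster" --region "$region" \
    --query 'cluster.identity.oidc.issuer' --output text)"
  account="$(aws sts get-caller-identity --query Account --output text)"
  url="$(strip_https "$issuer")"
  echo "OIDCProviderUrl=$url"
  echo "OIDCProviderArn=$(oidc_provider_arn "$account" "$url")"
)

# Usage (commented out so the sketch has no side effects when sourced):
# print_oidc_params socket-firewall us-east-1
```

The two printed values map onto the `OIDCProviderUrl` and `OIDCProviderArn` parameters of firewall-eks.yaml. Keeping the string handling in small pure functions means that part can be checked without AWS access.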